如果你在运行多个服务实例的应用中设置 pool(目前在大多数云/容器环境中很常见),则需要仔细考虑所有服务以及连接到 Postgres 服务器的所有服务实例的池的 max
参数。
¥If you're using a pool in an application with multiple instances of your service running (common in most cloud/container environments currently), you'll need to think a bit about the max
parameter of your pool across all services and all instances of all services which are connecting to your Postgres server.
根据你的云环境,这可能会变得非常复杂。pg-bouncer、RDS 连接代理等工具带来了进一步的细微差别,它们可以实现某种形式的连接池和连接复用。所以,这绝对值得考虑。我们来运行一些设置。虽然这些示例并非详尽无遗,但希望它们能促使你思考什么最适合你的设置。
¥This can get pretty complex depending on your cloud environment. Further nuance is introduced with things like pg-bouncer, RDS connection proxies, etc., which will do some forms of connection pooling and connection multiplexing. So, it's definitely worth thinking about. Let's run through a few setups. While certainly not exhaustive, these examples hopefully prompt you into thinking about what's right for your setup.
简单的应用、开发模式、固定实例数量等。
¥Simple apps, dev mode, fixed instance counts, etc.
如果你的应用没有运行在支持自动伸缩容器、lambda 表达式或云函数等的 Kubernetes 环境中,你可以对可用的 max
池配置进行一些 "餐巾纸上的数学" 调整。假设你的 Postgres 实例配置为一次最多 200 个连接。你知道你的服务将在 4 个实例上运行。你可以将 max
池大小设置为 50,但如果所有服务都因等待数据库连接而饱和,你将无法从任何管理工具连接到数据库,也无法在不更改配置/代码来调整最大大小的情况下扩展服务。
¥If your app isn't running in a k8s style env with containers scaling automatically or lambdas or cloud functions etc., you can do some "napkin math" for the max
pool config you can use. Let's assume your Postgres instance is configured to have a maximum of 200 connections at any one time. You know your service is going to run on 4 instances. You can set the max
pool size to 50, but if all your services are saturated waiting on database connections, you won't be able to connect to the database from any mgmt tools or scale up your services without changing config/code to adjust the max size.
在这种情况下,我可能会将 max
设置为 20 或 25。这为你提供了足够的空间来扩展更多实例,并且实际上,如果你的应用缺乏数据库连接,你可能需要检查查询并使其执行速度更快,或者使用缓存或其他方法来减轻数据库的负载。我开发过一个报表处理量更大的应用,用户数量有限,但每个应用同时运行 5-6 个查询,所有查询的运行时间都在 100-200 毫秒之间。在这种情况下,我将 max
提升到了 50。不过,通常情况下,我不会费心将其设置为默认值 10
以外的任何值,因为通常情况下这样就可以了。
¥In this situation, I'd probably set the max
to 20 or 25. This lets you have plenty of headroom for scaling more instances and realistically, if your app is starved for db connections, you probably want to take a look at your queries and make them execute faster, or cache, or something else to reduce the load on the database. I worked on a more reporting-heavy application with limited users, but each running 5-6 queries at a time which all took 100-200 milliseconds to run. In that situation, I upped the max
to 50. Typically, though, I don't bother setting it to anything other than the default of 10
as that's usually fine.
自动伸缩、云函数、多租户等。
¥Auto-scaling, cloud-functions, multi-tenancy, etc.
如果连接到数据库的服务实例数量更具动态性,并且取决于负载、自动扩展容器或在云函数中运行等因素,则需要更仔细地考虑最大值。通常在这些环境中,数据库前面会有另一个数据库池代理,例如 pg-bouncer 或 RDS-proxy 等。我不确定所有这些代理的具体功能,它们都有一些权衡,但我们假设你没有使用代理。那么,我会非常谨慎地设置每个池的大小。如果你运行的应用负载非常高,需要动态扩展或大量 lambda 表达式启动并发送查询,那么你的查询速度可能很快,你可以将 max
设置为较低的值(例如 10) - 或者干脆不设置它,因为 10
是默认值。
¥If the number of instances of your services which connect to your database is more dynamic and based on things like load, auto-scaling containers, or running in cloud-functions, you need to be a bit more thoughtful about what your max might be. Often in these environments, there will be another database pooling proxy in front of the database like pg-bouncer or the RDS-proxy, etc. I'm not sure how all these function exactly, and they all have some trade-offs, but let's assume you're not using a proxy. Then I'd be pretty cautious about how large you set any individual pool. If you're running an application under pretty serious load where you need dynamic scaling or lots of lambdas spinning up and sending queries, your queries are likely fast and you should be fine setting the max
to a low value like 10 -- or just leave it alone, since 10
is the default.
pg-bouncer、RDS-proxy 等。
¥pg-bouncer, RDS-proxy, etc.
我不确定 Postgres 的所有池化服务是否都适用。我自己没有用过这些工具。在开发 pg
的这些年里,我解决了由各种代理行为与实际 Postgres 后端不同的问题。事务等方面也存在一些陷阱。另一方面,很多人运行这些代理都取得了成功。在这种情况下,我建议使用一些较小但合理的 max
值,例如 10
的默认值,因为这样仍然有助于保持从你的服务到 Postgres 代理的几个 TCP 套接字处于打开状态。
¥I'm not sure of all the pooling services for Postgres. I haven't used any myself. Throughout the years of working on pg
, I've addressed issues caused by various proxies behaving differently than an actual Postgres backend. There are also gotchas with things like transactions. On the other hand, plenty of people run these with much success. In this situation, I would just recommend using some small but reasonable max
value like the default value of 10
as it can still be helpful to keep a few TCP sockets from your services to the Postgres proxy open.
结论,tl;dr
¥Conclusion, tl;dr
这是一个有点复杂的话题,在你需要开始扩展之前,它不会对事情产生太大影响。到那时,你的连接数可能仍然不会成为你的扩展瓶颈。这值得考虑一下,但大多数情况下,我建议将池大小保留为默认值 10
,直到遇到问题为止:希望你永远不会遇到这种情况!
¥It's a bit of a complicated topic and doesn't have much impact on things until you need to start scaling. At that point, your number of connections still probably won't be your scaling bottleneck. It's worth thinking about a bit, but mostly I'd just leave the pool size to the default of 10
until you run into troubles: hopefully you never do!