# 1 Million Postgres Connections
PostgreSQL cannot hold 1 million concurrent client connections directly, but with the right architecture, connection pooling, and load balancing, it can serve that many clients safely.
This article explains a practical setup using:
- PostgreSQL primary and replicas
- PgBouncer connection poolers
- Traefik TCP load balancer
- Docker Swarm orchestration
## Server Specifications
For a production-grade setup:
| Component | Count | VM/Host | CPU | RAM | Storage |
|---|---|---|---|---|---|
| PostgreSQL Primary | 1 | VM1 | 16 | 128 GB | NVMe 1 TB RAID-10 |
| PostgreSQL Replica | 2 | VM1 + VM2 | 16 | 128 GB | NVMe 1 TB RAID-10 |
| PgBouncer (Docker Swarm) | 20–100 | VM3 (Manager) | 8 | 64 GB | Shared network |
| Docker Swarm Manager | 1 | VM3 | 8 | 64 GB | For orchestration |
PostgreSQL sees only the sum of all PgBouncer backend connections, not the 1 million clients.
## PostgreSQL Configuration
In each PostgreSQL container:
```ini
max_connections = 500         # realistic backend limit
shared_buffers = 25GB         # ~25% of RAM
work_mem = 64MB
maintenance_work_mem = 2GB
```
Memory calculation:
- Shared buffers: 25 GB
- Per-connection memory: 500 × (64 MB + 10 MB overhead) ≈ 37 GB
- Maintenance: 2 GB
- OS/container overhead: ~5 GB
Total: ~69 GB, which fits comfortably in a 128 GB RAM VM.
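The budget above can be checked with a few lines of Python (the 10 MB per-backend overhead is the article's own estimate; small rounding differences from the figures above are expected):

```python
# Sanity-check of the per-VM memory budget (figures from the article).
shared_buffers_gb = 25
max_connections = 500
per_conn_mb = 64 + 10              # work_mem + per-backend overhead (estimate)
maintenance_gb = 2
os_overhead_gb = 5

conn_memory_gb = max_connections * per_conn_mb / 1024
total_gb = shared_buffers_gb + conn_memory_gb + maintenance_gb + os_overhead_gb

print(f"per-connection total: {conn_memory_gb:.1f} GB")  # ~36 GB
print(f"grand total:          {total_gb:.1f} GB")        # ~68 GB, fits in 128 GB
```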
## PgBouncer Configuration
PgBouncer is used to pool millions of clients while keeping PostgreSQL backend connections low.
```yaml
version: '3.8'

services:
  pg_bouncer:
    image: edoburu/pgbouncer
    deploy:
      replicas: 20              # scale horizontally
      restart_policy:
        condition: on-failure
      resources:
        limits:
          cpus: '1.0'
          memory: 2G
    environment:
      DATABASE_URL: "postgres://pg_user:pg_password@pg_primary:5432/yourdb"
      POOL_MODE: transaction
      DEFAULT_POOL_SIZE: 25
      MAX_CLIENT_CONN: 50000
    networks:
      - pg_net
    labels:
      - "traefik.enable=true"
      - "traefik.tcp.routers.pg_bouncer.rule=HostSNI(`*`)"
      - "traefik.tcp.routers.pg_bouncer.entrypoints=pg"
      - "traefik.tcp.services.pg_bouncer.loadbalancer.server.port=6432"

networks:
  pg_net:
    driver: overlay
```
Key points:
- `default_pool_size = 25`: each PgBouncer instance opens at most 25 backend connections to PostgreSQL.
- `max_client_conn = 50,000`: each PgBouncer instance accepts up to 50,000 client connections.
- 20 PgBouncer instances serve 20 × 50,000 = 1,000,000 clients.
- PostgreSQL sees only 20 × 25 = 500 backend connections, matching `max_connections`.
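The fan-in/fan-out arithmetic above, as a quick sanity check:

```python
# Capacity check for the pooling layer (values from the article).
pgbouncer_replicas = 20
max_client_conn = 50_000       # clients accepted per PgBouncer instance
default_pool_size = 25         # backend connections per PgBouncer instance
pg_max_connections = 500       # PostgreSQL's max_connections

total_clients = pgbouncer_replicas * max_client_conn     # 1,000,000
total_backends = pgbouncer_replicas * default_pool_size  # 500

# The invariant: backend connections must not exceed max_connections.
assert total_backends <= pg_max_connections
print(total_clients, total_backends)  # 1000000 500
```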
## Traefik Load Balancer
Traefik distributes client connections across PgBouncer replicas:
```yaml
entryPoints:
  pg:
    address: ":6432"

providers:
  docker: {}
```
- Acts as a TCP load balancer in front of PgBouncer replicas.
- Ensures clients are evenly distributed.
- Supports horizontal scaling: add more PgBouncer instances, and Traefik automatically routes traffic.
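Clients then connect to the Traefik entrypoint rather than to PostgreSQL or any individual PgBouncer. As a sketch, with `traefik.example.com` standing in for your Traefik host:

```shell
# Connect through the Traefik TCP entrypoint on port 6432.
# "traefik.example.com" is a placeholder; use your Traefik host.
psql "host=traefik.example.com port=6432 dbname=yourdb user=pg_user"
```

The client never needs to know how many PgBouncer replicas exist behind the entrypoint.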
## Architecture Overview
- PgBouncer queues clients internally.
- PostgreSQL sees only the sum of backend connections.
- Transaction pooling allows short-lived connections, which is efficient for millions of clients.
## Practical Considerations
- Max connections: always ensure sum(all PgBouncer `default_pool_size`) ≤ PostgreSQL's `max_connections`.
- Scaling: add PgBouncer replicas and adjust pool sizes as client counts grow.
- Monitoring: watch PostgreSQL resource usage and PgBouncer queue lengths.
- Networking: high open file limits (`ulimit -n`) and TCP tuning are required on all VMs.
- Persistence: NVMe RAID-10 is recommended for PostgreSQL data and WAL.
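The networking point deserves concrete numbers. A sketch of the kind of tuning involved, with illustrative values (tune to your workload, and persist them via `/etc/security/limits.conf` and `/etc/sysctl.conf`):

```shell
# Raise the per-process open-file limit for the current shell.
ulimit -n 1048576

# Widen the ephemeral port range for outbound connections.
sysctl -w net.ipv4.ip_local_port_range="1024 65535"

# Allow deeper accept and SYN backlogs for connection bursts.
sysctl -w net.core.somaxconn=65535
sysctl -w net.ipv4.tcp_max_syn_backlog=65535
```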
## Summary
- 1 million clients are supported using PgBouncer horizontal scaling + load balancing via Traefik.
- PostgreSQL handles only hundreds of real connections, keeping it safe and efficient.
- Docker Swarm orchestrates PgBouncer replicas and PostgreSQL containers.
- This architecture is production-ready, scalable, and horizontally extensible.
- Monitor your PgBouncer instances with PgBouncer's built-in statistics.
- Set up alerting for connection pool saturation.
- Consider read replicas for read-heavy workloads.
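For the monitoring point: PgBouncer exposes its statistics through a virtual `pgbouncer` admin database. A quick way to inspect pool saturation (host, port, and user below are assumptions for this setup):

```shell
# Inspect pool state: active/waiting clients, active server connections.
psql -h 127.0.0.1 -p 6432 -U pg_user pgbouncer -c "SHOW POOLS;"

# Aggregate per-database traffic statistics.
psql -h 127.0.0.1 -p 6432 -U pg_user pgbouncer -c "SHOW STATS;"
```

A growing `cl_waiting` count in `SHOW POOLS` is the signal that pools are saturated and replicas or pool sizes need to grow.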