
1 Million Postgres Connections

· 3 min read
DevOps Team
DevOps & Infrastructure Team

Scaling PostgreSQL to handle 1 million concurrent clients is impossible directly, but with proper architecture, pooling, and load balancing, it can be achieved safely.

This article explains a practical setup using:

  • PostgreSQL primary and replicas
  • PgBouncer connection poolers
  • Traefik TCP load balancer
  • Docker Swarm orchestration

Server Specifications

For a production-grade setup:

| Component | Count | VM/Host | CPU | RAM | Storage |
|---|---|---|---|---|---|
| PostgreSQL Primary | 1 | VM1 | 16 | 128 GB | NVMe 1 TB RAID-10 |
| PostgreSQL Replica | 2 | VM1 + VM2 | 16 | 128 GB | NVMe 1 TB RAID-10 |
| PgBouncer (Docker Swarm) | 20–100 | VM3 (Manager) | 8 | 64 GB | Shared network |
| Docker Swarm Manager | 1 | VM3 | 8 | 64 GB | For orchestration |
info

PostgreSQL sees only the sum of all PgBouncer backend connections, not the 1 million clients.


PostgreSQL Configuration

In each PostgreSQL container:

postgresql.conf
max_connections = 500        # total backend connections from all PgBouncer pools
shared_buffers = 25GB        # ~20% of the 128 GB RAM
work_mem = 64MB
maintenance_work_mem = 2GB

Memory calculation (worst case, every connection sorting at once):

  • Shared buffers: 25 GB
  • Per-connection memory: 500 × (64 MB work_mem + 10 MB overhead) ≈ 37 GB
  • Maintenance: 2 GB
  • OS/container overhead: ~5 GB

Total per PG container: ~69 GB → fits comfortably in 128 GB RAM VMs.
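The arithmetic above can be sanity-checked in a few lines (using decimal GB, 1 GB = 1000 MB, to match the rounding in the list; the 10 MB per-backend overhead is the same assumption as above):

```python
# Worst-case memory budget for one PostgreSQL container
shared_buffers_gb = 25
maintenance_gb = 2
os_overhead_gb = 5

connections = 500          # max_connections
work_mem_mb = 64           # work_mem
per_conn_overhead_mb = 10  # assumed per-backend overhead

per_conn_gb = connections * (work_mem_mb + per_conn_overhead_mb) / 1000
total_gb = shared_buffers_gb + per_conn_gb + maintenance_gb + os_overhead_gb

print(per_conn_gb)  # 37.0
print(total_gb)     # 69.0
```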


PgBouncer Configuration

PgBouncer is used to pool millions of clients while keeping PostgreSQL backend connections low.

docker-compose.yml
version: '3.8'
services:
  pg_bouncer:
    image: edoburu/pgbouncer
    deploy:
      replicas: 20   # scale horizontally
      restart_policy:
        condition: on-failure
      resources:
        limits:
          cpus: '1.0'
          memory: 2G
    environment:
      DATABASE_URL: "postgres://pg_user:pg_password@pg_primary:5432/yourdb"
      POOL_MODE: transaction
      DEFAULT_POOL_SIZE: 25
      MAX_CLIENT_CONN: 50000
    networks:
      - pg_net
    labels:
      - "traefik.enable=true"
      - "traefik.tcp.routers.pg_bouncer.rule=HostSNI(`*`)"
      - "traefik.tcp.routers.pg_bouncer.entrypoints=pg"
      - "traefik.tcp.services.pg_bouncer.loadbalancer.server.port=6432"

networks:
  pg_net:
    driver: overlay

Key points:

  • default_pool_size = 25 → each PgBouncer instance opens 25 backend connections to PostgreSQL.

  • max_client_conn = 50,000 → each PgBouncer can accept up to 50k client connections.

  • 20 PgBouncer instances → 1 million clients served:

    20 × 50,000 = 1,000,000 clients
  • PostgreSQL sees only 20 × 25 = 500 backend connections → matches max_connections.
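These invariants are easy to encode. A small check like the following (illustrative, not part of the deployment) catches a misconfigured scale-out before it reaches PostgreSQL:

```python
# Capacity math for the PgBouncer tier
replicas = 20
default_pool_size = 25
max_client_conn = 50_000
pg_max_connections = 500   # from postgresql.conf

total_clients = replicas * max_client_conn     # client-facing capacity
total_backends = replicas * default_pool_size  # what PostgreSQL actually sees

print(total_clients)   # 1000000
print(total_backends)  # 500

# The invariant from this article: backend connections must fit in max_connections
assert total_backends <= pg_max_connections
```

Doubling the replicas to 40 would double client capacity but also push backends to 1,000, so default_pool_size (or max_connections) must be adjusted in step.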


Traefik Load Balancer

Traefik distributes client connections across PgBouncer replicas:

traefik.yml
entryPoints:
  pg:
    address: ":6432"

providers:
  docker: {}

  • Acts as a TCP load balancer in front of PgBouncer replicas.
  • Ensures clients are evenly distributed.
  • Supports horizontal scaling: add more PgBouncer instances, and Traefik automatically routes traffic.
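From the application's point of view nothing changes: clients point at Traefik's TCP entrypoint on port 6432 exactly as they would at PostgreSQL itself. A sketch of a DSN helper (the function name `traefik_dsn` and the host are illustrative, not from the deployment above):

```python
def traefik_dsn(user: str, password: str, host: str, db: str, port: int = 6432) -> str:
    """Build a DSN pointing at Traefik's TCP entrypoint, which forwards
    the raw connection to one of the PgBouncer replicas."""
    return f"postgres://{user}:{password}@{host}:{port}/{db}"

# Any libpq-based driver can consume this DSN unchanged:
dsn = traefik_dsn("pg_user", "pg_password", "traefik.internal", "yourdb")
print(dsn)  # postgres://pg_user:pg_password@traefik.internal:6432/yourdb
```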

Architecture Overview

  • PgBouncer queues clients internally when all backend connections are busy.
  • PostgreSQL sees only the sum of backend connections.
  • Transaction pooling reuses a backend per transaction → efficient for millions of clients, but session-level features (SET, LISTEN/NOTIFY, named prepared statements) are not reliable in this mode.

Practical Considerations

Important

Always ensure sum(all PgBouncer default_pool_size) ≤ PostgreSQL max_connections.

  1. Max connections: Always ensure the sum of all PgBouncer pool sizes doesn't exceed PostgreSQL's max_connections.
  2. Scaling: Add PgBouncer replicas and adjust pool sizes for high client counts.
  3. Monitoring: Watch PG resource usage and PgBouncer queue lengths.
  4. Networking: Raise open file limits (ulimit -n) and tune the TCP stack on all VMs; tens of thousands of sockets per PgBouncer exceed default OS limits.
  5. Persistence: NVMe RAID-10 recommended for PG data and WAL logs.
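For point 3, PgBouncer's admin console exposes SHOW POOLS, whose cl_waiting column counts clients queued for a free backend. A minimal saturation check over such rows might look like this (the row dicts mirror SHOW POOLS columns; the sample values and threshold are illustrative):

```python
# Flag pools whose clients are queueing, i.e. the backend pool is saturated.
# Each dict mimics one row of PgBouncer's `SHOW POOLS` admin command.
def saturated_pools(rows, max_waiting=0):
    return [
        (row["database"], row["cl_waiting"])
        for row in rows
        if row["cl_waiting"] > max_waiting
    ]

sample = [
    {"database": "yourdb", "cl_active": 25, "cl_waiting": 0},
    {"database": "yourdb", "cl_active": 25, "cl_waiting": 140},  # clients queueing
]
print(saturated_pools(sample))  # [('yourdb', 140)]
```

Feeding these rows from the admin console into your alerting system gives early warning before client-side timeouts appear.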

Summary

  • 1 million clients are supported using PgBouncer horizontal scaling + load balancing via Traefik.
  • PostgreSQL handles only hundreds of real connections → safe and efficient.
  • Docker Swarm orchestrates PgBouncer replicas and PostgreSQL containers.
  • This architecture is production-ready, scalable, and horizontally extensible.
Next Steps
  • Monitor your PgBouncer instances via the built-in admin console (SHOW STATS, SHOW POOLS)
  • Set up alerting for connection pool saturation
  • Consider read replicas for read-heavy workloads