
How We Consolidated 4 PostgreSQL Clusters into 1 with CloudNativePG

Staff at ZSoftly
8 min read
Tags: Kubernetes, PostgreSQL, CloudNativePG, PgBouncer, Database, DevOps, Platform Engineering, Cost Optimization, ArgoCD, GitOps, AWS, EKS
[Figure: architecture diagram of four applications connecting to a shared PostgreSQL cluster through PgBouncer connection pooling]

We had a problem. Four applications. Four separate PostgreSQL clusters. 91MB of actual data spread across 50Gi of allocated storage.

Each cluster created its own PodDisruptionBudget. Node drains during maintenance became difficult. Karpenter struggled to consolidate workloads.

This is the story of how we consolidated everything into a single HA cluster with PgBouncer connection pooling.


TL;DR

We consolidated 4 CloudNativePG clusters into 1 shared cluster with PgBouncer. The key gotcha: ArgoCD sync waves are required when using ExternalSecrets with CNPG, or the cluster bootstraps with the wrong password.

Results:

  • 4 PDBs reduced to 1
  • 50Gi storage reduced to 30Gi (40% less)
  • High availability added (2 instances with streaming replication: primary plus one replica)
  • Connection pooling enabled via PgBouncer
  • Single backup schedule instead of 4

The Problem

Our platform ran four internal applications, each with its own CloudNativePG database:

| App | Instances | Storage | CPU | Memory | Data Size |
|-------------|-----------|---------|------|--------|-----------|
| n8n | 1 | 20Gi | 100m | 256Mi | 11 MB |
| vaultwarden | 1 | 5Gi | 100m | 256Mi | 9 MB |
| wikijs | 1 | 5Gi | 100m | 256Mi | 10 MB |
| zammad | 1 | 20Gi | 250m | 512Mi | 61 MB |
| Total | 4 pods | 50Gi | 550m | 1.3Gi | 91 MB |

91MB of data. 50Gi of provisioned storage. Four separate upgrade cycles to coordinate.

The real issue was PodDisruptionBudgets. Each CNPG cluster creates a PDB that blocks node drains until replicas are ready. With single-instance clusters, nodes could not drain at all during maintenance windows.
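
To make the PDB problem concrete, here is roughly what the operator-generated budget ends up looking like for a single-instance cluster. The name and selector below are approximations, not the literal objects CNPG created for us, but the mechanics are the same: with minAvailable: 1 and only one matching pod, the eviction API can never make progress, so the drain hangs.

# Approximate shape of the PDB a single-instance CNPG cluster ends up with
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: wikijs-db                  # illustrative name
  namespace: wikijs
spec:
  minAvailable: 1                  # one instance total, so zero voluntary disruptions are allowed
  selector:
    matchLabels:
      cnpg.io/cluster: wikijs-db   # CNPG labels instance pods with their cluster name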


Why These 4 Apps?

Before consolidating, we analyzed access patterns and compatibility:

  1. Low write frequency - All four apps have minimal concurrent writes. No risk of lock contention.
  2. Independent schemas - Each app uses its own database. No shared tables or cross-database queries.
  3. Similar SLAs - All are internal tools. A brief maintenance window is acceptable.
  4. Non-critical workloads - These support internal workflows, not customer-facing production.

Critical apps keep their own clusters. Our production databases (customer data, billing, auth) remain on dedicated CNPG clusters with separate backup schedules, stricter PDBs, and isolated failure domains. Consolidation is not a universal solution—it fits workloads with compatible access patterns and similar reliability requirements.


The Solution

One shared PostgreSQL cluster with PgBouncer connection pooling. This also gave us High Availability—something the individual single-instance clusters lacked.

| Component | Instances | Storage | CPU | Memory |
|-------------|-----------|---------|------|--------|
| platform-db | 2 (HA) | 30Gi | 500m | 1Gi |
| pgbouncer | 2 | - | 100m | 128Mi |
| Total | 4 pods | 30Gi | 700m | 1.3Gi |

Same pod count. Less storage. High availability included.
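
For reference, the shared cluster is an ordinary CNPG Cluster resource. The sketch below is a minimal approximation built from the table above, not our exact manifest; the secret name matches the ExternalSecret shown in the sync-wave section below:

# Minimal sketch of the shared cluster (approximate, values taken from the table above)
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: platform-db
  namespace: platform-db
spec:
  instances: 2                    # primary + one streaming replica
  storage:
    size: 30Gi
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
  enableSuperuserAccess: true     # the init job connects as postgres
  superuserSecret:
    name: platform-db-superuser   # synced from AWS Secrets Manager by the ExternalSecret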


The Architecture

Before: 4 Separate Clusters

[Figure: four separate PostgreSQL clusters, one per application]

Each application had its own dedicated CloudNativePG cluster. Four PodDisruptionBudgets. Four backup schedules. 50Gi of storage for 91MB of data.

After: 1 Shared Cluster

[Figure: one shared PostgreSQL cluster fronted by PgBouncer]

Applications connect to platform-db-pooler.platform-db.svc on port 5432. PgBouncer handles connection pooling in transaction mode. The CNPG cluster manages replication and failover automatically.
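
The pooler itself is a CNPG Pooler resource sitting in front of the cluster; CNPG creates the platform-db-pooler Service from it. A hedged sketch (instance count from the component table, pool settings illustrative):

# Sketch of the PgBouncer pooler (pool settings illustrative)
apiVersion: postgresql.cnpg.io/v1
kind: Pooler
metadata:
  name: platform-db-pooler
  namespace: platform-db
spec:
  cluster:
    name: platform-db
  instances: 2                # two PgBouncer pods, per the component table
  type: rw                    # route connections to the current primary
  pgbouncer:
    poolMode: transaction     # transaction pooling, as described above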


The Gotcha: Sync Waves Required

Our first deployment failed. The init job could not authenticate to the database.

FATAL: password authentication failed for user "postgres"

Root cause: ArgoCD synced all resources in parallel. The CNPG Cluster bootstrapped before the ExternalSecret finished syncing the superuser password from AWS Secrets Manager, so CNPG generated its own random password. By the time the init job ran, it read the now-synced secret, whose password no longer matched the one the cluster had bootstrapped with.

The fix: ArgoCD sync waves.

# ExternalSecrets must sync FIRST (wave -1)
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: platform-db-superuser
  annotations:
    argocd.argoproj.io/sync-wave: '-1'
# CNPG Cluster syncs AFTER secrets exist (wave 0)
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: platform-db
  annotations:
    argocd.argoproj.io/sync-wave: '0'

Sync order:

  1. Wave -1: ExternalSecrets create Kubernetes secrets from AWS Secrets Manager
  2. Wave 0: CNPG Cluster reads superuserSecret (now exists with correct password)
  3. PostSync: Init job creates database users

We could not find this documented anywhere. If you use CNPG with ExternalSecrets and ArgoCD, you need sync waves.
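
For completeness, the init job mentioned in the sync order runs as an ArgoCD PostSync hook. The sketch below only shows the hook wiring; the job name, image, and command are placeholders, not our actual init script:

# PostSync hook wiring for the init job (name, image, and command are placeholders)
apiVersion: batch/v1
kind: Job
metadata:
  name: platform-db-init
  annotations:
    argocd.argoproj.io/hook: PostSync
    argocd.argoproj.io/hook-delete-policy: HookSucceeded   # remove the job once it succeeds
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: init
          image: postgres:17                                # placeholder; any image with psql works
          command: ["sh", "-c", "echo 'create app databases and users here'"]   # placeholder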


The Migration

We migrated apps in order of risk. Lowest risk first.

Migration Steps (per app)

  1. Scale down the application
  2. Dump from old cluster using postgres superuser
  3. Restore to shared cluster
  4. Grant permissions to app user
  5. Update Helm values to point to new host
  6. Push changes via GitOps
  7. Scale up and verify

Example: Wikijs Migration

# 1. Scale down
kubectl scale deployment p11-wikijs -n wikijs --replicas=0

# 2. Dump from old cluster
kubectl exec wikijs-db-1 -n wikijs -c postgres -- \
  pg_dump -U postgres -d wikijs --no-owner --no-acl > wikijs.sql

# 3. Restore to shared cluster
cat wikijs.sql | kubectl exec -i platform-db-1 -n platform-db -c postgres -- \
  psql -U postgres -d wikijs

# 4. Grant permissions
kubectl exec platform-db-1 -n platform-db -c postgres -- \
  psql -U postgres -d wikijs -c "GRANT ALL ON ALL TABLES IN SCHEMA public TO wikijs"

Then update the Helm values:

# Before
cloudnativepg:
  enabled: true
wiki:
  postgresql:
    postgresqlHost: wikijs-db-rw

# After
cloudnativepg:
  enabled: false
wiki:
  postgresql:
    postgresqlHost: platform-db-pooler.platform-db.svc

Commit, push, wait for ArgoCD sync, scale up, verify.
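
A hedged sketch of the scale-up-and-verify step for wikijs; listing the tables is just one sanity check, and your own verification will depend on the app:

# Scale back up once ArgoCD has synced the new values
kubectl scale deployment p11-wikijs -n wikijs --replicas=1
kubectl rollout status deployment p11-wikijs -n wikijs

# Sanity check: the app's tables should now exist in the shared cluster
kubectl exec platform-db-1 -n platform-db -c postgres -- \
  psql -U postgres -d wikijs -c "\dt"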

Migration Summary

| App | Tables Migrated | Result |
|-------------|-----------------|----------------|
| wikijs | 34 | 1/1 Running |
| n8n | 56 | 1/1 Running |
| vaultwarden | 30 | 1/1 Running |
| zammad | 120 | All Running |
| Total | 240 | All successful |
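
If you want to reproduce the per-database table counts above, a catalog query along these lines works (run once per database; the query form is illustrative):

# Count public-schema tables in one app database (repeat per database)
kubectl exec platform-db-1 -n platform-db -c postgres -- \
  psql -U postgres -d wikijs -tAc \
  "SELECT count(*) FROM information_schema.tables WHERE table_schema = 'public';"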

Important Notes

Use postgres superuser for dump/restore

Inside the database pod you run as the postgres OS user, and peer authentication only lets that OS user connect as the matching database role, so connections as app users fail when going through kubectl exec. Always use the postgres superuser:

# This fails
kubectl exec db-1 -c postgres -- pg_dump -U appuser -d appdb

# This works
kubectl exec db-1 -c postgres -- pg_dump -U postgres -d appdb --no-owner --no-acl

Grant permissions after restore

After restoring as postgres, the restored objects belong to postgres, so grant the app user privileges on them:

GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO appuser;
GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO appuser;
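
These grants only cover objects that already exist. If later restores or manual migrations will also run as postgres, default privileges save re-granting every time; a hedged addition, run as postgres:

-- Applies to future objects created by postgres in this schema
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON TABLES TO appuser;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON SEQUENCES TO appuser;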

Wait for ArgoCD sync before deleting resources

If you manually delete Kubernetes resources that ArgoCD manages, ArgoCD recreates them on the next reconciliation. Wait for ArgoCD to sync the new values first. Then it prunes the old resources automatically.
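
One way to watch for that from the terminal, assuming the ArgoCD Application is named wikijs (the name and namespace here are illustrative), is to poll the Application's sync and health status:

# Check sync and health status of the ArgoCD Application (name/namespace illustrative)
kubectl get application wikijs -n argocd \
  -o jsonpath='{.status.sync.status}{"  "}{.status.health.status}{"\n"}'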

ArgoCD on EKS Auto Mode

If you run ArgoCD as an EKS Auto Mode capability, the CLI requires token-based authentication. Standard argocd login does not work. See ArgoCD CLI Login on EKS Auto Mode for the correct setup.


Results

| Metric | Before | After | Change |
|--------------------|--------|--------|--------|
| CNPG Clusters | 4 | 1 | -75% |
| Total Pods | 4 | 4 | Same |
| Storage | 50Gi | 30Gi | -40% |
| CPU Requests | 550m | ~250m | -300m |
| Memory | 1.3Gi | ~500Mi | -800Mi |
| PDBs | 4 | 1 | -75% |
| HA Replicas | 0 | 1 | Added |
| Backup Jobs | 4 | 1 | -75% |
| Connection Pooling | No | Yes | Added |

The consolidation reduced operational overhead without sacrificing reliability. Node drains work smoothly with a single PDB. Backups are simpler with one schedule. Connection pooling reduces database load.

We saved approximately 300m CPU and 800Mi memory. More importantly, we eliminated 3 PodDisruptionBudgets that were blocking node consolidation.


Lessons Learned

  1. Analyze access patterns first. Not all databases should be consolidated. We chose these four because they have low write frequency, independent schemas, and similar SLAs. Critical production databases stay isolated.

  2. Consolidation can add HA, not just reduce costs. Our single-instance clusters had no replicas. The shared cluster runs PRIMARY + REPLICA with automatic failover. We improved reliability while reducing overhead.

  3. Sync waves are mandatory for CNPG + ExternalSecrets. The superuser secret must exist before the Cluster resource. Use wave -1 for ExternalSecrets, wave 0 for Cluster.

  4. Reuse existing secrets. We already had per-app database credentials in AWS Secrets Manager. No need to create new ones.

  5. Migrate in order of risk. Start with the least critical app. Build confidence before touching production data.

  6. Do not manually delete ArgoCD-managed resources. Wait for the sync. Let ArgoCD prune automatically.

  7. PgBouncer transaction mode works for most apps. All four applications worked without changes to connection handling.


Running multiple PostgreSQL clusters on Kubernetes? As an AWS Partner, ZSoftly provides Kubernetes consulting and database optimization for Canadian companies. Talk to us

