KUBERNETES

PostgreSQL on Kubernetes: Production Setup We Actually Run

Engineering Team 2026-03-19

Running databases on Kubernetes used to be a bad idea. In 2026, with mature operators like CloudNativePG, it is a viable production option — if you do it right.

We run PostgreSQL on Kubernetes for several clients. We also recommend managed databases (RDS, Cloud SQL) for others. This guide covers when each approach makes sense, and how to set up PostgreSQL on Kubernetes for production.

When to Run PostgreSQL on Kubernetes

Use Kubernetes-native PostgreSQL when:

  • You want a single platform for everything (apps + databases)
  • You need to run across multiple clouds or on-premise
  • You want database-per-tenant isolation in a multi-tenant SaaS
  • Your team already manages Kubernetes and wants to reduce external dependencies
  • Cost matters — self-managed Postgres on Kubernetes can be 40-60% cheaper than RDS

Use managed PostgreSQL (RDS, Cloud SQL, Azure Database) when:

  • Your team does not have deep Kubernetes and PostgreSQL expertise
  • You want zero operational overhead for database management
  • You need cross-region replication with minimal setup
  • You are running on a single cloud provider with no portability requirements
  • Your database is business-critical and you want vendor-backed SLAs

The honest answer: If in doubt, use managed. Running databases on Kubernetes adds operational complexity that is only justified when the benefits above outweigh the cost of managing it yourself.

Why CloudNativePG

There are several PostgreSQL operators for Kubernetes. We use CloudNativePG because:

  • CNCF Sandbox project — backed by the Cloud Native Computing Foundation
  • Does not use StatefulSets — manages pods and PVCs directly for better control over failover
  • Native streaming replication — built on PostgreSQL’s native replication, not custom solutions
  • Automated failover — promotes replicas to primary in seconds
  • Backup to object storage — continuous WAL archiving to S3/GCS/Azure Blob
  • Point-in-time recovery — restore to any second using WAL replay
  • Built-in Prometheus metrics — no additional exporters needed

Other viable options: Zalando Postgres Operator (more mature, uses Patroni), CrunchyData PGO (enterprise-focused). CloudNativePG is our default for new deployments.

Production Setup

Step 1: Install the Operator

# Install CloudNativePG operator
kubectl apply --server-side -f \
  https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/release-1.25/releases/cnpg-1.25.0.yaml

# Verify installation
kubectl get deployment -n cnpg-system cnpg-controller-manager

Or with Helm:

helm repo add cnpg https://cloudnative-pg.github.io/charts
helm install cnpg cnpg/cloudnative-pg -n cnpg-system --create-namespace

Step 2: Create a Production Cluster

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: app-db
  namespace: production
spec:
  instances: 3    # 1 primary + 2 replicas

  # PostgreSQL version
  imageName: ghcr.io/cloudnative-pg/postgresql:16.6

  # Storage
  storage:
    size: 100Gi
    storageClass: gp3-encrypted    # AWS EBS gp3 with encryption

  # Resource allocation
  resources:
    requests:
      cpu: "2"
      memory: 4Gi
    limits:
      memory: 8Gi

  # PostgreSQL configuration
  postgresql:
    parameters:
      shared_buffers: "1GB"
      effective_cache_size: "3GB"
      work_mem: "64MB"
      maintenance_work_mem: "256MB"
      max_connections: "200"
      max_wal_size: "2GB"
      min_wal_size: "512MB"
      wal_level: "replica"
      max_parallel_workers_per_gather: "4"
      random_page_cost: "1.1"        # SSD storage

  # High availability
  minSyncReplicas: 1
  maxSyncReplicas: 1

  # Backup to S3
  backup:
    barmanObjectStore:
      destinationPath: "s3://my-pg-backups/app-db/"
      s3Credentials:
        accessKeyId:
          name: s3-creds
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: s3-creds
          key: ACCESS_SECRET_KEY
      wal:
        compression: gzip
        maxParallel: 4
      data:
        compression: gzip
    retentionPolicy: "30d"

  # Anti-affinity — spread replicas across nodes
  affinity:
    enablePodAntiAffinity: true
    topologyKey: kubernetes.io/hostname

  # Monitoring
  monitoring:
    enablePodMonitor: true
    customQueriesConfigMap:
    - name: custom-pg-metrics
      key: queries

Key configuration decisions:

  • 3 instances — 1 primary, 2 replicas. Minimum for production HA. The primary handles writes, replicas handle reads and serve as failover candidates.
  • Synchronous replication (minSyncReplicas: 1) — at least one replica confirms every write. This prevents data loss during failover at the cost of slightly higher write latency.
  • Anti-affinity — ensures primary and replicas run on different nodes. If a node fails, the database survives.
  • gp3 storage — AWS EBS gp3 provides consistent IOPS. Use gp3-encrypted for encryption at rest.

Step 3: Configure Backups

CloudNativePG does continuous backup using PostgreSQL’s WAL (Write-Ahead Log) archiving. Every transaction is streamed to S3 in near real-time.

Schedule regular base backups:

apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: daily-backup
  namespace: production
spec:
  schedule: "0 2 * * *"    # 2 AM daily
  backupOwnerReference: self
  cluster:
    name: app-db
  immediate: true
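Alongside the schedule, you can trigger a one-off base backup — for example, before a risky migration — with a Backup resource. The name below is hypothetical:

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Backup
metadata:
  name: pre-migration-backup    # hypothetical name
  namespace: production
spec:
  cluster:
    name: app-db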

Test your backups. A backup that has never been restored is not a backup. Schedule monthly restore tests:

# Restore to a point in time (for testing)
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: restore-test
  namespace: testing
spec:
  instances: 1
  storage:
    size: 100Gi
    storageClass: gp3-encrypted

  bootstrap:
    recovery:
      source: app-db
      recoveryTarget:
        targetTime: "2026-03-18T14:00:00Z"    # Restore to this timestamp

  externalClusters:
  - name: app-db
    barmanObjectStore:
      destinationPath: "s3://my-pg-backups/app-db/"
      s3Credentials:
        accessKeyId:
          name: s3-creds
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: s3-creds
          key: ACCESS_SECRET_KEY
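Once the restore-test cluster reports healthy, spot-check the data. A sketch — the table name here is a placeholder; substitute checks meaningful for your schema:

```shell
# Confirm the restored cluster reached a healthy state
kubectl get cluster restore-test -n testing

# Spot-check the restored data ("orders" is an example table —
# use queries that validate your own schema)
kubectl exec -n testing restore-test-1 -- \
  psql -U postgres -d myapp -c "SELECT count(*) FROM orders;"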

Step 4: Connect Your Application

CloudNativePG creates Kubernetes Services automatically:

Service      Purpose
app-db-rw    Read-write (primary only)
app-db-ro    Read-only (replicas only)
app-db-r     Read (any instance)

Connect your application using the read-write service for writes and read-only service for read replicas:

# Application deployment
env:
- name: DATABASE_URL
  value: "postgresql://app:$(DB_PASSWORD)@app-db-rw:5432/myapp?sslmode=require"
- name: DATABASE_URL_READONLY
  value: "postgresql://app:$(DB_PASSWORD)@app-db-ro:5432/myapp?sslmode=require"
- name: DB_PASSWORD
  valueFrom:
    secretKeyRef:
      name: app-db-app
      key: password

CloudNativePG automatically generates database credentials and stores them in Kubernetes Secrets. The secret app-db-app is created automatically.
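For local debugging you can inspect the generated secret directly — a sketch, assuming jq is available:

```shell
# List the keys CloudNativePG stores in the generated secret
kubectl get secret app-db-app -n production -o jsonpath='{.data}' | jq 'keys'

# Decode the password (avoid echoing this in shared terminals)
kubectl get secret app-db-app -n production \
  -o jsonpath='{.data.password}' | base64 -d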

Step 5: Monitoring

CloudNativePG exposes Prometheus metrics natively. If you have Prometheus installed, add a PodMonitor:

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: cnpg-metrics
  namespace: production
spec:
  selector:
    matchLabels:
      cnpg.io/cluster: app-db
  podMetricsEndpoints:
  - port: metrics

CloudNativePG provides a pre-built Grafana dashboard that shows:

  • Replication lag between primary and replicas
  • Transaction throughput (TPS)
  • Active connections vs max connections
  • Buffer cache hit ratio
  • WAL generation rate
  • Disk usage and growth rate

Critical alerts to set:

# PrometheusRule for PostgreSQL alerts
groups:
- name: postgresql
  rules:
  - alert: PostgreSQLReplicationLagHigh
    expr: cnpg_pg_replication_streaming_replicas_wal_lag_bytes > 100000000
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "PostgreSQL replication lag > 100MB"

  - alert: PostgreSQLConnectionsHigh
    expr: cnpg_pg_stat_activity_count / cnpg_pg_settings_setting{name="max_connections"} > 0.8
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "PostgreSQL connections > 80% of max"

  - alert: PostgreSQLDiskUsageHigh
    expr: cnpg_pg_database_size_bytes / cnpg_pg_volume_size_bytes > 0.85
    for: 10m
    labels:
      severity: critical
    annotations:
      summary: "PostgreSQL disk usage > 85%"

Failover: What Happens When the Primary Dies

CloudNativePG handles automatic failover:

  1. The operator detects the primary pod is unhealthy (liveness probe fails)
  2. The most up-to-date replica is promoted to primary (typically < 10 seconds)
  3. The app-db-rw Service automatically points to the new primary
  4. Applications reconnect transparently (ensure your connection pool handles reconnects)
  5. A new replica is created to maintain the desired instance count
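You can rehearse this in a staging environment before trusting it in production. A sketch of a failover drill — the pod name app-db-1 and the label names are assumptions based on typical CloudNativePG labelling; check what your operator version applies:

```shell
# Find the current primary (the operator labels instance roles)
kubectl get pods -n production -l cnpg.io/cluster=app-db \
  -L cnpg.io/instanceRole

# Simulate a primary failure, then watch a replica get promoted
kubectl delete pod -n production app-db-1
kubectl get cluster app-db -n production -w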

Important: Your application must handle database reconnection gracefully. Use a connection pooler (CloudNativePG ships a PgBouncer-based Pooler resource) and configure retry logic in your application.

# Enable PgBouncer pooler
apiVersion: postgresql.cnpg.io/v1
kind: Pooler
metadata:
  name: app-db-pooler-rw
  namespace: production
spec:
  cluster:
    name: app-db
  instances: 2
  type: rw
  pgbouncer:
    poolMode: transaction
    parameters:
      max_client_conn: "1000"
      default_pool_size: "50"
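With the pooler in place, point applications at the pooler's Service instead of the direct read-write Service — the Service takes the Pooler's name:

```yaml
# Application deployment — route writes through PgBouncer
env:
- name: DATABASE_URL
  value: "postgresql://app:$(DB_PASSWORD)@app-db-pooler-rw:5432/myapp?sslmode=require"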

Cost Comparison: CloudNativePG vs RDS

For a production setup (primary + 2 replicas, 100GB storage, 2 vCPU, 4GB RAM):

Component             CloudNativePG on EKS           RDS Multi-AZ
Compute               ~$150/mo (shared EKS nodes)    ~$350/mo (db.r6g.large)
Storage               ~$30/mo (gp3 100GB x 3)        ~$35/mo (gp3 100GB)
Backup storage        ~$5/mo (S3)                    ~$10/mo (RDS backup)
Operator/management   $0 (open source)               Included
Total                 ~$185/mo                       ~$395/mo
Operational effort    Medium (your team manages)     Low (AWS manages)

CloudNativePG is ~53% cheaper but requires your team to manage upgrades, troubleshoot replication issues, and handle edge cases. The cost savings are real, but so is the operational overhead.

Common Mistakes

1. Skipping anti-affinity. Without pod anti-affinity, Kubernetes might schedule all 3 database pods on the same node. If that node fails, you lose all replicas simultaneously. Always set enablePodAntiAffinity: true.

2. Using Deployment instead of an operator. A Deployment with a PostgreSQL container is not a database cluster. It has no replication, no failover, and no backup management. Use an operator or use managed databases.

3. Not testing backups. Backups streaming to S3 are useless if you have never restored one. Schedule monthly restore tests to verify your recovery process works.

4. Undersizing storage. Growing a PVC is possible but risky. Start with 2x your expected data size and monitor growth rate. Running out of disk space is the most common Kubernetes database incident we see.

5. Missing connection pooling. PostgreSQL’s per-connection memory overhead is significant. Without PgBouncer, 500 application connections can overwhelm a database that could handle the query load easily. Always use connection pooling.
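If you do hit mistake 4 and your StorageClass supports volume expansion, CloudNativePG can grow the PVCs in place by editing the Cluster spec — a sketch, assuming allowVolumeExpansion is enabled on the StorageClass:

```yaml
# Cluster spec excerpt — increasing size triggers PVC expansion
# (requires allowVolumeExpansion: true on the StorageClass)
storage:
  size: 200Gi                   # was 100Gi
  storageClass: gp3-encrypted
  resizeInUseVolumes: true      # expand online, without restarting pods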


Need Help Running Databases on Kubernetes?

We set up and manage PostgreSQL on Kubernetes using CloudNativePG — from initial deployment to ongoing operations and performance tuning.

Our PostgreSQL consulting services cover:

  • Architecture design — choose between CloudNativePG, managed RDS, or hybrid approaches
  • Production setup — HA clusters with automated backup, failover, and monitoring
  • Migration — move from RDS, standalone PostgreSQL, or other databases to Kubernetes-native PostgreSQL
  • Performance tuning — query optimisation, connection pooling, and resource right-sizing
  • Kubernetes cluster setup — EKS, AKS, or GKE optimised for stateful workloads

Talk to our database team →
