CloudNativePG 1.29 Features Most Production Teams Miss

CloudNativePG 1.29 adds Image Catalogs, shared ServiceAccount for IAM, and podSelectorRefs for network security. Here is what each means for production PostgreSQL on Kubernetes.

Engineering Team

CloudNativePG 1.29 is the most security-focused release the project has shipped. Three features - Image Catalogs for extension management, shared ServiceAccount support for cloud IAM, and podSelectorRefs for dynamic network access control - change how teams should architect PostgreSQL on Kubernetes. If you are running CloudNativePG in production today, these are the changes that warrant an upgrade.

This post covers what each feature does, how to configure it, and the operational gotchas that the release notes do not mention.

What Is CloudNativePG?

CloudNativePG is a PostgreSQL operator for Kubernetes and a CNCF Sandbox project. Unlike the Zalando operator, which wraps Patroni and StatefulSets, CloudNativePG manages pods and PVCs directly through its own controller. The result is faster, more predictable failover - typically under 10 seconds for primary promotion - and tighter integration with Kubernetes primitives.

Version 1.29 was released in June 2026. If you are running 1.27.x, note that 1.27.4 is the final release in that series.

What Is New in CloudNativePG 1.29?

Image Catalogs: PostgreSQL Extensions Without Custom Dockerfiles

The biggest operational pain with CloudNativePG before 1.29 was extensions. If you needed pgvector, PostGIS, or pg_partman, you had to build and maintain a custom PostgreSQL image, push it to a registry, and reference it in your Cluster spec. Any PostgreSQL minor version bump required rebuilding every custom image.

Image Catalogs solve this. They provide a structured, automated way to distribute and manage extension-specific images across versions. Instead of maintaining your own Dockerfile per extension combination, you reference a catalog entry that the CNPG project maintains and signs.

The practical impact:

  • Teams using pgvector for AI workloads no longer need a CI pipeline just to keep PostgreSQL patch releases current
  • Extension images are versioned and signed as part of the CNPG release artifacts
  • You can run different PostgreSQL minor versions across clusters while using the same catalog definition

This is the feature that makes CloudNativePG genuinely viable for teams with non-trivial extension requirements. Before 1.29, “we need pgvector and PostGIS together” typically meant a bespoke build pipeline. Now it does not.
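
As a rough sketch of what referencing a catalog looks like, the manifests below use the ImageCatalog resource and the Cluster's imageCatalogRef field. The catalog name and image tags are placeholders rather than values taken from the release, and extension-specific catalog entries may need fields beyond the ones shown here.

```yaml
# Hypothetical catalog: maps a PostgreSQL major version to an image.
apiVersion: postgresql.cnpg.io/v1
kind: ImageCatalog
metadata:
  name: postgresql-catalog        # assumed name
spec:
  images:
    - major: 16
      image: ghcr.io/cloudnative-pg/postgresql:16   # placeholder tag
    - major: 17
      image: ghcr.io/cloudnative-pg/postgresql:17   # placeholder tag
---
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: my-cluster
spec:
  instances: 3
  # Point at the catalog instead of hard-coding imageName.
  imageCatalogRef:
    apiGroup: postgresql.cnpg.io
    kind: ImageCatalog
    name: postgresql-catalog
    major: 17
  storage:
    size: 100Gi
```

With this layout, a minor version bump becomes a change to the catalog entry rather than an edit to every Cluster that references it.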

Shared ServiceAccount: AWS IRSA, GCP Workload Identity, and Azure Workload Identity

Before 1.29, CloudNativePG created and managed its own ServiceAccount per cluster. This worked for simple setups, but it created a problem for cloud IAM integration. AWS IRSA, GCP Workload Identity, and Azure Workload Identity all work by annotating a ServiceAccount with a role ARN or identity config. Because CloudNativePG managed the ServiceAccount, those annotations were overwritten on operator reconciliation.

The workaround was annotating the ServiceAccount after each reconciliation - fragile and operationally unpleasant.

1.29 fixes this. You can now reference a pre-existing ServiceAccount in both the Cluster and Pooler resources:

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: my-cluster
spec:
  instances: 3
  serviceAccountTemplate:
    metadata:
      name: my-existing-sa
  storage:
    size: 100Gi
  backup:
    barmanObjectStore:
      destinationPath: s3://my-bucket/my-cluster/
      s3Credentials:
        inheritFromIAMRole: true

Platform teams can now annotate the ServiceAccount once with the IAM role, and the operator uses it without overwriting it. This is the correct pattern for:

  • S3 backups via IRSA (no static AWS credentials stored in the cluster)
  • GCS backups via Workload Identity (no service account JSON files)
  • Azure Blob backups via Azure Workload Identity (no client secrets)

If your cluster’s backup destination is object storage on any of the major clouds, this feature is the right way to handle authentication from 1.29 onward. Static credentials in Kubernetes secrets are not a good backup auth strategy for production.
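
For the IRSA case, the pre-existing ServiceAccount might look like the sketch below. The annotation key is the one EKS uses for IAM Roles for Service Accounts; the namespace, account ID, and role name are made-up placeholders. The GKE and Azure equivalents are shown as comments.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-existing-sa
  namespace: databases            # assumed; must be the Cluster's namespace
  annotations:
    # AWS IRSA (role ARN is a placeholder)
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/my-cluster-backup
    # GKE Workload Identity would use instead:
    #   iam.gke.io/gcp-service-account: backup@my-project.iam.gserviceaccount.com
    # Azure Workload Identity would use instead:
    #   azure.workload.identity/client-id: 00000000-0000-0000-0000-000000000000
```

On AWS, the IAM role's trust policy still has to allow this exact namespace and ServiceAccount name, so renaming either one means updating the role as well.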

podSelectorRefs: Dynamic Network Access Control Without CIDR Ranges

This is the most underrated feature in 1.29. PostgreSQL’s pg_hba.conf controls which hosts can connect and how they authenticate. In Kubernetes, pod IPs are ephemeral - they can change on every restart. Static IP ranges in pg_hba.conf are therefore either too permissive (broad CIDR blocks) or require constant manual updates after pod restarts.

podSelectorRefs lets you define access rules using Kubernetes label selectors instead of static IPs. CloudNativePG resolves the current pod IPs at runtime and updates pg_hba.conf dynamically.

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: my-cluster
spec:
  postgresql:
    pg_hba:
      - hostssl app appuser all scram-sha-256
  networking:
    podSelectorRefs:
      - matchLabels:
          app: my-backend
          environment: production

Only pods labeled app=my-backend and environment=production can connect. When those pods restart and get new IPs, CloudNativePG updates the rules automatically. No CIDR ranges. No manual updates. No “allow everything from the namespace subnet” shortcut.

For teams following Kubernetes secrets and network hardening practices, this closes the lateral movement risk where any pod in the same namespace could reach the database.
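
For illustration, a client workload that would match the selector in the example only needs the two labels on its pod template. The Deployment name and image below are hypothetical.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-backend                # hypothetical workload
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-backend
  template:
    metadata:
      labels:
        app: my-backend           # both labels must be on the pod itself,
        environment: production   # not only on the Deployment object
    spec:
      containers:
        - name: api
          image: registry.example.com/my-backend:1.0.0   # placeholder image
```

Because the rule follows labels, anything allowed to create pods with these labels gains the same database access, so restricting who can set them is part of the control.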

Is CloudNativePG Better Than the Zalando Operator in 2026?

CloudNativePG is the right default for new deployments. The Zalando operator is more mature and battle-tested at scale, but it uses Patroni and StatefulSets - an architecture that predates Kubernetes-native operators. PR velocity on the Zalando operator has slowed, and there is no official commercial support path.

CloudNativePG’s custom pod controller gives it real advantages:

  • Faster failover - no Patroni DCS dependency for leader election
  • Direct control over PVC lifecycle during failover scenarios
  • Native Kubernetes RBAC and network policy integration
  • CNCF governance, not dependent on a single company’s internal priorities

CrunchyData PGO is largely inactive as an open-source project. If you are evaluating operators today, CloudNativePG and Zalando are the two serious options - and CloudNativePG has significantly more momentum in 2026.

Production Gotchas CloudNativePG Documentation Does Not Cover

The S3 Prefix Naming Problem That Breaks PITR

This one catches teams at the worst possible moment - during disaster recovery.

CloudNativePG backup configuration uses two serverName fields that must point to different S3 prefixes:

  • backup.barmanObjectStore.serverName - where the running cluster writes WAL
  • externalClusters[].barmanObjectStore.serverName, selected by bootstrap.recovery.source - where WAL is read from during recovery

If these match, Barman detects existing data at the write destination and refuses to archive new WAL to avoid overwriting a valid recovery chain. The result is barman-cloud-wal-archive: exit status 1 and a silent backup failure.

This happens most often after a PITR restore. You restore from prefix my-cluster, then spin up the recovered cluster also writing to my-cluster. Same prefix, instant conflict.

The pattern that works: increment the write prefix after every restore.

Initial cluster:     serverName: my-cluster
After first PITR:    serverName: my-cluster-r1
After second PITR:   serverName: my-cluster-r2

Build this naming convention into your cluster specs from day one. Retrofitting it during an incident is not a good experience.
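
As a sketch of the first restore in that sequence, the recovered cluster reads from the old prefix through an externalClusters entry and writes to a new one. This assumes the in-tree barmanObjectStore configuration used earlier in this post; the bucket, the external cluster name, and the target time are placeholders.

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: my-cluster
spec:
  instances: 3
  storage:
    size: 100Gi
  bootstrap:
    recovery:
      source: origin                        # refers to externalClusters[].name
      recoveryTarget:
        targetTime: "2026-06-01T12:00:00Z"  # placeholder PITR target
  externalClusters:
    - name: origin                          # assumed name
      barmanObjectStore:
        destinationPath: s3://my-bucket/
        serverName: my-cluster              # read: prefix of the original cluster
        s3Credentials:
          inheritFromIAMRole: true
  backup:
    barmanObjectStore:
      destinationPath: s3://my-bucket/
      serverName: my-cluster-r1             # write: new prefix after the first PITR
      s3Credentials:
        inheritFromIAMRole: true
```

Keeping the read and write serverName values side by side in one manifest makes a mismatch easy to spot in review.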

WAL Archiving From Replicas Is Not a Valid Backup Strategy

CloudNativePG supports WAL streaming from replicas, but streaming WAL from a replica is not the same as archiving WAL from the primary. Replicas lag. Under network pressure they can miss segments or receive incomplete WAL. If replica WAL streaming is your only backup mechanism, you may discover gaps in your WAL archive exactly when you need a complete recovery chain.

WAL archiving must run from the primary. Use replicas for read scaling and connection offloading, not as your recovery source.

PgBouncer Pooler Parameter Validation Gap

CloudNativePG’s built-in Pooler resource is the correct way to handle connection pooling - it integrates with TLS, authentication, and Prometheus metrics automatically. One issue: the operator does not validate PgBouncer parameter values before applying them.

A mistyped value in pgbouncer.parameters will not produce a deployment error. It silently misconfigures the pooler. If you are setting non-default values (pool mode, max_client_conn, server_idle_timeout), test them in staging first. A misconfigured pooler can disrupt connection handling for the entire cluster without obvious errors in the operator logs.
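
A minimal Pooler with a few non-default settings might look like the following. The resource name, instance count, and every value are illustrative assumptions. Note that the pool mode is set through the dedicated poolMode field rather than through parameters, and that parameter values are written as strings.

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Pooler
metadata:
  name: my-cluster-pooler-rw      # assumed name
spec:
  cluster:
    name: my-cluster
  instances: 2
  type: rw
  pgbouncer:
    poolMode: transaction
    parameters:
      max_client_conn: "1000"     # values are quoted strings
      default_pool_size: "20"
      server_idle_timeout: "600"  # seconds
```

Since a wrong value is not rejected, comparing the effective PgBouncer settings (for example with SHOW CONFIG in its admin console) against this manifest in staging is a cheap sanity check.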

How Do You Monitor CloudNativePG in Production?

CloudNativePG exposes Prometheus metrics natively - no additional exporters needed. Each PostgreSQL instance pod exposes metrics on port 9187, and PgBouncer poolers expose metrics on port 9127 with the cnpg_pgbouncer_ prefix.

The metrics worth alerting on:

  • cnpg_collector_pg_wal_archive_status - WAL archiving failures (this is your backup health signal; treat it as critical)
  • cnpg_pg_replication_lag - replica lag; alert on sustained values above your RPO threshold
  • cnpg_backends_total - active connections; alert when approaching max_connections
  • cnpg_pgbouncer_stats_total_wait_time - client wait time in PgBouncer; rising values indicate pool saturation
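
One possible shape for alerts on the WAL archive and PgBouncer wait-time metrics is sketched below as a PrometheusRule. It assumes the Prometheus Operator CRDs are installed; the rule names, thresholds, and durations are placeholders to tune, not recommendations.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cnpg-alerts               # assumed name
spec:
  groups:
    - name: cloudnativepg
      rules:
        - alert: CNPGWalArchiveBacklog
          # WAL segments still waiting to be archived; a count that stays
          # high suggests archiving is failing. Threshold is a placeholder.
          expr: cnpg_collector_pg_wal_archive_status{value="ready"} > 10
          for: 10m
          labels:
            severity: critical
        - alert: CNPGPoolerClientsWaiting
          # total_wait_time is cumulative, so alert on its rate of increase.
          # Check the exported unit before choosing a threshold.
          expr: rate(cnpg_pgbouncer_stats_total_wait_time[5m]) > 1
          for: 10m
          labels:
            severity: warning
```

The for: clause keeps a short archiving hiccup or a brief connection burst from paging anyone; only a sustained condition fires.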

If you are already using Prometheus for Kubernetes monitoring, CloudNativePG integrates without additional scrape configuration. The operator’s ServiceMonitor is deployed alongside the operator itself.

The CloudNativePG Grafana dashboard covers the core metrics. For production use, extend it with custom panels for WAL archive lag and PgBouncer queue depth.

Supply Chain Security in 1.29

CloudNativePG 1.29 ships SLSA provenance and Software Bill of Materials (SBOM) generation for all release artifacts and container images. Images are signed with cosign from Sigstore, an OpenSSF project. For teams operating under SOC 2, ISO 27001, or FedRAMP requirements, this means the provenance of CloudNativePG artifacts can be verified from source to deployment.

This is becoming expected for any infrastructure component in 2026. The fact that CloudNativePG ships it for a database operator - not just application images - is a meaningful differentiator. Most operators do not have this in their release process yet.

What About Running Databases on Kubernetes at All?

The decision is not always CloudNativePG vs Zalando. It is sometimes CloudNativePG vs RDS. Our cloud-native database guide for 2026 covers when self-managed operators win and when managed databases are the better call. The short version: use CloudNativePG when you need multi-cloud portability, database-per-tenant isolation, or want to eliminate managed database costs at scale. Use RDS or Cloud SQL when operational simplicity and vendor SLAs matter more.

For the specifics of StatefulSets, PVC management, and storage class selection for database workloads on Kubernetes, see our Kubernetes StatefulSets and databases guide.


CloudNativePG Deployment and Consulting

Running CloudNativePG in production requires more than following the installation guide. Backup strategy, IAM integration, connection pooling, and monitoring each have production-specific decisions that documentation leaves open.

Tasrie IT Services provides PostgreSQL consulting services to help teams deploy and operate CloudNativePG correctly:

  • Backup architecture - WAL archiving with correct S3 prefix strategy, PITR testing, and recovery runbooks
  • IAM integration - IRSA, Workload Identity, and Azure Workload Identity setup for credential-free backups
  • Connection pooling - PgBouncer Pooler configuration tuned for your application’s connection pattern
  • Monitoring and alerting - Prometheus metrics, Grafana dashboards, and alerting rules for database health

We also integrate CloudNativePG into broader Kubernetes platforms as part of our Kubernetes consulting engagements.

Talk to a PostgreSQL and Kubernetes specialist at Tasrie IT Services →

Published on June 19, 2026
