Kubernetes

Kubernetes Interview Questions: 50 Questions We Actually Ask Candidates (2026)

Engineering Team

Preparing for a Kubernetes interview requires more than memorizing kubectl commands. Employers want engineers who understand the “why” behind the YAML—people who can debug a failing pod at 2 AM and explain their reasoning clearly.

We’ve compiled 50 Kubernetes interview questions that we actually use when hiring DevOps engineers, SREs, and platform engineers. These range from fundamental concepts to advanced troubleshooting scenarios that separate senior candidates from everyone else.

How to Use This Guide

Questions are organized by difficulty:

  • Basic (1-15): Core concepts every Kubernetes user should know
  • Intermediate (16-30): Day-to-day operational knowledge
  • Advanced (31-40): Architecture, security, and design decisions
  • Scenario-Based (41-50): Real-world troubleshooting and problem-solving

For hands-on practice, spin up a local cluster with Minikube or Kind and recreate common issues.


Basic Kubernetes Interview Questions (1-15)

1. What is Kubernetes and why do organizations use it?

Kubernetes is an open-source container orchestration platform that automates deploying, scaling, and managing containerized applications. Originally developed by Google based on 15 years of running containerized workloads, it’s now maintained by the Cloud Native Computing Foundation (CNCF).

Organizations use Kubernetes because it:

  • Automates container deployment across multiple hosts
  • Provides self-healing (restarts failed containers automatically)
  • Enables horizontal scaling based on demand
  • Manages service discovery and load balancing
  • Supports rolling updates with zero downtime

2. Explain Kubernetes architecture and its main components.

Kubernetes follows a client-server architecture with two main layers:

Control Plane (Master):

  • API Server: Front-end for the cluster; all REST commands go through it
  • etcd: Distributed key-value store holding all cluster state
  • Scheduler: Assigns Pods to Nodes based on resource requirements
  • Controller Manager: Runs controllers that maintain desired state (ReplicaSet, Node, etc.)

Data Plane (Worker Nodes):

  • Kubelet: Agent ensuring containers run in Pods as specified
  • Kube-proxy: Maintains network rules for Pod communication
  • Container Runtime: Runs containers (containerd, CRI-O)

For production clusters, understanding this architecture helps with Kubernetes security best practices.
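A quick way to see most of these components on a running cluster (exact Pod names vary by distribution, and managed services hide some control-plane parts):

# Control-plane and node components typically run as Pods in kube-system
kubectl get pods -n kube-system -o wide

# Node status as reported by each kubelet
kubectl get nodes -o wide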

3. What is a Pod and how does it differ from a container?

A Pod is the smallest deployable unit in Kubernetes. It represents one or more containers that:

  • Share the same network namespace (same IP address)
  • Can share storage volumes
  • Are always co-located and co-scheduled

Container vs Pod:

  • A container is a single isolated process
  • A Pod is a logical host for tightly-coupled containers
apiVersion: v1
kind: Pod
metadata:
  name: web-app
spec:
  containers:
  - name: nginx
    image: nginx:1.25
  - name: log-shipper  # Sidecar container
    image: fluent/fluent-bit:latest

4. What is a Namespace and when would you use one?

A Namespace is a logical partition within a cluster that provides:

  • Resource isolation between teams or environments
  • Scope for names (objects must be unique within a namespace)
  • Ability to apply resource quotas and RBAC policies

Common use cases:

kubectl get namespaces
# NAME              STATUS   AGE
# default           Active   30d
# kube-system       Active   30d
# production        Active   20d
# staging           Active   20d

Use namespaces to separate development, staging, and production workloads or to isolate different teams.
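To illustrate the quota point, a minimal ResourceQuota sketch (name and values are illustrative):

apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: staging
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "50"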

5. What’s the difference between a Deployment and a StatefulSet?

Aspect        | Deployment                        | StatefulSet
Use case      | Stateless applications            | Stateful applications (databases)
Pod identity  | Interchangeable replicas          | Stable, unique network identities
Storage       | Shared or no persistent storage   | Each Pod gets its own PersistentVolume
Scaling       | Parallel creation/deletion        | Ordered, sequential operations
Pod names     | Random suffix (web-abc123)        | Predictable (web-0, web-1, web-2)

Use Deployment for: web servers, APIs, microservices.
Use StatefulSet for: databases, message queues, distributed systems requiring stable identities.
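A minimal StatefulSet sketch showing the pieces a Deployment lacks, namely a headless Service reference and per-Pod volume claims (names are illustrative):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db-headless   # headless Service providing stable DNS (db-0.db-headless, ...)
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
      - name: postgres
        image: postgres:16
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:       # one PVC per Pod (data-db-0, data-db-1, ...)
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi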

6. What is a Service and what types exist?

A Service provides stable network access to a set of Pods. Since Pods are ephemeral and get new IPs when recreated, Services provide a consistent endpoint.

Service Types:

Type         | Description                             | Use Case
ClusterIP    | Internal IP only (default)              | Service-to-service communication
NodePort     | Exposes on a static port (30000-32767)  | Development, simple external access
LoadBalancer | Provisions a cloud load balancer        | Production external access
ExternalName | DNS CNAME to an external service        | Accessing external databases

apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  type: ClusterIP
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 8080

7. How do labels and selectors work?

Labels are key-value pairs attached to objects for identification. Selectors query objects based on their labels.

# Pod with labels
metadata:
  labels:
    app: frontend
    environment: production
    version: v2.1.0

# Service selecting Pods
spec:
  selector:
    app: frontend
    environment: production

This loose coupling allows Services to route traffic to any Pod matching the selector, enabling rolling updates without downtime.
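The same selectors work on the command line, which is handy both in interviews and on call:

# Equality-based selection
kubectl get pods -l app=frontend,environment=production

# Set-based selection
kubectl get pods -l 'environment in (production,staging)'

# Show labels alongside Pods
kubectl get pods --show-labels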

8. What is kubectl and what are the most important commands?

kubectl is the command-line tool for interacting with Kubernetes clusters.

Essential commands:

# View resources
kubectl get pods -n <namespace>
kubectl get deployments
kubectl get services

# Detailed information
kubectl describe pod <pod-name>
kubectl logs <pod-name> --follow

# Apply configurations
kubectl apply -f deployment.yaml

# Debugging
kubectl exec -it <pod-name> -- /bin/sh
kubectl port-forward <pod-name> 8080:80

# Context management
kubectl config get-contexts
kubectl config use-context <context-name>

9. What is a ConfigMap and how is it used?

A ConfigMap stores non-confidential configuration data as key-value pairs, decoupling configuration from container images.

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  DATABASE_HOST: "postgres.default.svc"
  LOG_LEVEL: "info"
  config.json: |
    {
      "feature_flags": {
        "new_ui": true
      }
    }

Consuming ConfigMaps:

spec:
  containers:
  - name: app
    env:
    - name: DATABASE_HOST
      valueFrom:
        configMapKeyRef:
          name: app-config
          key: DATABASE_HOST
    volumeMounts:
    - name: config-volume
      mountPath: /etc/config
  volumes:
  - name: config-volume
    configMap:
      name: app-config

For a deeper dive, see our guide on Kubernetes ConfigMaps.

10. What is a Secret and how does it differ from a ConfigMap?

Secrets store sensitive data like passwords, tokens, and SSH keys. Unlike ConfigMaps:

  • Data is base64-encoded (not encrypted by default)
  • Can be encrypted at rest in etcd
  • Access can be restricted via RBAC
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
data:
  username: YWRtaW4=  # base64 encoded
  password: cGFzc3dvcmQxMjM=
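kubectl can create the same Secret imperatively and handles the base64 encoding for you:

kubectl create secret generic db-credentials \
  --from-literal=username=admin \
  --from-literal=password='password123'

# Decode a value to verify it
kubectl get secret db-credentials -o jsonpath='{.data.username}' | base64 -d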

Best practice: Use external secrets managers like HashiCorp Vault or cloud-native solutions (AWS Secrets Manager, Azure Key Vault) with the External Secrets Operator.

Learn more in our Kubernetes Secrets guide.

11. What is a ReplicaSet and how does it relate to Deployments?

A ReplicaSet ensures a specified number of Pod replicas are running at any time. The relationship:

Deployment → ReplicaSet → Pods
  • Deployments manage ReplicaSets
  • ReplicaSets manage Pods
  • During rolling updates, Deployments create new ReplicaSets while scaling down old ones
  • Old ReplicaSets are retained for rollback capability

You rarely create ReplicaSets directly—use Deployments instead, which provide declarative updates and rollback features.
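You can see the relationship directly; the output below is illustrative and assumes the Deployment's Pods carry an app=web label:

kubectl get replicasets -l app=web
# NAME             DESIRED   CURRENT   READY   AGE
# web-app-7d4f8b   3         3         3       2d    <- current revision
# web-app-5c6d9a   0         0         0       9d    <- kept for rollback

# kubectl describe on the Deployment reports its new and old ReplicaSets
kubectl describe deployment web-app | grep -i replicaset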

12. How do you roll back a failed Deployment?

# View rollout history
kubectl rollout history deployment/web-app

# Roll back to previous version
kubectl rollout undo deployment/web-app

# Roll back to specific revision
kubectl rollout undo deployment/web-app --to-revision=2

# Check rollout status
kubectl rollout status deployment/web-app

Kubernetes keeps a history of ReplicaSets, enabling quick rollbacks without redeploying old images.

13. What is a DaemonSet?

A DaemonSet ensures a copy of a Pod runs on all (or selected) Nodes. When Nodes are added, Pods are automatically added; when Nodes are removed, Pods are garbage collected.

Use cases:

  • Log collectors (Fluentd, Fluent Bit)
  • Monitoring agents (Prometheus Node Exporter)
  • Network plugins (Calico, Cilium)
  • Storage daemons
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      containers:
      - name: node-exporter
        image: prom/node-exporter:latest

14. What is an Ingress and how does it differ from a Service?

An Ingress manages external HTTP/HTTPS access to Services, providing:

  • Path-based routing
  • Host-based routing
  • TLS termination
  • Load balancing

Service vs Ingress:

  • Service: Layer 4 (TCP/UDP) load balancing
  • Ingress: Layer 7 (HTTP/HTTPS) routing with URL rules
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 80
      - path: /
        pathType: Prefix
        backend:
          service:
            name: frontend-service
            port:
              number: 80

Note: Ingress requires an Ingress Controller (NGINX, Traefik, AWS ALB) to function.

15. What are resource requests and limits?

Resource requests and limits control CPU and memory allocation for containers:

  • Requests: Minimum guaranteed resources; used for scheduling decisions
  • Limits: Maximum resources a container can use; exceeding memory limits causes OOMKill
spec:
  containers:
  - name: app
    resources:
      requests:
        memory: "256Mi"
        cpu: "250m"
      limits:
        memory: "512Mi"
        cpu: "500m"

Best practice: Always set requests (for predictable scheduling) and limits (to prevent runaway containers).


Intermediate Kubernetes Interview Questions (16-30)

16. Explain the three types of probes in Kubernetes.

Probe     | Purpose                           | Failure Action
Liveness  | Is the container running?         | Restart the container
Readiness | Can the container accept traffic? | Remove the Pod from Service endpoints
Startup   | Has the app finished starting?    | Restart the container; liveness/readiness probes wait until it succeeds

spec:
  containers:
  - name: app
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 5
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 3
    startupProbe:
      httpGet:
        path: /healthz
        port: 8080
      failureThreshold: 30
      periodSeconds: 10

17. What is a Horizontal Pod Autoscaler (HPA)?

HPA automatically scales Pod replicas based on observed metrics:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

HPA queries the Metrics Server to get current resource usage and adjusts replicas accordingly.

For advanced autoscaling patterns, see our Kubernetes autoscaling guide.

18. What is Vertical Pod Autoscaler (VPA)?

VPA automatically adjusts CPU and memory requests/limits for containers based on historical usage:

  • Analyzes resource consumption over time
  • Recommends or applies optimal resource settings
  • Helps right-size containers

Modes:

  • Off: Only provides recommendations
  • Auto: Applies recommendations (may restart Pods)
  • Initial: Sets resources only at Pod creation
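A minimal VPA object sketch, assuming the VPA components from the Kubernetes autoscaler project are installed (names are illustrative):

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  updatePolicy:
    updateMode: "Off"   # recommendations only; "Auto" lets VPA apply them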

Learn more in our Kubernetes VPA guide.

19. How do PersistentVolumes (PV) and PersistentVolumeClaims (PVC) work?

PersistentVolume (PV): Cluster-level storage resource provisioned by an admin or dynamically via StorageClass.

PersistentVolumeClaim (PVC): User request for storage that binds to an available PV.

# PVC requesting storage
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: standard

# Pod using the PVC
spec:
  containers:
  - name: app
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: data-pvc

20. What are taints and tolerations?

Taints are applied to Nodes to repel Pods; tolerations allow Pods to schedule on tainted Nodes.

# Taint a node
kubectl taint nodes node1 dedicated=gpu:NoSchedule
# Pod with toleration
spec:
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"

Effects:

  • NoSchedule: Don’t schedule new Pods
  • PreferNoSchedule: Avoid scheduling if possible
  • NoExecute: Evict existing Pods and don’t schedule new ones

21. Explain Node affinity and Pod affinity/anti-affinity.

Node Affinity: Attracts Pods to specific Nodes based on labels.

spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values:
            - us-east-1a
            - us-east-1b

Pod Affinity/Anti-Affinity: Co-locate or separate Pods based on labels.

spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: web
        topologyKey: kubernetes.io/hostname

This ensures web Pods spread across different Nodes for high availability.

22. What are init containers?

Init containers run before app containers start, completing initialization tasks:

spec:
  initContainers:
  - name: init-db
    image: busybox
    command: ['sh', '-c', 'until nc -z postgres 5432; do sleep 2; done']
  - name: init-migrations
    image: myapp:latest
    command: ['./migrate', 'up']
  containers:
  - name: app
    image: myapp:latest

Use cases:

  • Wait for dependencies (databases, services)
  • Run database migrations
  • Download configuration files
  • Set up permissions

23. What is a Network Policy?

Network Policies control traffic flow between Pods at the IP/port level, implementing a zero-trust model:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-policy
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: database
    ports:
    - protocol: TCP
      port: 5432

Note: Requires a CNI plugin that supports Network Policies (Calico, Cilium, Weave).

24. How does DNS work in Kubernetes?

CoreDNS (default DNS provider) enables service discovery using internal DNS names:

<service-name>.<namespace>.svc.cluster.local

Examples:

  • postgres.default.svc.cluster.local → PostgreSQL in default namespace
  • api.production.svc.cluster.local → API service in production

Pods automatically get DNS configuration to resolve these names.
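A quick in-cluster check (busybox 1.28 is commonly used here because nslookup is flaky in some newer busybox builds):

kubectl run dns-test --rm -it --restart=Never --image=busybox:1.28 -- \
  nslookup postgres.default.svc.cluster.local

# Inspect the DNS configuration a Pod receives
kubectl exec -it <pod-name> -- cat /etc/resolv.conf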

25. What is Helm and why is it useful?

Helm is the package manager for Kubernetes, using “charts” to define, install, and upgrade applications.

Benefits:

  • Templating for environment-specific values
  • Versioned releases with rollback support
  • Dependency management
  • Reusable, shareable packages
# Install a chart
helm install my-release bitnami/postgresql

# Upgrade with new values
helm upgrade my-release bitnami/postgresql -f values-prod.yaml

# Rollback
helm rollback my-release 1

26. What is the Kubernetes Scheduler and how does it work?

The Scheduler assigns Pods to Nodes through:

  1. Filtering: Removes Nodes that can’t run the Pod (insufficient resources, taints, affinity rules)
  2. Scoring: Ranks remaining Nodes using priority functions
  3. Binding: Assigns Pod to highest-scoring Node

Scheduling factors:

  • Resource requests/limits
  • Node selectors and affinity
  • Taints and tolerations
  • Pod topology spread constraints
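When a Pod stays Pending, the scheduler explains its filtering decisions through events:

# Why is this Pod unscheduled? See the Events section
kubectl describe pod <pod-name>

# Recent scheduling failures across the cluster
kubectl get events --field-selector reason=FailedScheduling --sort-by=.lastTimestamp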

27. What is etcd and why is it critical?

etcd is a distributed, consistent key-value store that holds all cluster state:

  • Desired state (what you want)
  • Current state (what exists)
  • Configuration and secrets

Best practices:

  • Run in a cluster (minimum 3 nodes) for high availability
  • Regular backups: etcdctl snapshot save backup.db
  • Encrypt secrets at rest
  • Limit direct access; use API server
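A fuller backup command sketch; the certificate paths shown are kubeadm defaults and will differ on other distributions:

ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

# Verify the snapshot
ETCDCTL_API=3 etcdctl snapshot status /backup/etcd-snapshot.db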

28. Explain rolling update strategy.

Rolling updates replace old Pods with new ones incrementally:

spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%        # Max Pods above desired count
      maxUnavailable: 25%  # Max Pods unavailable during update

Process:

  1. Create new ReplicaSet with updated spec
  2. Scale up new ReplicaSet, scale down old
  3. Repeat until all Pods are updated
  4. Old ReplicaSet retained (scaled to 0 replicas) for rollback

29. What are Pod Disruption Budgets (PDBs)?

PDBs limit voluntary disruptions to maintain application availability:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 2  # or use maxUnavailable
  selector:
    matchLabels:
      app: web

PDBs are respected during:

  • Node drains (kubectl drain)
  • Cluster upgrades
  • Voluntary evictions

They’re not respected during involuntary disruptions (node crashes, OOM kills).

30. What is kube-proxy and how does it work?

kube-proxy runs on every Node, implementing Service networking:

Modes:

  • iptables (default): Creates iptables rules for each Service
  • IPVS: Uses kernel IPVS for better performance at scale
  • userspace: Legacy mode, rarely used

kube-proxy:

  • Watches API server for Service/Endpoint changes
  • Updates network rules accordingly
  • Enables Pods to reach Services via ClusterIP
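Two ways to check which mode kube-proxy is running in (the ConfigMap approach assumes a kubeadm-provisioned cluster):

# kubeadm stores kube-proxy configuration in a ConfigMap
kubectl -n kube-system get configmap kube-proxy -o yaml | grep mode

# kube-proxy also reports its mode on its local metrics port (run on a node)
curl -s http://localhost:10249/proxyMode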

Advanced Kubernetes Interview Questions (31-40)

31. What are Custom Resource Definitions (CRDs)?

CRDs extend the Kubernetes API with custom resource types:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.example.com
spec:
  group: example.com
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              engine:
                type: string
              size:
                type: string
  scope: Namespaced
  names:
    plural: databases
    singular: database
    kind: Database

After creating the CRD, you can create instances:

apiVersion: example.com/v1
kind: Database
metadata:
  name: my-postgres
spec:
  engine: postgresql
  size: large

32. What is a Kubernetes Operator?

An Operator combines CRDs with custom controllers to automate application management. It encodes operational knowledge (how to deploy, scale, backup, upgrade) into software.

Examples:

  • Prometheus Operator
  • PostgreSQL Operator
  • Elasticsearch Operator

Operators handle tasks like:

  • Automated backups and restores
  • Scaling decisions
  • Version upgrades
  • Failure recovery

33. Explain RBAC in Kubernetes.

Role-Based Access Control uses four resources:

Resource           | Scope        | Purpose
Role               | Namespace    | Defines permissions within a namespace
ClusterRole        | Cluster-wide | Defines permissions across the cluster
RoleBinding        | Namespace    | Grants a Role to users, groups, or service accounts
ClusterRoleBinding | Cluster-wide | Grants a ClusterRole across the cluster

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: production
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: production
subjects:
- kind: User
  name: developer@example.com
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
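kubectl auth can-i is the quickest way to verify the binding behaves as intended:

kubectl auth can-i list pods -n production --as developer@example.com      # yes
kubectl auth can-i delete pods -n production --as developer@example.com    # no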

34. What is GitOps and how does it apply to Kubernetes?

GitOps uses Git as the single source of truth for declarative infrastructure. Tools like Argo CD or Flux continuously reconcile cluster state with Git repositories.

Benefits:

  • Version-controlled infrastructure changes
  • Audit trail through Git history
  • Pull request-based change management
  • Automatic drift detection and correction
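As a sketch, an Argo CD Application that points a cluster at a Git path (repo URL and paths are illustrative):

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: web-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/k8s-manifests
    targetRevision: main
    path: apps/web
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true      # delete resources removed from Git
      selfHeal: true   # revert manual drift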

Learn more in our DevOps Kubernetes playbook.

35. How do you implement canary deployments in Kubernetes?

Native Kubernetes doesn’t support canary deployments directly. Options:

1. Two Deployments behind one Service (traffic split roughly by replica ratio):

# Stable: 90% traffic
replicas: 9
# Canary: 10% traffic
replicas: 1

2. Argo Rollouts:

apiVersion: argoproj.io/v1alpha1
kind: Rollout
spec:
  strategy:
    canary:
      steps:
      - setWeight: 10
      - pause: {duration: 1h}
      - setWeight: 50
      - pause: {duration: 1h}

3. Service Mesh (Istio): Use VirtualService to split traffic by percentage.

36. What is a Service Mesh and when would you use one?

A Service Mesh (Istio, Linkerd, Cilium) provides:

  • mTLS: Encrypted service-to-service communication
  • Traffic management: Canary, A/B testing, retries, timeouts
  • Observability: Distributed tracing, metrics, logging
  • Policy enforcement: Rate limiting, access control

Use when:

  • You need zero-trust security between services
  • Complex traffic routing requirements
  • Detailed observability across microservices
  • Multiple teams deploying independently

Avoid when:

  • Simple architectures (< 10 services)
  • Team lacks service mesh expertise
  • Overhead isn’t justified

37. How do you secure a Kubernetes cluster?

Control Plane:

  • Enable RBAC with least-privilege access
  • Encrypt etcd at rest
  • Enable audit logging
  • Restrict API server access

Workloads:

  • Use Pod Security Standards (restricted mode)
  • Run containers as non-root
  • Set read-only root filesystem
  • Define resource limits

Network:

  • Implement Network Policies (default deny)
  • Use service mesh for mTLS
  • Isolate namespaces

Supply Chain:

  • Scan images for vulnerabilities
  • Sign images with Cosign
  • Use admission controllers (OPA Gatekeeper, Kyverno)
# Pod Security Standard: restricted
apiVersion: v1
kind: Namespace
metadata:
  name: secure-ns
  labels:
    pod-security.kubernetes.io/enforce: restricted
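At the workload level, the non-root and read-only points above translate into a securityContext like this (a sketch compatible with the restricted Pod Security Standard):

spec:
  securityContext:
    runAsNonRoot: true
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: app
    image: myapp:1.0
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["ALL"]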

38. What are admission controllers?

Admission controllers intercept API requests after authentication/authorization but before persistence. They can validate or mutate requests.

Built-in controllers:

  • NamespaceLifecycle: Prevents operations in terminating namespaces
  • LimitRanger: Enforces default resource constraints
  • PodSecurity: Enforces Pod Security Standards

Custom controllers:

# Kyverno policy requiring labels
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-labels
spec:
  validationFailureAction: enforce
  rules:
  - name: check-team-label
    match:
      resources:
        kinds:
        - Pod
    validate:
      message: "label 'team' is required"
      pattern:
        metadata:
          labels:
            team: "?*"

39. How do you handle multi-cluster Kubernetes?

Approaches:

Approach     | Use Case
Federation   | Sync resources across clusters
Service Mesh | Cross-cluster service discovery (Istio multi-cluster)
GitOps       | Deploy the same config to multiple clusters via Git
Cluster API  | Provision and manage cluster lifecycle

Tools commonly used here: Istio (multi-cluster mesh), Argo CD or Flux (GitOps delivery to many clusters), and Cluster API providers for cluster lifecycle management.

40. What is the difference between imperative and declarative configuration?

Imperative: Tell Kubernetes what to do step-by-step

kubectl create deployment nginx --image=nginx
kubectl scale deployment nginx --replicas=3
kubectl expose deployment nginx --port=80

Declarative: Define desired state; let Kubernetes figure out how

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 3
  ...
kubectl apply -f deployment.yaml

Best practice: Use declarative configuration for production. It’s:

  • Version-controllable
  • Repeatable
  • Self-documenting
  • GitOps-friendly

Scenario-Based Interview Questions (41-50)

41. A Pod is stuck in CrashLoopBackOff. How do you debug it?

# 1. Check Pod status and events
kubectl describe pod <pod-name>

# 2. Check container logs (including previous crashed container)
kubectl logs <pod-name> --previous

# 3. Look for OOMKilled in status
kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[0].lastState}'

# 4. Check if it's a probe issue
kubectl get pod <pod-name> -o yaml | grep -A 10 livenessProbe

# 5. Try running interactively
kubectl run debug --image=<image> --rm -it -- /bin/sh

Common causes:

  • Application error (check logs)
  • Missing dependencies or config
  • Failed liveness probe
  • OOM killed (increase memory limit)
  • Image pull issues

42. A Service isn’t routing traffic to Pods. What do you check?

# 1. Verify Service selector matches Pod labels
kubectl get svc <service-name> -o wide
kubectl get pods --show-labels

# 2. Check Endpoints (should list Pod IPs)
kubectl get endpoints <service-name>
# Empty endpoints = selector doesn't match any Pods

# 3. Verify Pods are Ready
kubectl get pods
# Not Ready = won't receive traffic

# 4. Test from within the cluster
kubectl run test --image=busybox --rm -it -- wget -qO- <service-name>:80

# 5. Check Network Policies blocking traffic
kubectl get networkpolicies

43. How do you investigate high memory usage in a Pod?

# 1. Check current usage
kubectl top pod <pod-name>

# 2. Compare against limits
kubectl get pod <pod-name> -o jsonpath='{.spec.containers[0].resources}'

# 3. Check for OOMKilled events
kubectl describe pod <pod-name> | grep -i oom

# 4. Exec into Pod and check processes
kubectl exec -it <pod-name> -- top

# 5. Review application metrics (if available)
kubectl port-forward <pod-name> 9090:9090
# Check /metrics endpoint

Solutions:

  • Increase memory limits
  • Fix memory leaks in application
  • Add horizontal scaling
  • Review VPA recommendations

44. A node becomes NotReady. What happens and how do you respond?

What happens automatically:

  • Pods on the node are marked Unknown/NotReady
  • The node controller taints the node (node.kubernetes.io/unreachable), so new Pods avoid it
  • After the eviction timeout (roughly 5 minutes by default), Pods are evicted and rescheduled elsewhere (if managed by a Deployment or similar controller)

Investigation:

# 1. Check node status
kubectl describe node <node-name>

# 2. Check kubelet status (SSH to node)
systemctl status kubelet
journalctl -u kubelet -n 100

# 3. Check system resources
df -h  # Disk space
free -m  # Memory

# 4. Check node conditions
kubectl get node <node-name> -o jsonpath='{.status.conditions}'

Common causes:

  • Kubelet crashed
  • Disk pressure
  • Memory pressure
  • Network issues

45. How do you perform a zero-downtime upgrade of your application?

# 1. Configure rolling update strategy
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0  # Never reduce below desired count

# 2. Set readiness probe
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5

# 3. Configure PDB
apiVersion: policy/v1
kind: PodDisruptionBudget
spec:
  minAvailable: 2
# 4. Apply new version
kubectl set image deployment/web web=myapp:v2.0

# 5. Monitor rollout
kubectl rollout status deployment/web

46. How do you handle secrets rotation without restarting Pods?

Option 1: External Secrets with refresh

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
spec:
  refreshInterval: 1h

Option 2: Reloader controller
Use Stakater Reloader to automatically restart Pods when ConfigMaps/Secrets change.
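With Reloader installed, an annotation on the Deployment is enough (a sketch using Reloader's documented auto annotation):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  annotations:
    reloader.stakater.com/auto: "true"   # rolling restart when referenced ConfigMaps/Secrets change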

Option 3: Application-level reload
Design the application to watch mounted files for changes and reload secrets without a restart.

47. A deployment is running but requests are timing out. How do you diagnose?

# 1. Check if Pods are Ready
kubectl get pods -l app=myapp

# 2. Check Service endpoints
kubectl get endpoints myapp-service

# 3. Test connectivity from another Pod
kubectl run debug --rm -it --image=busybox -- sh
wget -qO- --timeout=5 myapp-service:80

# 4. Check Pod logs for errors
kubectl logs -l app=myapp --tail=100

# 5. Check Network Policies
kubectl get networkpolicies -A

# 6. Check if Ingress is configured correctly
kubectl describe ingress myapp-ingress

# 7. Check resource usage (CPU throttling?)
kubectl top pods -l app=myapp

48. How would you migrate a stateful application to Kubernetes?

  1. Assess the application:

    • Data persistence requirements
    • Network identity needs
    • Scaling characteristics
  2. Choose the right workload type:

    • StatefulSet for ordered deployment
    • Operators for complex applications (databases)
  3. Plan storage:

    volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: fast-ssd
        resources:
          requests:
            storage: 100Gi
  4. Handle data migration:

    • Use backup/restore tools
    • Set up replication before cutover
    • Plan for rollback
  5. Consider managed operators:

    • CloudNativePG for PostgreSQL
    • Strimzi for Kafka
    • MongoDB Community Operator

49. How do you implement cost optimization in a Kubernetes cluster?

# 1. Right-size resources using VPA recommendations
kubectl get vpa -A

# 2. Identify over-provisioned Pods
kubectl top pods -A --sort-by=cpu

# 3. Set resource requests/limits on all workloads
kubectl get pods -A -o json | jq '.items[] | select(.spec.containers[].resources.requests == null) | .metadata.name'

Strategies:

  • Use Cluster Autoscaler to scale nodes down
  • Implement spot/preemptible instances for non-critical workloads
  • Use namespace resource quotas
  • Schedule batch jobs during off-peak hours
  • Use tools like Kubecost for visibility

See our guide on Kubernetes cost optimization.

50. Describe your approach to upgrading a production Kubernetes cluster.

  1. Preparation:

    # Review release notes
    # Back up etcd
    etcdctl snapshot save backup.db
    
    # Document current versions
    kubectl version
    kubectl get nodes -o wide
  2. Pre-flight checks:

    • Verify all Pods healthy
    • Check PodDisruptionBudgets
    • Ensure backup/restore tested
    • Plan rollback procedure
  3. Upgrade control plane:

    • One master at a time (HA clusters)
    • Upgrade in order: API server → controller manager → scheduler
  4. Upgrade worker nodes:

    # Cordon node (prevent new Pods)
    kubectl cordon node1
    
    # Drain node (evict Pods)
    kubectl drain node1 --ignore-daemonsets
    
    # Upgrade kubelet and kubectl
    # Uncordon node
    kubectl uncordon node1
  5. Post-upgrade:

    • Verify all components healthy
    • Update add-ons (CNI, CoreDNS, Ingress)
    • Test critical applications
    • Monitor for issues
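If the cluster is kubeadm-managed (an assumption; managed services like EKS, AKS, and GKE have their own upgrade flows), the commands look roughly like this, with the version as a placeholder:

# On the first control-plane node
kubeadm upgrade plan
kubeadm upgrade apply v1.30.0        # placeholder target version

# On remaining control-plane and worker nodes
kubeadm upgrade node

# Then install the matching kubelet package on each node and restart it
systemctl restart kubelet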

Interview Preparation Tips

Before the Interview

  1. Hands-on practice: Set up a local cluster with Minikube or Kind. Break things intentionally and fix them.

  2. Review core concepts: Understand the “why” behind each resource type, not just syntax.

  3. Know your experience: Be ready to discuss specific Kubernetes challenges you’ve solved.

  4. Stay current: Review recent Kubernetes news and updates.

During the Interview

  1. Think out loud: Explain your debugging process step-by-step.

  2. Ask clarifying questions: “Is this a single-tenant or multi-tenant cluster?” “What CNI are they using?”

  3. Acknowledge what you don’t know: It’s better to say “I’d need to look that up” than to guess incorrectly.

  4. Connect to real experience: “In my last role, we handled this by…”


Build Kubernetes Expertise with Expert Guidance

Preparing for Kubernetes interviews—or building production-ready clusters—is easier with experienced guidance.

Our Kubernetes consulting services help teams:

  • Architect production clusters on AWS EKS, Azure AKS, or Google GKE
  • Implement security best practices from RBAC to Network Policies
  • Optimize costs with right-sizing and autoscaling
  • Train engineering teams on Kubernetes operations

We’ve helped organizations from startups to enterprises build reliable, scalable Kubernetes platforms.

Schedule a Kubernetes consultation →
