Preparing for a Kubernetes interview requires more than memorizing kubectl commands. Employers want engineers who understand the “why” behind the YAML—people who can debug a failing pod at 2 AM and explain their reasoning clearly.
We’ve compiled 50 Kubernetes interview questions that we actually use when hiring DevOps engineers, SREs, and platform engineers. These range from fundamental concepts to advanced troubleshooting scenarios that separate senior candidates from everyone else.
How to Use This Guide
Questions are organized by difficulty:
- Basic (1-15): Core concepts every Kubernetes user should know
- Intermediate (16-30): Day-to-day operational knowledge
- Advanced (31-40): Architecture, security, and design decisions
- Scenario-Based (41-50): Real-world troubleshooting and problem-solving
For hands-on practice, spin up a local cluster with Minikube or Kind and recreate common issues.
Basic Kubernetes Interview Questions (1-15)
1. What is Kubernetes and why do organizations use it?
Kubernetes is an open-source container orchestration platform that automates deploying, scaling, and managing containerized applications. Originally developed by Google based on 15 years of running containerized workloads, it’s now maintained by the Cloud Native Computing Foundation (CNCF).
Organizations use Kubernetes because it:
- Automates container deployment across multiple hosts
- Provides self-healing (restarts failed containers automatically)
- Enables horizontal scaling based on demand
- Manages service discovery and load balancing
- Supports rolling updates with zero downtime
2. Explain Kubernetes architecture and its main components.
Kubernetes follows a control plane/worker-node architecture with two main layers:
Control Plane (Master):
- API Server: Front-end for the cluster; all REST commands go through it
- etcd: Distributed key-value store holding all cluster state
- Scheduler: Assigns Pods to Nodes based on resource requirements
- Controller Manager: Runs controllers that maintain desired state (ReplicaSet, Node, etc.)
Data Plane (Worker Nodes):
- Kubelet: Agent ensuring containers run in Pods as specified
- Kube-proxy: Maintains network rules for Pod communication
- Container Runtime: Runs containers (containerd, CRI-O)
For production clusters, understanding this architecture helps with Kubernetes security best practices.
3. What is a Pod and how does it differ from a container?
A Pod is the smallest deployable unit in Kubernetes. It represents one or more containers that:
- Share the same network namespace (same IP address)
- Can share storage volumes
- Are always co-located and co-scheduled
Container vs Pod:
- A container is a single isolated process
- A Pod is a logical host for tightly-coupled containers
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-app
spec:
  containers:
  - name: nginx
    image: nginx:1.25
  - name: log-shipper   # Sidecar container
    image: fluent-bit:latest
```
4. What is a Namespace and when would you use one?
A Namespace is a logical partition within a cluster that provides:
- Resource isolation between teams or environments
- Scope for names (objects must be unique within a namespace)
- Ability to apply resource quotas and RBAC policies
Common use cases:
```bash
kubectl get namespaces
# NAME          STATUS   AGE
# default       Active   30d
# kube-system   Active   30d
# production    Active   20d
# staging       Active   20d
```
Use namespaces to separate development, staging, and production workloads or to isolate different teams.
5. What’s the difference between a Deployment and a StatefulSet?
| Aspect | Deployment | StatefulSet |
|---|---|---|
| Use case | Stateless applications | Stateful applications (databases) |
| Pod identity | Interchangeable replicas | Stable, unique network identities |
| Storage | Shared or no persistent storage | Each Pod gets its own PersistentVolume |
| Scaling | Parallel creation/deletion | Ordered, sequential operations |
| Pod names | Random suffix (web-abc123) | Predictable (web-0, web-1, web-2) |
Use Deployment for: Web servers, APIs, microservices.
Use StatefulSet for: Databases, message queues, distributed systems requiring stable identities.
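To make the contrast concrete, here is a minimal StatefulSet sketch (the `web` names and 1Gi size are illustrative):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web        # headless Service giving Pods stable DNS names (web-0.web, ...)
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.25
  volumeClaimTemplates:   # each replica gets its own PersistentVolumeClaim
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
```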
6. What is a Service and what types exist?
A Service provides stable network access to a set of Pods. Since Pods are ephemeral and get new IPs when recreated, Services provide a consistent endpoint.
Service Types:
| Type | Description | Use Case |
|---|---|---|
| ClusterIP | Internal IP only (default) | Service-to-service communication |
| NodePort | Exposes on static port (30000-32767) | Development, simple external access |
| LoadBalancer | Provisions cloud load balancer | Production external access |
| ExternalName | DNS CNAME to external service | Accessing external databases |
```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  type: ClusterIP
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 8080
```
7. How do labels and selectors work?
Labels are key-value pairs attached to objects for identification. Selectors query objects based on their labels.
```yaml
# Pod with labels
metadata:
  labels:
    app: frontend
    environment: production
    version: v2.1.0
```

```yaml
# Service selecting Pods
spec:
  selector:
    app: frontend
    environment: production
```
This loose coupling allows Services to route traffic to any Pod matching the selector, enabling rolling updates without downtime.
8. What is kubectl and what are the most important commands?
kubectl is the command-line tool for interacting with Kubernetes clusters.
Essential commands:
```bash
# View resources
kubectl get pods -n <namespace>
kubectl get deployments
kubectl get services

# Detailed information
kubectl describe pod <pod-name>
kubectl logs <pod-name> --follow

# Apply configurations
kubectl apply -f deployment.yaml

# Debugging
kubectl exec -it <pod-name> -- /bin/sh
kubectl port-forward <pod-name> 8080:80

# Context management
kubectl config get-contexts
kubectl config use-context <context-name>
```
9. What is a ConfigMap and how is it used?
A ConfigMap stores non-confidential configuration data as key-value pairs, decoupling configuration from container images.
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  DATABASE_HOST: "postgres.default.svc"
  LOG_LEVEL: "info"
  config.json: |
    {
      "feature_flags": {
        "new_ui": true
      }
    }
```
Consuming ConfigMaps:
```yaml
spec:
  containers:
  - name: app
    env:
    - name: DATABASE_HOST
      valueFrom:
        configMapKeyRef:
          name: app-config
          key: DATABASE_HOST
    volumeMounts:
    - name: config-volume
      mountPath: /etc/config
  volumes:
  - name: config-volume
    configMap:
      name: app-config
```
For a deeper dive, see our guide on Kubernetes ConfigMaps.
10. What is a Secret and how does it differ from a ConfigMap?
Secrets store sensitive data like passwords, tokens, and SSH keys. Unlike ConfigMaps:
- Data is base64-encoded (not encrypted by default)
- Can be encrypted at rest in etcd
- Access can be restricted via RBAC
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
data:
  username: YWRtaW4=           # base64 encoded
  password: cGFzc3dvcmQxMjM=
```
Best practice: Use external secrets managers like HashiCorp Vault or cloud-native solutions (AWS Secrets Manager, Azure Key Vault) with the External Secrets Operator.
Learn more in our Kubernetes Secrets guide.
11. What is a ReplicaSet and how does it relate to Deployments?
A ReplicaSet ensures a specified number of Pod replicas are running at any time. The relationship:
Deployment → ReplicaSet → Pods
- Deployments manage ReplicaSets
- ReplicaSets manage Pods
- During rolling updates, Deployments create new ReplicaSets while scaling down old ones
- Old ReplicaSets are retained for rollback capability
You rarely create ReplicaSets directly—use Deployments instead, which provide declarative updates and rollback features.
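To see the relationship on a live cluster, you can list the ReplicaSets a Deployment owns (a quick sketch; the `app=web-app` label is illustrative):

```bash
# ReplicaSets created by the Deployment (one per revision)
kubectl get replicasets -l app=web-app

# ownerReferences shows which Deployment manages a given ReplicaSet
kubectl get rs <replicaset-name> -o jsonpath='{.metadata.ownerReferences[0].name}'
```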
12. How do you roll back a failed Deployment?
```bash
# View rollout history
kubectl rollout history deployment/web-app

# Roll back to previous version
kubectl rollout undo deployment/web-app

# Roll back to specific revision
kubectl rollout undo deployment/web-app --to-revision=2

# Check rollout status
kubectl rollout status deployment/web-app
```
Kubernetes keeps a history of ReplicaSets, enabling quick rollbacks without redeploying old images.
13. What is a DaemonSet?
A DaemonSet ensures a copy of a Pod runs on all (or selected) Nodes. When Nodes are added, Pods are automatically added; when Nodes are removed, Pods are garbage collected.
Use cases:
- Log collectors (Fluentd, Fluent Bit)
- Monitoring agents (Prometheus Node Exporter)
- Network plugins (Calico, Cilium)
- Storage daemons
```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      containers:
      - name: node-exporter
        image: prom/node-exporter:latest
```
14. What is an Ingress and how does it differ from a Service?
An Ingress manages external HTTP/HTTPS access to Services, providing:
- Path-based routing
- Host-based routing
- TLS termination
- Load balancing
Service vs Ingress:
- Service: Layer 4 (TCP/UDP) load balancing
- Ingress: Layer 7 (HTTP/HTTPS) routing with URL rules
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 80
      - path: /
        pathType: Prefix
        backend:
          service:
            name: frontend-service
            port:
              number: 80
```
Note: Ingress requires an Ingress Controller (NGINX, Traefik, AWS ALB) to function.
15. What are resource requests and limits?
Resource requests and limits control CPU and memory allocation for containers:
- Requests: Minimum guaranteed resources; used for scheduling decisions
- Limits: Maximum resources a container can use; exceeding memory limits causes OOMKill
```yaml
spec:
  containers:
  - name: app
    resources:
      requests:
        memory: "256Mi"
        cpu: "250m"
      limits:
        memory: "512Mi"
        cpu: "500m"
```
Best practice: Always set requests (for predictable scheduling) and limits (to prevent runaway containers).
Intermediate Kubernetes Interview Questions (16-30)
16. Explain the three types of probes in Kubernetes.
| Probe | Purpose | Failure Action |
|---|---|---|
| Liveness | Is the container running? | Restart container |
| Readiness | Can the container accept traffic? | Remove from Service endpoints |
| Startup | Has the app finished starting? | Delay other probes |
```yaml
spec:
  containers:
  - name: app
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 5
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 3
    startupProbe:
      httpGet:
        path: /healthz
        port: 8080
      failureThreshold: 30
      periodSeconds: 10
```
17. What is a Horizontal Pod Autoscaler (HPA)?
HPA automatically scales Pod replicas based on observed metrics:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```
HPA queries the Metrics Server to get current resource usage and adjusts replicas accordingly.
For advanced autoscaling patterns, see our Kubernetes autoscaling guide.
18. What is Vertical Pod Autoscaler (VPA)?
VPA automatically adjusts CPU and memory requests/limits for containers based on historical usage:
- Analyzes resource consumption over time
- Recommends or applies optimal resource settings
- Helps right-size containers
Modes:
- Off: Only provides recommendations
- Auto: Applies recommendations (may restart Pods)
- Initial: Sets resources only at Pod creation
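A minimal VPA manifest sketch, assuming the VPA components and CRDs are installed in the cluster:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  updatePolicy:
    updateMode: "Off"   # recommendations only; "Auto" would apply them (restarting Pods)
```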
Learn more in our Kubernetes VPA guide.
19. How do PersistentVolumes (PV) and PersistentVolumeClaims (PVC) work?
PersistentVolume (PV): Cluster-level storage resource provisioned by an admin or dynamically via StorageClass.
PersistentVolumeClaim (PVC): User request for storage that binds to an available PV.
```yaml
# PVC requesting storage
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: standard
```

```yaml
# Pod using the PVC
spec:
  containers:
  - name: app
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: data-pvc
```
20. What are taints and tolerations?
Taints are applied to Nodes to repel Pods; tolerations allow Pods to schedule on tainted Nodes.
```bash
# Taint a node
kubectl taint nodes node1 dedicated=gpu:NoSchedule
```

```yaml
# Pod with toleration
spec:
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
```
Effects:
- NoSchedule: Don’t schedule new Pods
- PreferNoSchedule: Avoid scheduling if possible
- NoExecute: Evict existing Pods and don’t schedule new ones
21. Explain Node affinity and Pod affinity/anti-affinity.
Node Affinity: Attracts Pods to specific Nodes based on labels.
```yaml
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values:
            - us-east-1a
            - us-east-1b
```
Pod Affinity/Anti-Affinity: Co-locate or separate Pods based on labels.
```yaml
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: web
        topologyKey: kubernetes.io/hostname
```
This ensures web Pods spread across different Nodes for high availability.
22. What are init containers?
Init containers run before app containers start, completing initialization tasks:
```yaml
spec:
  initContainers:
  - name: init-db
    image: busybox
    command: ['sh', '-c', 'until nc -z postgres 5432; do sleep 2; done']
  - name: init-migrations
    image: myapp:latest
    command: ['./migrate', 'up']
  containers:
  - name: app
    image: myapp:latest
```
Use cases:
- Wait for dependencies (databases, services)
- Run database migrations
- Download configuration files
- Set up permissions
23. What is a Network Policy?
Network Policies control traffic flow between Pods at the IP/port level, implementing a zero-trust model:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-policy
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: database
    ports:
    - protocol: TCP
      port: 5432
```
Note: Requires a CNI plugin that supports Network Policies (Calico, Cilium, Weave).
24. How does DNS work in Kubernetes?
CoreDNS (default DNS provider) enables service discovery using internal DNS names:
```
<service-name>.<namespace>.svc.cluster.local
```

Examples:

- `postgres.default.svc.cluster.local` → PostgreSQL in the default namespace
- `api.production.svc.cluster.local` → API service in the production namespace
Pods automatically get DNS configuration to resolve these names.
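You can verify resolution from inside the cluster with a throwaway Pod (a quick sketch):

```bash
kubectl run dns-test --image=busybox --rm -it --restart=Never -- \
  nslookup postgres.default.svc.cluster.local
```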
25. What is Helm and why is it useful?
Helm is the package manager for Kubernetes, using “charts” to define, install, and upgrade applications.
Benefits:
- Templating for environment-specific values
- Versioned releases with rollback support
- Dependency management
- Reusable, shareable packages
```bash
# Install a chart
helm install my-release bitnami/postgresql

# Upgrade with new values
helm upgrade my-release bitnami/postgresql -f values-prod.yaml

# Rollback
helm rollback my-release 1
```
26. What is the Kubernetes Scheduler and how does it work?
The Scheduler assigns Pods to Nodes through:
- Filtering: Removes Nodes that can’t run the Pod (insufficient resources, taints, affinity rules)
- Scoring: Ranks remaining Nodes using priority functions
- Binding: Assigns Pod to highest-scoring Node
Scheduling factors:
- Resource requests/limits
- Node selectors and affinity
- Taints and tolerations
- Pod topology spread constraints
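For example, topology spread constraints tell the scheduler to balance replicas across failure domains (a minimal sketch; the `app: web` label is illustrative):

```yaml
spec:
  topologySpreadConstraints:
  - maxSkew: 1                                # per-zone replica counts may differ by at most 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: web
```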
27. What is etcd and why is it critical?
etcd is a distributed, consistent key-value store that holds all cluster state:
- Desired state (what you want)
- Current state (what exists)
- Configuration and secrets
Best practices:
- Run in a cluster (minimum 3 nodes) for high availability
- Take regular backups: `etcdctl snapshot save backup.db`
- Encrypt secrets at rest
- Limit direct access; use the API server
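In practice the snapshot command also needs the etcd endpoint and TLS credentials; a sketch assuming kubeadm's default certificate paths:

```bash
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-$(date +%F).db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key
```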
28. Explain rolling update strategy.
Rolling updates replace old Pods with new ones incrementally:
```yaml
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%          # Max Pods above desired count
      maxUnavailable: 25%    # Max Pods unavailable during update
```
Process:
- Create new ReplicaSet with updated spec
- Scale up new ReplicaSet, scale down old
- Repeat until all Pods are updated
- Old ReplicaSet retained (scaled down to 0 replicas) for rollback
29. What are Pod Disruption Budgets (PDBs)?
PDBs limit voluntary disruptions to maintain application availability:
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 2   # or use maxUnavailable
  selector:
    matchLabels:
      app: web
```
PDBs are respected during:
- Node drains (`kubectl drain`)
- Cluster upgrades
- Voluntary evictions
They’re not respected during involuntary disruptions (node crashes, OOM kills).
30. What is kube-proxy and how does it work?
kube-proxy runs on every Node, implementing Service networking:
Modes:
- iptables (default): Creates iptables rules for each Service
- IPVS: Uses kernel IPVS for better performance at scale
- userspace: Legacy mode, rarely used
kube-proxy:
- Watches API server for Service/Endpoint changes
- Updates network rules accordingly
- Enables Pods to reach Services via ClusterIP
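To check which mode a cluster runs (a quick sketch; assumes a kubeadm-style setup where kube-proxy reads its config from a ConfigMap named `kube-proxy`):

```bash
# Look for the "mode" field in the kube-proxy configuration
kubectl get configmap kube-proxy -n kube-system -o yaml | grep mode

# In iptables mode, Service rules appear as KUBE-SERVICES chains on each node
sudo iptables -t nat -L KUBE-SERVICES | head
```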
Advanced Kubernetes Interview Questions (31-40)
31. What are Custom Resource Definitions (CRDs)?
CRDs extend the Kubernetes API with custom resource types:
```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.example.com
spec:
  group: example.com
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              engine:
                type: string
              size:
                type: string
  scope: Namespaced
  names:
    plural: databases
    singular: database
    kind: Database
```
After creating the CRD, you can create instances:
```yaml
apiVersion: example.com/v1
kind: Database
metadata:
  name: my-postgres
spec:
  engine: postgresql
  size: large
```
32. What is a Kubernetes Operator?
An Operator combines CRDs with custom controllers to automate application management. It encodes operational knowledge (how to deploy, scale, backup, upgrade) into software.
Examples:
- Prometheus Operator
- PostgreSQL Operator
- Elasticsearch Operator
Operators handle tasks like:
- Automated backups and restores
- Scaling decisions
- Version upgrades
- Failure recovery
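Instead of hand-writing StatefulSets and Services, you declare intent through the operator's custom resource and its controller reconciles everything else. A hypothetical sketch modeled on typical database operators (the `PostgresCluster` kind and its fields are illustrative, not a real API):

```yaml
apiVersion: example.com/v1
kind: PostgresCluster      # hypothetical CRD installed by the operator
metadata:
  name: orders-db
spec:
  replicas: 3              # the controller creates the Pods, Services, and PVCs
  version: "16"
  backup:
    schedule: "0 2 * * *"  # the operator runs backups on this cron schedule
```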
33. Explain RBAC in Kubernetes.
Role-Based Access Control uses four resources:
| Resource | Scope | Purpose |
|---|---|---|
| Role | Namespace | Defines permissions within a namespace |
| ClusterRole | Cluster-wide | Defines permissions cluster-wide |
| RoleBinding | Namespace | Grants Role to users/groups/service accounts |
| ClusterRoleBinding | Cluster-wide | Grants ClusterRole cluster-wide |
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: production
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: production
subjects:
- kind: User
  name: developer@example.com
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```
34. What is GitOps and how does it apply to Kubernetes?
GitOps uses Git as the single source of truth for declarative infrastructure. Tools like Argo CD or Flux continuously reconcile cluster state with Git repositories.
Benefits:
- Version-controlled infrastructure changes
- Audit trail through Git history
- Pull request-based change management
- Automatic drift detection and correction
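For instance, Argo CD expresses the reconciliation target as an Application resource (a minimal sketch; the repo URL and paths are placeholders):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: web-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/infra.git  # Git as the source of truth
    targetRevision: main
    path: apps/web
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true     # remove resources deleted from Git
      selfHeal: true  # revert manual drift in the cluster
```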
Learn more in our DevOps Kubernetes playbook.
35. How do you implement canary deployments in Kubernetes?
Native Kubernetes doesn’t support canary deployments directly. Options:
1. Two Deployments behind one Service (traffic splits by replica ratio):

```yaml
# Stable: 90% traffic
replicas: 9
# Canary: 10% traffic
replicas: 1
```
2. Argo Rollouts:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
spec:
  strategy:
    canary:
      steps:
      - setWeight: 10
      - pause: {duration: 1h}
      - setWeight: 50
      - pause: {duration: 1h}
```
3. Service Mesh (Istio): Use VirtualService to split traffic by percentage.
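A minimal VirtualService sketch for the Istio option (assumes Istio is installed and that `web-stable`/`web-canary` Services exist):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: web
spec:
  hosts:
  - web
  http:
  - route:
    - destination:
        host: web-stable
      weight: 90
    - destination:
        host: web-canary
      weight: 10
```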
36. What is a Service Mesh and when would you use one?
A Service Mesh (Istio, Linkerd, Cilium) provides:
- mTLS: Encrypted service-to-service communication
- Traffic management: Canary, A/B testing, retries, timeouts
- Observability: Distributed tracing, metrics, logging
- Policy enforcement: Rate limiting, access control
Use when:
- You need zero-trust security between services
- Complex traffic routing requirements
- Detailed observability across microservices
- Multiple teams deploying independently
Avoid when:
- Simple architectures (< 10 services)
- Team lacks service mesh expertise
- Overhead isn’t justified
37. How do you secure a Kubernetes cluster?
Control Plane:
- Enable RBAC with least-privilege access
- Encrypt etcd at rest
- Enable audit logging
- Restrict API server access
Workloads:
- Use Pod Security Standards (restricted mode)
- Run containers as non-root
- Set read-only root filesystem
- Define resource limits
Network:
- Implement Network Policies (default deny)
- Use service mesh for mTLS
- Isolate namespaces
Supply Chain:
- Scan images for vulnerabilities
- Sign images with Cosign
- Use admission controllers (OPA Gatekeeper, Kyverno)
```yaml
# Pod Security Standard: restricted
apiVersion: v1
kind: Namespace
metadata:
  name: secure-ns
  labels:
    pod-security.kubernetes.io/enforce: restricted
```
38. What are admission controllers?
Admission controllers intercept API requests after authentication/authorization but before persistence. They can validate or mutate requests.
Built-in controllers:
- NamespaceLifecycle: Prevents operations in terminating namespaces
- LimitRanger: Enforces default resource constraints
- PodSecurity: Enforces Pod Security Standards
Custom controllers:
- OPA Gatekeeper: Policy enforcement using Rego
- Kyverno: Kubernetes-native policies
```yaml
# Kyverno policy requiring labels
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-labels
spec:
  validationFailureAction: enforce
  rules:
  - name: check-team-label
    match:
      resources:
        kinds:
        - Pod
    validate:
      message: "label 'team' is required"
      pattern:
        metadata:
          labels:
            team: "?*"
```
39. How do you handle multi-cluster Kubernetes?
Approaches:
| Approach | Use Case |
|---|---|
| Federation | Sync resources across clusters |
| Service Mesh | Cross-cluster service discovery (Istio multi-cluster) |
| GitOps | Deploy same config to multiple clusters via Git |
| Cluster API | Provision and manage cluster lifecycle |
Tools:
- Rancher: Multi-cluster management UI
- Argo CD ApplicationSets: Deploy to multiple clusters
- Cilium Cluster Mesh: Cross-cluster networking
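As an example of the GitOps approach at multi-cluster scale, an Argo CD ApplicationSet can stamp the same application onto every registered cluster (a minimal sketch; assumes clusters are already registered in Argo CD):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: web-app
  namespace: argocd
spec:
  generators:
  - clusters: {}                # one Application per cluster registered in Argo CD
  template:
    metadata:
      name: 'web-app-{{name}}'
    spec:
      project: default
      source:
        repoURL: https://github.com/example/infra.git
        targetRevision: main
        path: apps/web
      destination:
        server: '{{server}}'    # filled in per cluster by the generator
        namespace: production
```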
40. What is the difference between imperative and declarative configuration?
Imperative: Tell Kubernetes what to do step-by-step
```bash
kubectl create deployment nginx --image=nginx
kubectl scale deployment nginx --replicas=3
kubectl expose deployment nginx --port=80
```
Declarative: Define desired state; let Kubernetes figure out how
```yaml
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 3
  ...
```

```bash
kubectl apply -f deployment.yaml
```
Best practice: Use declarative configuration for production. It’s:
- Version-controllable
- Repeatable
- Self-documenting
- GitOps-friendly
Scenario-Based Interview Questions (41-50)
41. A Pod is stuck in CrashLoopBackOff. How do you debug it?
```bash
# 1. Check Pod status and events
kubectl describe pod <pod-name>

# 2. Check container logs (including the previous crashed container)
kubectl logs <pod-name> --previous

# 3. Look for OOMKilled in status
kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[0].lastState}'

# 4. Check if it's a probe issue
kubectl get pod <pod-name> -o yaml | grep -A 10 livenessProbe

# 5. Try running interactively
kubectl run debug --image=<image> --rm -it -- /bin/sh
```
Common causes:
- Application error (check logs)
- Missing dependencies or config
- Failed liveness probe
- OOM killed (increase memory limit)
- Image pull issues
42. A Service isn’t routing traffic to Pods. What do you check?
```bash
# 1. Verify Service selector matches Pod labels
kubectl get svc <service-name> -o wide
kubectl get pods --show-labels

# 2. Check Endpoints (should list Pod IPs)
kubectl get endpoints <service-name>
# Empty endpoints = selector doesn't match any Pods

# 3. Verify Pods are Ready
kubectl get pods
# Not Ready = won't receive traffic

# 4. Test from within the cluster
kubectl run test --image=busybox --rm -it -- wget -qO- <service-name>:80

# 5. Check Network Policies blocking traffic
kubectl get networkpolicies
```
43. How do you investigate high memory usage in a Pod?
```bash
# 1. Check current usage
kubectl top pod <pod-name>

# 2. Compare against limits
kubectl get pod <pod-name> -o jsonpath='{.spec.containers[0].resources}'

# 3. Check for OOMKilled events
kubectl describe pod <pod-name> | grep -i oom

# 4. Exec into the Pod and check processes
kubectl exec -it <pod-name> -- top

# 5. Review application metrics (if available)
kubectl port-forward <pod-name> 9090:9090
# Check the /metrics endpoint
```
Solutions:
- Increase memory limits
- Fix memory leaks in application
- Add horizontal scaling
- Review VPA recommendations
44. A node becomes NotReady. What happens and how do you respond?
What happens automatically:
- Pods on the node are marked `Unknown`
- After the eviction timeout (5 minutes by default), Pods are evicted and rescheduled (if managed by a Deployment)
- The node controller taints the node so no new Pods are scheduled there
Investigation:
```bash
# 1. Check node status
kubectl describe node <node-name>

# 2. Check kubelet status (SSH to the node)
systemctl status kubelet
journalctl -u kubelet -n 100

# 3. Check system resources
df -h    # disk space
free -m  # memory

# 4. Check node conditions
kubectl get node <node-name> -o jsonpath='{.status.conditions}'
```
Common causes:
- Kubelet crashed
- Disk pressure
- Memory pressure
- Network issues
45. How do you perform a zero-downtime upgrade of your application?
```yaml
# 1. Configure rolling update strategy
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0   # never reduce below desired count

# 2. Set a readiness probe
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5

# 3. Configure a PDB
apiVersion: policy/v1
kind: PodDisruptionBudget
spec:
  minAvailable: 2
```

```bash
# 4. Apply the new version
kubectl set image deployment/web web=myapp:v2.0

# 5. Monitor the rollout
kubectl rollout status deployment/web
```
46. How do you handle secrets rotation without restarting Pods?
Option 1: External Secrets with refresh
```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
spec:
  refreshInterval: 1h
```
Option 2: Reloader controller. Use Stakater Reloader to automatically restart Pods when ConfigMaps/Secrets change, as sketched below.
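A minimal sketch of opting a Deployment into Reloader (assumes the Reloader controller is installed; `web` is an illustrative name):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
  annotations:
    reloader.stakater.com/auto: "true"   # rolling-restart when any referenced ConfigMap/Secret changes
```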
Option 3: Application-level reload. Design the application to watch for file changes and reload secrets without a restart.
47. A deployment is running but requests are timing out. How do you diagnose?
```bash
# 1. Check if Pods are Ready
kubectl get pods -l app=myapp

# 2. Check Service endpoints
kubectl get endpoints myapp-service

# 3. Test connectivity from another Pod
kubectl run debug --rm -it --image=busybox -- sh
wget -qO- --timeout=5 myapp-service:80

# 4. Check Pod logs for errors
kubectl logs -l app=myapp --tail=100

# 5. Check Network Policies
kubectl get networkpolicies -A

# 6. Check if Ingress is configured correctly
kubectl describe ingress myapp-ingress

# 7. Check resource usage (CPU throttling?)
kubectl top pods -l app=myapp
```
48. How would you migrate a stateful application to Kubernetes?
1. Assess the application:
   - Data persistence requirements
   - Network identity needs
   - Scaling characteristics
2. Choose the right workload type:
   - StatefulSet for ordered deployment
   - Operators for complex applications (databases)
3. Plan storage:
   ```yaml
   volumeClaimTemplates:
   - metadata:
       name: data
     spec:
       accessModes: ["ReadWriteOnce"]
       storageClassName: fast-ssd
       resources:
         requests:
           storage: 100Gi
   ```
4. Handle data migration:
   - Use backup/restore tools
   - Set up replication before cutover
   - Plan for rollback
5. Consider managed operators:
   - CloudNativePG for PostgreSQL
   - Strimzi for Kafka
   - MongoDB Community Operator
49. How do you implement cost optimization in a Kubernetes cluster?
```bash
# 1. Right-size resources using VPA recommendations
kubectl get vpa -A

# 2. Identify over-provisioned Pods
kubectl top pods -A --sort-by=cpu

# 3. Find workloads with no resource requests set
kubectl get pods -A -o json | jq '.items[] | select(.spec.containers[].resources.requests == null) | .metadata.name'
```
Strategies:
- Use Cluster Autoscaler to scale nodes down
- Implement spot/preemptible instances for non-critical workloads
- Use namespace resource quotas
- Schedule batch jobs during off-peak hours
- Use tools like Kubecost for visibility
See our guide on Kubernetes cost optimization.
50. Describe your approach to upgrading a production Kubernetes cluster.
1. Preparation:
   ```bash
   # Review release notes

   # Back up etcd
   etcdctl snapshot save backup.db

   # Document current versions
   kubectl version
   kubectl get nodes -o wide
   ```
2. Pre-flight checks:
   - Verify all Pods are healthy
   - Check PodDisruptionBudgets
   - Ensure backup/restore is tested
   - Plan the rollback procedure
3. Upgrade the control plane:
   - One control-plane node at a time (HA clusters)
   - Upgrade in order: API server → controller manager → scheduler
4. Upgrade worker nodes:
   ```bash
   # Cordon the node (prevent new Pods)
   kubectl cordon node1

   # Drain the node (evict Pods)
   kubectl drain node1 --ignore-daemonsets

   # Upgrade kubelet and kubectl

   # Uncordon the node
   kubectl uncordon node1
   ```
5. Post-upgrade:
   - Verify all components are healthy
   - Update add-ons (CNI, CoreDNS, Ingress)
   - Test critical applications
   - Monitor for issues
Interview Preparation Tips
Before the Interview
- Hands-on practice: Set up a local cluster with Minikube or Kind. Break things intentionally and fix them.
- Review core concepts: Understand the “why” behind each resource type, not just syntax.
- Know your experience: Be ready to discuss specific Kubernetes challenges you’ve solved.
- Stay current: Review recent Kubernetes news and updates.
During the Interview
- Think out loud: Explain your debugging process step-by-step.
- Ask clarifying questions: “Is this a single-tenant or multi-tenant cluster?” “What CNI are they using?”
- Acknowledge what you don’t know: It’s better to say “I’d need to look that up” than to guess incorrectly.
- Connect to real experience: “In my last role, we handled this by…”
Build Kubernetes Expertise with Expert Guidance
Preparing for Kubernetes interviews—or building production-ready clusters—is easier with experienced guidance.
Our Kubernetes consulting services help teams:
- Architect production clusters on AWS EKS, Azure AKS, or Google GKE
- Implement security best practices from RBAC to Network Policies
- Optimize costs with right-sizing and autoscaling
- Train engineering teams on Kubernetes operations
We’ve helped organizations from startups to enterprises build reliable, scalable Kubernetes platforms.