Migrating from traditional virtual machines to Kubernetes represents a significant architectural shift. Done well, it delivers improved scalability, deployment velocity, and operational efficiency. Done poorly, it creates complexity without benefits.
This guide covers the practical steps for migrating on-premises applications to managed Kubernetes services like Amazon EKS, Azure AKS, or Google GKE.
Why Migrate to Kubernetes?
Before starting migration, ensure Kubernetes solves real problems for your organization.
Valid Reasons to Migrate
Scalability requirements:
- Applications need to scale rapidly based on demand
- Current infrastructure cannot handle traffic spikes
- Manual scaling is too slow for business needs
Deployment velocity:
- Release cycles are too slow
- Deployments are risky and require downtime
- Rollbacks are difficult or impossible
Resource efficiency:
- VMs are underutilized
- Cannot bin-pack workloads efficiently
- Over-provisioning to handle peak loads
Developer productivity:
- Environment inconsistencies cause issues
- Developers wait for infrastructure provisioning
- Local development differs from production
Invalid Reasons to Migrate
Resume-driven development: “Kubernetes is popular” is not a migration justification.
Solving organizational problems: Kubernetes does not fix poor communication or unclear ownership.
Following competitors: What works for others may not fit your situation.
Assessment Phase
Application Portfolio Analysis
Evaluate each application for Kubernetes readiness:
| Application | Stateless | 12-Factor | Dependencies | Complexity | Priority |
|-------------|-----------|-----------|--------------|------------|----------|
| API Gateway | Yes | Yes | Redis | Low | High |
| User Service | Yes | Partial | PostgreSQL | Medium | High |
| Legacy CRM | No | No | Oracle, LDAP | High | Low |
| Batch Jobs | Yes | Yes | S3 | Low | Medium |
12-Factor App checklist:
- Configuration via environment variables
- Stateless processes
- Port binding
- Disposable processes (fast startup/shutdown)
- Dev/prod parity
- Logs as event streams
Applications meeting most criteria are good migration candidates.
Infrastructure Requirements
Document current infrastructure for capacity planning:
#!/bin/bash
# Collect VM metrics for capacity planning
for vm in $(get_vm_list); do
echo "=== $vm ==="
echo "CPU Cores: $(ssh $vm nproc)"
echo "Memory: $(ssh $vm free -h | grep Mem | awk '{print $2}')"
echo "Disk: $(ssh $vm df -h / | tail -1 | awk '{print $2}')"
echo "Avg CPU (7d): $(get_avg_cpu $vm 7d)"
echo "Avg Memory (7d): $(get_avg_memory $vm 7d)"
echo "Peak CPU (7d): $(get_peak_cpu $vm 7d)"
echo ""
done
Use this data to right-size Kubernetes resource requests and limits.
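As a rough starting point, average utilization can inform requests and observed peaks can inform limits. For example, a VM averaging about 0.5 cores and 1 GiB of memory with peaks near 1 core might translate into something like the following (values are purely illustrative, not prescriptive):

# Illustrative mapping from VM metrics to container resources:
# requests ≈ typical (average) usage, limits ≈ observed peak plus headroom
resources:
  requests:
    cpu: "500m"      # ~0.5 cores average
    memory: "1Gi"    # ~1 GiB average
  limits:
    cpu: "1000m"     # ~1 core peak
    memory: "1536Mi" # peak plus headroom to avoid OOM kills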
Dependency Mapping
Map all application dependencies:
# dependency-map.yaml
applications:
user-service:
type: api
language: java
dependencies:
databases:
- name: user-db
type: postgresql
version: "14"
caches:
- name: session-cache
type: redis
version: "7"
services:
- name: auth-service
protocol: grpc
port: 50051
external:
- name: stripe-api
url: https://api.stripe.com
order-service:
type: api
language: nodejs
dependencies:
databases:
- name: order-db
type: postgresql
version: "14"
queues:
- name: order-events
type: rabbitmq
services:
- name: user-service
protocol: http
port: 8080
- name: inventory-service
protocol: http
port: 8080
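One payoff of a dependency map like this is that it can seed network policies once the services land in the cluster. A minimal sketch, assuming the services above run in a production namespace with app: <name> labels and that your CNI enforces NetworkPolicy:

# Allow only order-service to call user-service on port 8080
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: user-service-ingress
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: user-service
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: order-service
      ports:
        - protocol: TCP
          port: 8080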
Containerization
Dockerfile Best Practices
Create efficient, secure container images:
# Example: Java application
# Use multi-stage builds for smaller images
FROM maven:3.9-eclipse-temurin-21 AS builder
WORKDIR /app
COPY pom.xml .
# Cache dependencies
RUN mvn dependency:go-offline
COPY src ./src
RUN mvn package -DskipTests
# Production image
FROM eclipse-temurin:21-jre-alpine
# Security: Run as non-root
RUN addgroup -g 1001 appgroup && \
adduser -u 1001 -G appgroup -D appuser
WORKDIR /app
# Copy only the built artifact
COPY --from=builder /app/target/*.jar app.jar
# Set ownership
RUN chown -R appuser:appgroup /app
USER appuser
# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=60s \
CMD wget -q --spider http://localhost:8080/health || exit 1
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]
# Example: Node.js application
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
# Install all dependencies (devDependencies are needed for the build step)
RUN npm ci
COPY . .
RUN npm run build
# Drop devDependencies so only runtime packages are copied into the final image
RUN npm prune --omit=dev
# Production image
FROM node:20-alpine
RUN addgroup -g 1001 nodejs && \
adduser -u 1001 -G nodejs -D nodejs
WORKDIR /app
COPY --from=builder --chown=nodejs:nodejs /app/dist ./dist
COPY --from=builder --chown=nodejs:nodejs /app/node_modules ./node_modules
COPY --from=builder --chown=nodejs:nodejs /app/package.json ./
USER nodejs
EXPOSE 3000
CMD ["node", "dist/server.js"]
Container Registry Setup
Set up a container registry for your images:
AWS ECR:
# Create repository
aws ecr create-repository \
--repository-name my-app/user-service \
--image-scanning-configuration scanOnPush=true
# Login and push
aws ecr get-login-password --region us-east-1 | \
docker login --username AWS --password-stdin 123456789.dkr.ecr.us-east-1.amazonaws.com
docker tag user-service:latest 123456789.dkr.ecr.us-east-1.amazonaws.com/my-app/user-service:latest
docker push 123456789.dkr.ecr.us-east-1.amazonaws.com/my-app/user-service:latest
Azure ACR:
# Create registry
az acr create --resource-group mygroup --name myregistry --sku Standard
# Login and push
az acr login --name myregistry
docker tag user-service:latest myregistry.azurecr.io/user-service:latest
docker push myregistry.azurecr.io/user-service:latest
GCP Artifact Registry:
# Create repository
gcloud artifacts repositories create my-repo \
--repository-format=docker \
--location=us-central1
# Configure Docker and push
gcloud auth configure-docker us-central1-docker.pkg.dev
docker tag user-service:latest us-central1-docker.pkg.dev/my-project/my-repo/user-service:latest
docker push us-central1-docker.pkg.dev/my-project/my-repo/user-service:latest
Kubernetes Cluster Setup
Managed Kubernetes Selection
Choose based on your cloud provider and requirements:
| Factor | EKS | AKS | GKE |
|---|---|---|---|
| Control plane cost | ~$0.10/hour | Free tier; ~$0.10/hour for Standard tier | ~$0.10/hour (one zonal cluster free) |
| AWS integration | Native | Limited | Limited |
| Azure integration | Limited | Native | Limited |
| GCP integration | Limited | Limited | Native |
| Autopilot mode | No | No | Yes |
Cluster Architecture
Design your cluster for production:
# Terraform example: EKS cluster
module "eks" {
source = "terraform-aws-modules/eks/aws"
version = "~> 19.0"
cluster_name = "production"
cluster_version = "1.29"
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.private_subnets
# Enable cluster logging
cluster_enabled_log_types = [
"api", "audit", "authenticator", "controllerManager", "scheduler"
]
# Node groups
eks_managed_node_groups = {
# General workloads
general = {
min_size = 3
max_size = 10
desired_size = 3
instance_types = ["m6i.large"]
capacity_type = "ON_DEMAND"
labels = {
workload-type = "general"
}
}
# Spot instances for non-critical workloads
spot = {
min_size = 0
max_size = 20
desired_size = 2
instance_types = ["m6i.large", "m5.large", "m5a.large"]
capacity_type = "SPOT"
labels = {
workload-type = "spot"
}
taints = [{
key = "spot"
value = "true"
effect = "NO_SCHEDULE"
}]
}
}
}
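Because the spot node group above is tainted, only workloads that explicitly opt in will land there. A pod template fragment for a non-critical workload targeting that pool might look like this (the label and taint keys match the Terraform above):

# Fragment of a Deployment pod template that opts into the tainted spot pool
spec:
  nodeSelector:
    workload-type: spot
  tolerations:
    - key: "spot"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"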
For detailed cluster setup guidance, see our Kubernetes consulting services.
Essential Add-ons
Install necessary cluster components:
# Cluster Autoscaler
apiVersion: apps/v1
kind: Deployment
metadata:
name: cluster-autoscaler
namespace: kube-system
spec:
replicas: 1
selector:
matchLabels:
app: cluster-autoscaler
template:
metadata:
labels:
app: cluster-autoscaler
spec:
serviceAccountName: cluster-autoscaler
containers:
- name: cluster-autoscaler
image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.29.0
command:
- ./cluster-autoscaler
- --v=4
- --stderrthreshold=info
- --cloud-provider=aws
- --skip-nodes-with-local-storage=false
- --expander=least-waste
- --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/production
# Metrics Server for HPA
apiVersion: v1
kind: ServiceAccount
metadata:
name: metrics-server
namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: metrics-server
namespace: kube-system
spec:
selector:
matchLabels:
k8s-app: metrics-server
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      containers:
- name: metrics-server
image: registry.k8s.io/metrics-server/metrics-server:v0.6.4
args:
- --cert-dir=/tmp
- --secure-port=4443
- --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
- --kubelet-use-node-status-port
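In practice, both add-ons are usually installed from their upstream Helm charts rather than hand-maintained manifests; a sketch using the charts' published repositories (release names and values are assumptions to adapt):

# Cluster Autoscaler from the upstream chart
helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm upgrade --install cluster-autoscaler autoscaler/cluster-autoscaler \
  --namespace kube-system \
  --set autoDiscovery.clusterName=production \
  --set awsRegion=us-east-1

# Metrics Server from the upstream chart
helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/
helm upgrade --install metrics-server metrics-server/metrics-server \
  --namespace kube-system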
Application Migration
Kubernetes Manifests
Convert VM-based applications to Kubernetes resources:
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: user-service
labels:
app: user-service
spec:
replicas: 3
selector:
matchLabels:
app: user-service
template:
metadata:
labels:
app: user-service
spec:
containers:
- name: user-service
image: myregistry/user-service:v1.0.0
ports:
- containerPort: 8080
env:
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: user-service-secrets
key: database-url
- name: REDIS_HOST
value: "redis-master.default.svc.cluster.local"
resources:
requests:
memory: "256Mi"
cpu: "250m"
limits:
memory: "512Mi"
cpu: "500m"
livenessProbe:
httpGet:
path: /health/live
port: 8080
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
httpGet:
path: /health/ready
port: 8080
initialDelaySeconds: 5
periodSeconds: 5
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchLabels:
app: user-service
topologyKey: kubernetes.io/hostname
---
# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: user-service
  labels:
    app: user-service
spec:
selector:
app: user-service
  ports:
    - name: http
      port: 80
      targetPort: 8080
  type: ClusterIP
---
# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: user-service
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: user-service
minReplicas: 3
maxReplicas: 20
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
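Alongside the anti-affinity rule, a PodDisruptionBudget keeps voluntary disruptions (node drains, cluster-autoscaler scale-downs) from taking out too many replicas at once; a minimal sketch:

# pdb.yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: user-service
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: user-service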
Database Migration
Migrate databases to managed services or Kubernetes:
Option 1: Managed database (recommended)
# Terraform: RDS PostgreSQL
resource "aws_db_instance" "user_db" {
identifier = "user-service-db"
engine = "postgres"
engine_version = "14"
instance_class = "db.r6g.large"
allocated_storage = 100
storage_encrypted = true
db_name = "users"
username = "admin"
password = var.db_password
vpc_security_group_ids = [aws_security_group.rds.id]
db_subnet_group_name = aws_db_subnet_group.main.name
backup_retention_period = 7
multi_az = true
skip_final_snapshot = false
}
Option 2: Database in Kubernetes (for dev/test)
# PostgreSQL StatefulSet
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: postgresql
spec:
serviceName: postgresql
replicas: 1
selector:
matchLabels:
app: postgresql
template:
metadata:
labels:
app: postgresql
spec:
containers:
- name: postgresql
image: postgres:14
ports:
- containerPort: 5432
env:
- name: POSTGRES_DB
value: users
- name: POSTGRES_USER
valueFrom:
secretKeyRef:
name: postgresql-secrets
key: username
- name: POSTGRES_PASSWORD
valueFrom:
secretKeyRef:
name: postgresql-secrets
key: password
volumeMounts:
- name: data
mountPath: /var/lib/postgresql/data
volumeClaimTemplates:
- metadata:
name: data
spec:
accessModes: ["ReadWriteOnce"]
storageClassName: gp3
resources:
requests:
storage: 100Gi
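The StatefulSet's serviceName refers to a headless Service that also needs to exist; a minimal one:

apiVersion: v1
kind: Service
metadata:
  name: postgresql
spec:
  clusterIP: None
  selector:
    app: postgresql
  ports:
    - port: 5432
      targetPort: 5432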
Secrets Management
Migrate secrets from on-premises vaults to Kubernetes:
# External Secrets Operator with AWS Secrets Manager
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
name: aws-secrets
spec:
provider:
aws:
service: SecretsManager
region: us-east-1
auth:
jwt:
serviceAccountRef:
name: external-secrets-sa
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
name: user-service-secrets
spec:
refreshInterval: 1h
secretStoreRef:
name: aws-secrets
kind: SecretStore
target:
name: user-service-secrets
data:
- secretKey: database-url
remoteRef:
key: production/user-service
property: database_url
- secretKey: api-key
remoteRef:
key: production/user-service
property: api_key
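This assumes a secret named production/user-service already exists in AWS Secrets Manager with database_url and api_key properties; it could be seeded roughly like this (all values are placeholders):

# Create the secret the ExternalSecret above reads from (placeholder values)
aws secretsmanager create-secret \
  --name production/user-service \
  --secret-string '{"database_url":"postgres://user:pass@host:5432/users","api_key":"replace-me"}'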
CI/CD Pipeline Migration
Set up automated deployment pipelines:
# GitHub Actions for Kubernetes deployment
name: Deploy to Kubernetes
on:
push:
branches: [main]
env:
REGISTRY: 123456789.dkr.ecr.us-east-1.amazonaws.com
IMAGE_NAME: user-service
jobs:
build:
runs-on: ubuntu-latest
    outputs:
      image-tag: ${{ github.sha }}
steps:
- uses: actions/checkout@v4
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@v4
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: us-east-1
- name: Login to ECR
uses: aws-actions/amazon-ecr-login@v2
- name: Build and push
uses: docker/build-push-action@v5
with:
context: .
push: true
tags: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
deploy:
needs: build
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@v4
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: us-east-1
- name: Update kubeconfig
run: aws eks update-kubeconfig --name production
- name: Deploy to Kubernetes
run: |
kubectl set image deployment/user-service \
user-service=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
kubectl rollout status deployment/user-service
For GitOps-based deployments, consider implementing ArgoCD for declarative continuous delivery.
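In that model the pipeline only builds and pushes images and updates manifests in Git, while ArgoCD reconciles the cluster toward what Git declares. A minimal Application sketch (repository URL and path are assumptions):

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: user-service
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/my-org/k8s-manifests   # assumed manifest repo
    targetRevision: main
    path: apps/user-service
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true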
Observability Setup
Monitoring with Prometheus
Deploy comprehensive monitoring:
# Prometheus configuration
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
name: prometheus
namespace: monitoring
spec:
serviceAccountName: prometheus
serviceMonitorSelector:
matchLabels:
team: platform
podMonitorSelector:
matchLabels:
team: platform
resources:
requests:
memory: 2Gi
cpu: 1
storage:
volumeClaimTemplate:
spec:
storageClassName: gp3
resources:
requests:
storage: 100Gi
---
# ServiceMonitor for application
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: user-service
labels:
team: platform
spec:
selector:
matchLabels:
app: user-service
endpoints:
- port: http
path: /metrics
interval: 30s
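With the ServiceMonitor in place, alert rules can live alongside it. A sketch of a basic availability alert, assuming kube-state-metrics is running (as it is in most Prometheus Operator stacks) and that the Prometheus resource above is also given a ruleSelector matching these labels:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: user-service-alerts
  labels:
    team: platform
spec:
  groups:
    - name: user-service
      rules:
        - alert: UserServiceCrashLooping
          expr: increase(kube_pod_container_status_restarts_total{container="user-service"}[15m]) > 3
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "user-service pods are restarting frequently"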
For production monitoring setup, see our Prometheus consulting services.
Logging with Fluentd
Collect and forward logs:
# Fluentd DaemonSet
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: fluentd
namespace: logging
spec:
selector:
matchLabels:
app: fluentd
template:
metadata:
labels:
app: fluentd
spec:
serviceAccountName: fluentd
containers:
- name: fluentd
image: fluent/fluentd-kubernetes-daemonset:v1.16-debian-cloudwatch
env:
- name: AWS_REGION
value: "us-east-1"
- name: LOG_GROUP_NAME
value: "/kubernetes/production"
volumeMounts:
- name: varlog
mountPath: /var/log
- name: containers
mountPath: /var/lib/docker/containers
readOnly: true
volumes:
- name: varlog
hostPath:
path: /var/log
- name: containers
hostPath:
          path: /var/lib/docker/containers
On nodes running containerd (the default on current EKS, AKS, and GKE versions), container logs live under /var/log/pods and /var/log/containers rather than /var/lib/docker/containers, so adjust these mounts and the Fluentd source configuration accordingly.
Migration Execution
Phased Migration Approach
Migrate in waves to reduce risk:
Wave 1: Stateless, non-critical applications
- Internal tools
- Development environments
- Non-production workloads
Wave 2: Stateless production applications
- APIs without state
- Frontend applications
- Microservices
Wave 3: Stateful applications
- Applications with databases
- Message queue consumers
- Session-dependent services
Wave 4: Critical infrastructure
- Core business applications
- High-traffic services
- Compliance-sensitive workloads
Traffic Migration
Use gradual traffic shifting:
# Nginx Ingress for traffic splitting
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: user-service
annotations:
nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "10" # start by sending 10% to Kubernetes
spec:
rules:
- host: api.example.com
http:
paths:
- path: /users
pathType: Prefix
backend:
service:
name: user-service
port:
number: 80
Increase canary weight gradually:
- 10% → Monitor for 1 hour
- 25% → Monitor for 4 hours
- 50% → Monitor for 24 hours
- 100% → Complete migration
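Note that the canary annotations above require a primary, non-canary Ingress for the same host, and they only help once traffic already enters the cluster. If traffic still arrives at an external load balancer in front of the VMs, the same gradual shift can be done with weighted DNS records; a sketch where the zone ID and VM load balancer reuse the placeholders from the rollback script below, and the Kubernetes ingress load balancer name is assumed:

# Send ~20% of api.example.com traffic to the Kubernetes ingress (weights are relative)
aws route53 change-resource-record-sets \
  --hosted-zone-id Z123456 \
  --change-batch '{
    "Changes": [
      {
        "Action": "UPSERT",
        "ResourceRecordSet": {
          "Name": "api.example.com",
          "Type": "A",
          "SetIdentifier": "kubernetes",
          "Weight": 20,
          "AliasTarget": {
            "HostedZoneId": "Z789",
            "DNSName": "k8s-ingress.elb.amazonaws.com",
            "EvaluateTargetHealth": true
          }
        }
      },
      {
        "Action": "UPSERT",
        "ResourceRecordSet": {
          "Name": "api.example.com",
          "Type": "A",
          "SetIdentifier": "vms",
          "Weight": 80,
          "AliasTarget": {
            "HostedZoneId": "Z789",
            "DNSName": "vm-load-balancer.elb.amazonaws.com",
            "EvaluateTargetHealth": true
          }
        }
      }
    ]
  }'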
Rollback Plan
Document and test rollback procedures:
#!/bin/bash
# rollback.sh - Revert to VM-based deployment
# 1. Update DNS to point back to load balancer
aws route53 change-resource-record-sets \
--hosted-zone-id Z123456 \
--change-batch '{
"Changes": [{
"Action": "UPSERT",
"ResourceRecordSet": {
"Name": "api.example.com",
"Type": "A",
"AliasTarget": {
"HostedZoneId": "Z789",
"DNSName": "vm-load-balancer.elb.amazonaws.com",
"EvaluateTargetHealth": true
}
}
}]
}'
# 2. Scale down Kubernetes deployment
kubectl scale deployment user-service --replicas=0
# 3. Verify traffic is back on VMs
curl -I https://api.example.com/health
echo "Rollback complete. Monitor VM metrics."
Post-Migration Optimization
Resource Right-Sizing
Analyze actual usage and adjust:
# Check resource usage
kubectl top pods -n production
# Analyze with metrics
kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/production/pods | jq '.items[] | {name: .metadata.name, cpu: .containers[].usage.cpu, memory: .containers[].usage.memory}'
Use Vertical Pod Autoscaler for recommendations:
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: user-service-vpa
spec:
targetRef:
apiVersion: apps/v1
kind: Deployment
name: user-service
updatePolicy:
updateMode: "Off" # Recommendation only
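VPA is a separate add-on that must be installed in the cluster; once it has observed real traffic, its recommendations can be read directly from the object:

# Inspect VPA recommendations (target, lower bound, upper bound per container)
kubectl describe vpa user-service-vpa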
Cost Optimization
Implement Kubernetes cost management:
- Use Spot/Preemptible instances for non-critical workloads
- Right-size node groups based on actual usage
- Implement pod priority and preemption (see the PriorityClass sketch below)
- Use cluster autoscaler to scale down unused capacity
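For the priority and preemption point above, a minimal sketch of a PriorityClass and the Deployment fragment that references it (class name and value are assumptions):

# Higher-priority workloads can preempt lower-priority ones when the cluster is full
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: business-critical
value: 100000
globalDefault: false
description: "Core customer-facing services"
---
# In the Deployment's pod template:
spec:
  priorityClassName: business-critical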
For ongoing cloud cost optimization, see our AWS cost management services.
Summary
Migrating from on-premises VMs to Kubernetes requires:
- Thorough assessment - Evaluate applications for Kubernetes readiness
- Proper containerization - Build efficient, secure container images
- Production-ready clusters - Set up managed Kubernetes with proper configuration
- Comprehensive observability - Implement monitoring, logging, and alerting
- Phased migration - Reduce risk with gradual traffic shifting
- Continuous optimization - Right-size resources and optimize costs
The effort is significant, but organizations that complete the migration successfully gain improved scalability, faster deployments, and better resource efficiency.
Need Help with Kubernetes Migration?
We guide organizations through Kubernetes migrations from on-premises infrastructure to cloud-native platforms. Our Kubernetes consulting services cover assessment, architecture design, migration execution, and ongoing optimization for EKS, AKS, and GKE.
Book a free 30-minute consultation to discuss your Kubernetes migration project.