
Cluster Autoscaler vs Auto Scaling Group: We've Configured Both on EKS

Engineering Team 2026-03-09

If you have ever created an EKS node group and assumed your cluster would automatically scale based on pod demand, you are not alone. It is one of the most common misconceptions in the Kubernetes-on-AWS ecosystem. The reality is that Auto Scaling Groups (ASGs) and the Kubernetes Cluster Autoscaler (CA) serve fundamentally different roles, and understanding the distinction is critical to building a production-ready EKS cluster.

We have configured both Cluster Autoscaler and Auto Scaling Groups across 100+ EKS clusters for clients in healthcare, fintech, and e-commerce. In this post, we break down exactly how they work, how they work together, and when to use each approach—including the modern alternative, Karpenter, that bypasses ASGs entirely.

The #1 Misconception: ASGs Do NOT Auto-Scale Based on Pod Demand

Let us clear this up immediately: an Auto Scaling Group on its own will not scale your EKS nodes in response to Kubernetes pod scheduling needs. When you create an EKS managed node group, AWS creates an associated ASG behind the scenes. That ASG handles instance lifecycle—launching and terminating EC2 instances. But it has zero awareness of Kubernetes pods, resource requests, or scheduling constraints.

Without a Kubernetes-aware autoscaler like the Cluster Autoscaler or Karpenter, your ASG sits at whatever desired count you set and stays there. Pods will go into a Pending state with FailedScheduling events, and nothing in the AWS infrastructure layer will respond to that signal.

This is the key insight: the ASG is the muscle, but it needs a brain that understands Kubernetes. That brain is the Cluster Autoscaler.

Quick Comparison: Cluster Autoscaler vs Auto Scaling Group

Before going deeper, here is a side-by-side comparison:

| Feature | Cluster Autoscaler (CA) | Auto Scaling Group (ASG) |
| --- | --- | --- |
| What it is | Kubernetes controller (runs as a pod) | AWS infrastructure resource |
| Awareness | Kubernetes-aware (pods, resource requests, taints, affinity) | Infrastructure-aware (CPU/memory metrics, instance health) |
| Scaling trigger | Unschedulable pods or underutilized nodes | CloudWatch alarms, target tracking policies, or manual adjustment |
| Scale-up mechanism | Modifies ASG DesiredCapacity | Launches EC2 instances to meet desired count |
| Scale-down mechanism | Cordons, drains nodes, then reduces ASG DesiredCapacity | Terminates instances based on termination policies |
| Pod scheduling | Understands pod affinity, taints, tolerations, resource requests | No concept of pods or Kubernetes scheduling |
| Configuration | Kubernetes Deployment with CLI flags | AWS Console, CLI, CloudFormation, or Terraform |
| Scope | Cluster-level (across multiple ASGs/node groups) | Single ASG |

The critical takeaway: CA and ASG are not alternatives to each other—they are complementary layers. CA makes the scaling decisions; ASG executes them at the infrastructure level.

How Cluster Autoscaler and ASGs Work Together

Understanding the end-to-end scaling workflow removes a lot of confusion. Here is exactly what happens when your cluster needs more capacity:

Scale-Up Flow

  1. Pod scheduling fails — A new pod is created (via Deployment, Job, etc.) but the Kubernetes scheduler cannot find a node with sufficient CPU, memory, or matching taints/labels.
  2. CA detects pending pods — The Cluster Autoscaler runs a scan loop (every 10 seconds by default) and identifies pods in Pending state with FailedScheduling events.
  3. CA simulates scheduling — CA evaluates each node group to determine which one could accommodate the pending pods. It considers instance types, labels, taints, and resource capacity.
  4. CA adjusts ASG DesiredCapacity — CA calls the AWS Auto Scaling API to increase the DesiredCapacity of the selected ASG.
  5. ASG launches instances — The ASG provisions new EC2 instances based on its launch template or launch configuration.
  6. Nodes join the cluster — The new instances run the EKS bootstrap script, register with the Kubernetes API server, and become Ready nodes.
  7. Pods are scheduled — The Kubernetes scheduler places the pending pods onto the newly available nodes.

This entire process typically takes 2-5 minutes, depending on instance type and AMI caching.

Scale-Down Flow

  1. CA identifies underutilized nodes — During its scan loop, CA checks if node utilization (based on resource requests, not actual usage) falls below the scale-down-utilization-threshold (default: 50%).
  2. Cool-down period — The node must remain underutilized for scale-down-unneeded-time (default: 10 minutes).
  3. CA verifies safe eviction — CA checks for pods with PodDisruptionBudgets, local storage, or the cluster-autoscaler.kubernetes.io/safe-to-evict: "false" annotation.
  4. CA cordons and drains — The node is cordoned (no new pods scheduled) and existing pods are gracefully evicted.
  5. CA reduces ASG DesiredCapacity — CA calls the ASG API to decrement the desired count.
  6. ASG terminates the instance — The EC2 instance is terminated based on the ASG’s termination policy.
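The scale-down eligibility checks in steps 1-3 can be sketched the same way. The dict shapes here are hypothetical, for illustration only; what matters is that utilization is computed from pod requests, never from live CPU usage.

```python
def node_scale_down_candidate(node: dict, threshold: float = 0.5) -> bool:
    """Mirror the scale-down checks described above (simplified sketch)."""
    requested_cpu = sum(p["cpu_request"] for p in node["pods"])
    utilization = requested_cpu / node["allocatable_cpu"]
    if utilization >= threshold:                # --scale-down-utilization-threshold
        return False
    if any(p.get("safe_to_evict") is False for p in node["pods"]):
        return False                            # safe-to-evict annotation blocks it
    if node["unneeded_seconds"] < 600:          # --scale-down-unneeded-time=10m
        return False
    return True

node = {"allocatable_cpu": 4.0,
        "pods": [{"cpu_request": 0.5}, {"cpu_request": 0.5}],
        "unneeded_seconds": 900}
print(node_scale_down_candidate(node))  # -> True (25% requested for 15 minutes)
```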

Why ASG Scaling Policies Conflict with Cluster Autoscaler

Here is a mistake we see regularly: teams configure both Cluster Autoscaler and ASG scaling policies (target tracking or step scaling) on the same node group. This creates a tug-of-war.

The Conflict

ASG target tracking policies scale based on CloudWatch metrics like average CPU utilization. If you set a target of 60% CPU utilization on your ASG, the ASG will try to add or remove instances to maintain that target—completely independent of what the Cluster Autoscaler is doing.

The result:

  • CA scales up the ASG to accommodate pending pods, increasing DesiredCapacity to 10.
  • ASG target tracking sees that average CPU across those 10 nodes is only 40%, decides the group is over-provisioned, and scales it back down to 7.
  • Pods go Pending again, CA scales back up, and the cycle repeats.
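The oscillation is easy to demonstrate with a toy simulation. The numbers are illustrative (3 pods per node, 40% average CPU, 60% target); the point is that two controllers writing DesiredCapacity independently never converge.

```python
import math

def ca_step(desired: int, pending_pods: int, pods_per_node: int = 3) -> int:
    """CA: raise capacity until the pending pods would fit."""
    needed = math.ceil(pending_pods / pods_per_node)
    return max(desired, needed)

def asg_target_tracking_step(desired: int, avg_cpu: float, target: float = 0.60) -> int:
    """ASG target tracking: shrink the group when average CPU is below target."""
    return max(1, math.ceil(desired * avg_cpu / target))

desired = 7
history = []
for _ in range(3):
    desired = ca_step(desired, pending_pods=30)               # 30 pods need 10 nodes
    history.append(desired)
    desired = asg_target_tracking_step(desired, avg_cpu=0.40)  # ASG sees 40% CPU
    history.append(desired)
print(history)  # -> [10, 7, 10, 7, 10, 7]: capacity flaps every cycle
```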

The Fix

Remove all ASG scaling policies when using Cluster Autoscaler. The CA should be the sole entity managing DesiredCapacity. The ASG’s role is reduced to instance provisioning—it should only define MinSize, MaxSize, and let CA control everything in between.

If you need metric-based scaling for non-Kubernetes workloads on the same ASG (which we do not recommend), use separate ASGs for Kubernetes and non-Kubernetes instances.

Deploying Cluster Autoscaler on EKS

Here is a production-ready Cluster Autoscaler deployment manifest with the key configuration flags we recommend:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      serviceAccountName: cluster-autoscaler
      priorityClassName: system-cluster-critical
      containers:
        - name: cluster-autoscaler
          image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.31.0
          command:
            - ./cluster-autoscaler
            - --v=4
            - --stderrthreshold=info
            - --cloud-provider=aws
            - --skip-nodes-with-local-storage=false
            - --expander=least-waste
            - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-cluster
            - --balance-similar-node-groups
            - --scale-down-utilization-threshold=0.5
            - --scale-down-unneeded-time=10m
            - --scale-down-delay-after-add=10m
            - --scan-interval=10s
            - --max-node-provision-time=15m
          resources:
            requests:
              cpu: 100m
              memory: 600Mi
            limits:
              cpu: 100m
              memory: 600Mi
          env:
            - name: AWS_REGION
              value: eu-west-1

Key Flags Explained

  • --node-group-auto-discovery: Discovers ASGs automatically by tag, so you do not need to hardcode ASG names. Tag your ASGs with k8s.io/cluster-autoscaler/enabled=true and k8s.io/cluster-autoscaler/<cluster-name>=owned.
  • --expander=least-waste: When multiple node groups can accommodate pending pods, choose the one that results in the least wasted resources. The Cluster Autoscaler GitHub repository documents all available expanders.
  • --balance-similar-node-groups: Keeps node counts balanced across node groups with identical scheduling properties—essential for multi-AZ deployments.
  • --scan-interval=10s: How frequently CA checks for pending pods. The default is 10 seconds, but you can increase this to reduce API load (see tuning section below).
  • --scale-down-unneeded-time=10m: A node must be underutilized for 10 minutes before CA considers it for removal.

Required ASG Tags

Your Auto Scaling Groups must have these tags for auto-discovery to work:

{
  "Tags": [
    {
      "Key": "k8s.io/cluster-autoscaler/enabled",
      "Value": "true"
    },
    {
      "Key": "k8s.io/cluster-autoscaler/my-cluster",
      "Value": "owned"
    },
    {
      "Key": "kubernetes.io/cluster/my-cluster",
      "Value": "owned"
    }
  ]
}

For scaling from zero (where no running nodes exist in the ASG), you also need template tags so CA knows what instance types the ASG provides:

{
  "Tags": [
    {
      "Key": "k8s.io/cluster-autoscaler/node-template/label/node.kubernetes.io/instance-type",
      "Value": "m5.xlarge"
    },
    {
      "Key": "k8s.io/cluster-autoscaler/node-template/resources/cpu",
      "Value": "4"
    },
    {
      "Key": "k8s.io/cluster-autoscaler/node-template/resources/memory",
      "Value": "16Gi"
    }
  ]
}

EKS Managed Node Groups: AWS-Managed ASGs

EKS Managed Node Groups simplify the ASG layer significantly. When you create a managed node group, AWS:

  • Creates and manages the underlying ASG automatically
  • Configures the launch template with the correct EKS-optimized AMI
  • Handles node draining during updates (rolling updates with configurable surge)
  • Tags the ASG for Cluster Autoscaler auto-discovery
  • Supports graceful node termination via the node termination handler

This means you get the ASG lifecycle management without having to configure ASGs directly. However, you still need Cluster Autoscaler or Karpenter for pod-aware scaling. The managed node group does not change this fundamental requirement.

For a deeper dive into node group architecture, see our guide on EKS architecture best practices.

Tuning Cluster Autoscaler for Production

Based on the AWS EKS Cluster Autoscaler best practices documentation, here are the tuning parameters that matter most.

Scan Interval Tradeoffs

The --scan-interval flag controls how often CA checks for pending pods. The default is 10 seconds, but since launching a new EC2 instance takes 2+ minutes anyway, a more relaxed interval may be appropriate:

| Scan Interval | API Calls (relative) | Scale-Up Delay (relative) |
| --- | --- | --- |
| 10s (default) | 1x | Baseline |
| 30s | 3x reduction | ~19% slower |
| 60s | 6x reduction | ~38% slower |

For clusters with 500+ nodes, increasing the scan interval to 30-60 seconds significantly reduces AWS API throttling risk with minimal impact on scaling responsiveness.

Node Group Design

The AWS best practices are clear on this:

  • Prefer fewer node groups with many nodes over many node groups with few nodes. This has the single biggest impact on CA scalability.
  • All nodes in a group must have identical scheduling properties: same labels, taints, and resource profiles (CPU, memory, GPU).
  • Use Namespaces for workload isolation instead of separate node groups.
  • Define a single ASG spanning multiple Availability Zones rather than one ASG per AZ (unless you need EBS volume affinity).

MixedInstancePolicies

When using MixedInstancePolicies for cost optimization, ensure all instance types have similar CPU, memory, and GPU capacity. CA uses the first instance type in the policy for scheduling simulation. If subsequent types are smaller, pods may fail to schedule after scale-up because the actual instance has less capacity than expected.

Use the EC2 Instance Selector to find compatible instance types:

ec2-instance-selector --memory 16 --vcpus 4 --cpu-architecture x86_64 --gpus 0 -r eu-west-1

This returns instance types with matching resource profiles (e.g., m5.xlarge, m5a.xlarge, m5n.xlarge, m4.xlarge).

Overprovisioning Strategy

Overprovisioning ensures there is always spare capacity to schedule pods immediately, avoiding the 2-5 minute wait for new nodes. The formula from AWS documentation:

overprovisioned_nodes = (average_scale_up_frequency × node_launch_time) + number_of_AZs

Example: If you need a new node every 30 seconds and node launch takes 30 seconds, you need 1 overprovisioned node. Add 3 more if you run across 3 AZs for optimal zone selection with pod anti-affinity.
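The formula and the worked example translate directly into code. A minimal sketch, assuming scale-up frequency is expressed as one scale-up per N seconds:

```python
import math

def overprovisioned_nodes(seconds_per_scale_up: float,
                          node_launch_seconds: float,
                          num_azs: int = 0) -> int:
    """AWS formula from above: (average scale-up frequency x node launch time),
    plus one spare node per AZ when zone-aware scheduling matters."""
    frequency = 1 / seconds_per_scale_up          # scale-ups per second
    return math.ceil(frequency * node_launch_seconds) + num_azs

# One new node every 30 s, 30 s launch time, spread across 3 AZs:
print(overprovisioned_nodes(30, 30, num_azs=3))  # -> 4 (1 + one per AZ)
```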

Overprovisioning is implemented using low-priority “pause” pods that occupy space and get preempted when real workloads arrive:

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: overprovisioning
value: -1
globalDefault: false
description: "Priority class for overprovisioning pods"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: overprovisioning
  namespace: kube-system
spec:
  replicas: 3
  selector:
    matchLabels:
      app: overprovisioning
  template:
    metadata:
      labels:
        app: overprovisioning
    spec:
      priorityClassName: overprovisioning
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9
          resources:
            requests:
              cpu: "2"
              memory: 4Gi

When a real pod needs scheduling, Kubernetes preempts the pause pod, which immediately frees up capacity. The pause pod then goes Pending, triggering CA to scale up a new node—but the real pod is already running.

Spot Instance Strategy with ASGs

Spot instances can reduce compute costs by up to 90%, but they require careful ASG configuration. Based on our experience and AWS best practices, follow these rules:

Separate On-Demand and Spot ASGs

Never mix On-Demand and Spot instances in the same ASG. They have fundamentally different scheduling properties:

  • Spot nodes should carry a taint (e.g., spotInstance=true:PreferNoSchedule) so workloads must explicitly tolerate interruption risk.
  • On-Demand nodes should run critical, interruption-sensitive workloads.

Use the Priority Expander for Spot Preference

Configure CA to prefer Spot node groups, falling back to On-Demand when Spot capacity is unavailable:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-autoscaler-priority-expander
  namespace: kube-system
data:
  priorities: |-
    10:
      - .*on-demand.*
    50:
      - .*spot.*

CA will attempt to scale the Spot node group first. If no Spot capacity is available within --max-node-provision-time (default 15 minutes), it falls back to the On-Demand group.

Maximize Instance Diversity

Use MixedInstancePolicies with 10+ instance types across multiple families to maximize Spot pool access. More instance type diversity means lower interruption rates and better capacity availability.

Karpenter: The Modern Alternative That Bypasses ASGs

Karpenter takes a fundamentally different approach. Instead of managing ASGs, Karpenter provisions EC2 instances directly using the EC2 Fleet API. This eliminates the ASG layer entirely and brings several advantages:

  • Faster provisioning — Karpenter launches instances in under 60 seconds (vs. 2-5 minutes with CA + ASG).
  • No node group management — You define NodePool and EC2NodeClass resources instead of pre-configured ASGs.
  • Workload-aware instance selection — Karpenter selects the optimal instance type per pod, not per node group.
  • Automatic consolidation — Karpenter continuously right-sizes the cluster by replacing underutilized nodes with smaller, better-fitting instances.

Karpenter NodePool Example

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand", "spot"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ["4"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
  limits:
    cpu: "1000"
    memory: 2000Gi
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
---
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiSelectorTerms:
    - alias: al2023@latest
  role: KarpenterNodeRole-my-cluster
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
  tags:
    Environment: production

With this configuration, Karpenter handles everything: instance type selection, AZ placement, Spot vs. On-Demand decisions, and node lifecycle. No ASGs, no node groups, no launch templates.

When to Use Each Approach

Not every scenario calls for the same solution. Here is our recommendation based on the workloads we have seen:

ASG-Only Scaling (No CA, No Karpenter)

Use this when:

  • You are running non-Kubernetes workloads (traditional EC2 applications)
  • You have a fixed, predictable number of nodes
  • You use ASG scaling policies tied to application-level CloudWatch metrics (e.g., SQS queue depth)
  • Kubernetes pod scheduling is not a factor

Cluster Autoscaler + ASG

Use this when:

  • You have an existing EKS cluster with well-defined node groups
  • Your workloads have predictable instance type requirements
  • You need fine-grained control over node group composition
  • Your organization has existing Terraform/CloudFormation managing ASGs
  • You are running EKS managed node groups and want minimal migration effort

Karpenter

Use this when:

  • You are building a new EKS cluster or ready to migrate
  • You have diverse workloads with varying resource requirements
  • You want faster scaling (sub-60-second node provisioning)
  • You want automatic instance type selection and consolidation
  • You are comfortable with Karpenter’s NodePool/EC2NodeClass model

For organizations evaluating these options, our guide on Kubernetes cost optimization strategies covers the financial implications of each approach in detail.

Protecting Critical Workloads During Scale-Down

Regardless of which autoscaler you use, protect critical workloads from eviction during scale-down:

metadata:
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"

This annotation tells Cluster Autoscaler to never evict this pod’s node during scale-down. Use it for long-running batch jobs, ML training workloads, or stateful applications where interruption is costly.

Also configure PodDisruptionBudgets to ensure minimum availability during voluntary disruptions:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: critical-app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: critical-app

Common Mistakes to Avoid

Based on our experience across hundreds of EKS deployments, these are the most frequent errors we encounter:

  1. Assuming ASGs auto-scale based on pod demand. They do not. You need CA or Karpenter.
  2. Running ASG scaling policies alongside Cluster Autoscaler. This creates scaling conflicts. Remove ASG policies and let CA control DesiredCapacity.
  3. Creating too many small node groups. Each additional node group increases CA scan time and complexity. Consolidate where possible.
  4. Mismatched instance types in MixedInstancePolicies. All instance types must have similar CPU, memory, and GPU. Use the EC2 Instance Selector to verify compatibility.
  5. Mixing Spot and On-Demand in the same ASG. Use separate ASGs with taints on Spot nodes.
  6. Not setting resource requests on pods. CA makes scaling decisions based on resource requests, not actual usage. Without requests, CA cannot determine if a node is underutilized.
  7. Using a CA version that does not match the Kubernetes version. Cross-version compatibility is not tested. Always match the CA minor version to your cluster version.

Teams working with AWS managed services often benefit from our guidance on avoiding these pitfalls early in their EKS journey.

Summary

The relationship between Cluster Autoscaler and Auto Scaling Groups is not either/or—it is a layered architecture where each component has a specific role. The ASG manages EC2 instance lifecycle. The Cluster Autoscaler provides the Kubernetes-aware intelligence that tells the ASG when and how to scale. Without CA (or Karpenter), your ASG is just a static pool of instances that has no idea your pods are stuck in Pending.

For teams building new clusters, Karpenter offers a simpler, faster alternative that eliminates the ASG layer entirely. For existing clusters with well-established ASG infrastructure, Cluster Autoscaler remains a proven, reliable choice—as long as you configure it correctly.

If you are evaluating your EKS autoscaling strategy, our Kubernetes consulting team can help you design the right approach for your workloads, whether that means tuning your existing CA configuration or migrating to Karpenter.


Get Expert Help with EKS Autoscaling

Configuring Cluster Autoscaler and Auto Scaling Groups correctly on EKS requires deep understanding of both Kubernetes scheduling and AWS infrastructure.

Our team provides expert EKS consulting services to help you:

  • Configure Cluster Autoscaler and ASGs for optimal scaling on EKS
  • Migrate from CA+ASG to Karpenter for faster, more cost-efficient scaling
  • Implement Spot instance strategies with proper ASG configuration for up to 90% savings

We have architected and optimized EKS autoscaling for 100+ production clusters.

Get a free EKS autoscaling assessment →
