If you have ever created an EKS node group and assumed your cluster would automatically scale based on pod demand, you are not alone. It is one of the most common misconceptions in the Kubernetes-on-AWS ecosystem. The reality is that Auto Scaling Groups (ASGs) and the Kubernetes Cluster Autoscaler (CA) serve fundamentally different roles, and understanding the distinction is critical to building a production-ready EKS cluster.
We have configured both Cluster Autoscaler and Auto Scaling Groups across 100+ EKS clusters for clients in healthcare, fintech, and e-commerce. In this post, we break down exactly how they work, how they work together, and when to use each approach—including the modern alternative, Karpenter, that bypasses ASGs entirely.
The #1 Misconception: ASGs Do NOT Auto-Scale Based on Pod Demand
Let us clear this up immediately: an Auto Scaling Group on its own will not scale your EKS nodes in response to Kubernetes pod scheduling needs. When you create an EKS managed node group, AWS creates an associated ASG behind the scenes. That ASG handles instance lifecycle—launching and terminating EC2 instances. But it has zero awareness of Kubernetes pods, resource requests, or scheduling constraints.
Without a Kubernetes-aware autoscaler like the Cluster Autoscaler or Karpenter, your ASG sits at whatever desired count you set and stays there. Pods will go into a Pending state with FailedScheduling events, and nothing in the AWS infrastructure layer will respond to that signal.
This is the key insight: the ASG is the muscle, but it needs a brain that understands Kubernetes. That brain is the Cluster Autoscaler.
Quick Comparison: Cluster Autoscaler vs Auto Scaling Group
Before going deeper, here is a side-by-side comparison:
| Feature | Cluster Autoscaler (CA) | Auto Scaling Group (ASG) |
|---|---|---|
| What it is | Kubernetes controller (runs as a pod) | AWS infrastructure resource |
| Awareness | Kubernetes-aware (pods, resource requests, taints, affinity) | Infrastructure-aware (CPU/memory metrics, instance health) |
| Scaling trigger | Unschedulable pods or underutilized nodes | CloudWatch alarms, target tracking policies, or manual adjustment |
| Scale-up mechanism | Modifies ASG DesiredCapacity | Launches EC2 instances to meet desired count |
| Scale-down mechanism | Cordons, drains nodes, then reduces ASG DesiredCapacity | Terminates instances based on termination policies |
| Pod scheduling | Understands pod affinity, taints, tolerations, resource requests | No concept of pods or Kubernetes scheduling |
| Configuration | Kubernetes Deployment with CLI flags | AWS Console, CLI, CloudFormation, or Terraform |
| Scope | Cluster-level (across multiple ASGs/node groups) | Single ASG |
The critical takeaway: CA and ASG are not alternatives to each other—they are complementary layers. CA makes the scaling decisions; ASG executes them at the infrastructure level.
How Cluster Autoscaler and ASGs Work Together
Understanding the end-to-end scaling workflow removes a lot of confusion. Here is exactly what happens when your cluster needs more capacity:
Scale-Up Flow
- Pod scheduling fails — A new pod is created (via Deployment, Job, etc.) but the Kubernetes scheduler cannot find a node with sufficient CPU, memory, or matching taints/labels.
- CA detects pending pods — The Cluster Autoscaler runs a scan loop (every 10 seconds by default) and identifies pods in `Pending` state with `FailedScheduling` events.
- CA simulates scheduling — CA evaluates each node group to determine which one could accommodate the pending pods. It considers instance types, labels, taints, and resource capacity.
- CA adjusts ASG DesiredCapacity — CA calls the AWS Auto Scaling API to increase the `DesiredCapacity` of the selected ASG.
- ASG launches instances — The ASG provisions new EC2 instances based on its launch template or launch configuration.
- Nodes join the cluster — The new instances run the EKS bootstrap script, register with the Kubernetes API server, and become `Ready` nodes.
- Pods are scheduled — The Kubernetes scheduler places the pending pods onto the newly available nodes.
This entire process typically takes 2-5 minutes, depending on instance type and AMI caching.
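You can watch each step of this flow in a live cluster. A quick diagnostic sketch, assuming CA is deployed as `cluster-autoscaler` in `kube-system` and your AWS CLI has read access (resource names may differ in your setup):

```shell
# Pods stuck waiting for capacity
kubectl get pods --all-namespaces --field-selector=status.phase=Pending

# Why they are stuck (look for FailedScheduling)
kubectl get events --all-namespaces --field-selector=reason=FailedScheduling

# CA's scale-up decisions in its logs
kubectl -n kube-system logs deployment/cluster-autoscaler --tail=50

# Confirm the ASG's DesiredCapacity was bumped
aws autoscaling describe-auto-scaling-groups \
  --query 'AutoScalingGroups[].{Name:AutoScalingGroupName,Desired:DesiredCapacity,Min:MinSize,Max:MaxSize}' \
  --output table
```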
Scale-Down Flow
- CA identifies underutilized nodes — During its scan loop, CA checks if node utilization (based on resource requests, not actual usage) falls below the `scale-down-utilization-threshold` (default: 50%).
- Cool-down period — The node must remain underutilized for `scale-down-unneeded-time` (default: 10 minutes).
- CA verifies safe eviction — CA checks for pods with PodDisruptionBudgets, local storage, or the `cluster-autoscaler.kubernetes.io/safe-to-evict: "false"` annotation.
- CA cordons and drains — The node is cordoned (no new pods scheduled) and existing pods are gracefully evicted.
- CA reduces ASG DesiredCapacity — CA calls the ASG API to decrement the desired count.
- ASG terminates the instance — The EC2 instance is terminated based on the ASG’s termination policy.
Why ASG Scaling Policies Conflict with Cluster Autoscaler
Here is a mistake we see regularly: teams configure both Cluster Autoscaler and ASG scaling policies (target tracking or step scaling) on the same node group. This creates a tug-of-war.
The Conflict
ASG target tracking policies scale based on CloudWatch metrics like average CPU utilization. If you set a target of 60% CPU utilization on your ASG, the ASG will try to add or remove instances to maintain that target—completely independent of what the Cluster Autoscaler is doing.
The result:
- CA scales up the ASG to accommodate pending pods, increasing `DesiredCapacity` to 10.
- ASG target tracking sees that average CPU across those 10 nodes is only 40%, decides the group is over-provisioned, and scales it back down to 7.
- Pods go Pending again, CA scales back up, and the cycle repeats.
The Fix
Remove all ASG scaling policies when using Cluster Autoscaler. The CA should be the sole entity managing `DesiredCapacity`. The ASG’s role is reduced to instance provisioning—it should only define `MinSize` and `MaxSize`, and let CA control everything in between.
If you need metric-based scaling for non-Kubernetes workloads on the same ASG (which we do not recommend), use separate ASGs for Kubernetes and non-Kubernetes instances.
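To audit an existing node group for conflicting policies, here is a sketch using the AWS CLI. The ASG and policy names are hypothetical; run `describe-policies` first so you only delete what you intend to:

```shell
ASG_NAME="eks-my-nodegroup-asg"   # hypothetical name; substitute your own

# List any scaling policies attached to the ASG
aws autoscaling describe-policies \
  --auto-scaling-group-name "$ASG_NAME" \
  --query 'ScalingPolicies[].{Name:PolicyName,Type:PolicyType}' \
  --output table

# Remove a conflicting policy so CA alone controls DesiredCapacity
aws autoscaling delete-policy \
  --auto-scaling-group-name "$ASG_NAME" \
  --policy-name cpu-target-tracking   # hypothetical policy name
```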
Deploying Cluster Autoscaler on EKS
Here is a production-ready Cluster Autoscaler deployment manifest with the key configuration flags we recommend:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      serviceAccountName: cluster-autoscaler
      priorityClassName: system-cluster-critical
      containers:
        - name: cluster-autoscaler
          image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.31.0
          command:
            - ./cluster-autoscaler
            - --v=4
            - --stderrthreshold=info
            - --cloud-provider=aws
            - --skip-nodes-with-local-storage=false
            - --expander=least-waste
            - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-cluster
            - --balance-similar-node-groups
            - --scale-down-utilization-threshold=0.5
            - --scale-down-unneeded-time=10m
            - --scale-down-delay-after-add=10m
            - --scan-interval=10s
            - --max-node-provision-time=15m
          resources:
            requests:
              cpu: 100m
              memory: 600Mi
            limits:
              cpu: 100m
              memory: 600Mi
          env:
            - name: AWS_REGION
              value: eu-west-1
```
Key Flags Explained
- `--node-group-auto-discovery`: Discovers ASGs automatically by tag, so you do not need to hardcode ASG names. Tag your ASGs with `k8s.io/cluster-autoscaler/enabled=true` and `k8s.io/cluster-autoscaler/<cluster-name>=owned`.
- `--expander=least-waste`: When multiple node groups can accommodate pending pods, choose the one that results in the least wasted resources. The Cluster Autoscaler GitHub repository documents all available expanders.
- `--balance-similar-node-groups`: Keeps node counts balanced across node groups with identical scheduling properties—essential for multi-AZ deployments.
- `--scan-interval=10s`: How frequently CA checks for pending pods. The default is 10 seconds, but you can increase this to reduce API load (see tuning section below).
- `--scale-down-unneeded-time=10m`: A node must be underutilized for 10 minutes before CA considers it for removal.
Required ASG Tags
Your Auto Scaling Groups must have these tags for auto-discovery to work:
```json
{
  "Tags": [
    {
      "Key": "k8s.io/cluster-autoscaler/enabled",
      "Value": "true"
    },
    {
      "Key": "k8s.io/cluster-autoscaler/my-cluster",
      "Value": "owned"
    },
    {
      "Key": "kubernetes.io/cluster/my-cluster",
      "Value": "owned"
    }
  ]
}
```
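If your ASGs were created outside of EKS managed node groups, you can attach these tags with the AWS CLI. A sketch (the ASG name `my-asg` is a placeholder; `PropagateAtLaunch` can stay `false` because CA reads the tags from the ASG itself, not from the instances):

```shell
aws autoscaling create-or-update-tags --tags \
  "ResourceId=my-asg,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/enabled,Value=true,PropagateAtLaunch=false" \
  "ResourceId=my-asg,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/my-cluster,Value=owned,PropagateAtLaunch=false"
```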
For scaling from zero (where no running nodes exist in the ASG), you also need template tags so CA knows what instance types the ASG provides:
```json
{
  "Tags": [
    {
      "Key": "k8s.io/cluster-autoscaler/node-template/label/node.kubernetes.io/instance-type",
      "Value": "m5.xlarge"
    },
    {
      "Key": "k8s.io/cluster-autoscaler/node-template/resources/cpu",
      "Value": "4"
    },
    {
      "Key": "k8s.io/cluster-autoscaler/node-template/resources/memory",
      "Value": "16Gi"
    }
  ]
}
```
EKS Managed Node Groups: AWS-Managed ASGs
EKS Managed Node Groups simplify the ASG layer significantly. When you create a managed node group, AWS:
- Creates and manages the underlying ASG automatically
- Configures the launch template with the correct EKS-optimized AMI
- Handles node draining during updates (rolling updates with configurable surge)
- Tags the ASG for Cluster Autoscaler auto-discovery
- Supports graceful node termination via the node termination handler
This means you get the ASG lifecycle management without having to configure ASGs directly. However, you still need Cluster Autoscaler or Karpenter for pod-aware scaling. The managed node group does not change this fundamental requirement.
For a deeper dive into node group architecture, see our guide on EKS architecture best practices.
Tuning Cluster Autoscaler for Production
Based on the AWS EKS Cluster Autoscaler best practices documentation, here are the tuning parameters that matter most.
Scan Interval Tradeoffs
The `--scan-interval` flag controls how often CA checks for pending pods. The default is 10 seconds, but since launching a new EC2 instance takes 2+ minutes anyway, a more relaxed interval may be appropriate:
| Scan Interval | API Call Volume | Scale-Up Delay |
|---|---|---|
| 10s (default) | baseline | baseline |
| 30s | ~3× fewer calls | ~19% slower |
| 60s | ~6× fewer calls | ~38% slower |
For clusters with 500+ nodes, increasing the scan interval to 30-60 seconds significantly reduces AWS API throttling risk with minimal impact on scaling responsiveness.
Node Group Design
The AWS best practices are clear on this:
- Prefer fewer node groups with many nodes over many node groups with few nodes. This has the single biggest impact on CA scalability.
- All nodes in a group must have identical scheduling properties: same labels, taints, and resource profiles (CPU, memory, GPU).
- Use Namespaces for workload isolation instead of separate node groups.
- Define a single ASG spanning multiple Availability Zones rather than one ASG per AZ (unless you need EBS volume affinity).
MixedInstancePolicies
When using MixedInstancePolicies for cost optimization, ensure all instance types have similar CPU, memory, and GPU capacity. CA uses the first instance type in the policy for scheduling simulation. If subsequent types are smaller, pods may fail to schedule after scale-up because the actual instance has less capacity than expected.
Use the EC2 Instance Selector to find compatible instance types:
```shell
ec2-instance-selector --memory 16 --vcpus 4 --cpu-architecture x86_64 --gpus 0 -r eu-west-1
```
This returns instance types with matching resource profiles (e.g., m5.xlarge, m5a.xlarge, m5n.xlarge, m4.xlarge).
Overprovisioning Strategy
Overprovisioning ensures there is always spare capacity to schedule pods immediately, avoiding the 2-5 minute wait for new nodes. The formula from AWS documentation:
```
overprovisioned_nodes = (average_scale_up_frequency × node_launch_time) + number_of_AZs
```
Example: If you need a new node every 30 seconds and node launch takes 30 seconds, you need 1 overprovisioned node. Add 3 more if you run across 3 AZs for optimal zone selection with pod anti-affinity.
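The arithmetic is easy to get wrong when frequency and launch time use different units. A small sketch of the formula in Python, with units normalized to minutes (the function name is ours, not from any AWS tooling):

```python
import math

def overprovisioned_nodes(scale_ups_per_minute: float,
                          node_launch_minutes: float,
                          num_azs: int) -> int:
    """Spare nodes to keep warm: nodes consumed while replacements
    launch, rounded up, plus one per AZ for zone-aware scheduling."""
    in_flight = scale_ups_per_minute * node_launch_minutes
    return math.ceil(in_flight) + num_azs

# One new node every 30s (2/min), 30s launch time (0.5 min), 3 AZs:
print(overprovisioned_nodes(2, 0.5, 3))  # -> 4 (1 in-flight + 3 for AZs)
```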
Overprovisioning is implemented using low-priority “pause” pods that occupy space and get preempted when real workloads arrive:
```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: overprovisioning
value: -1
globalDefault: false
description: "Priority class for overprovisioning pods"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: overprovisioning
  namespace: kube-system
spec:
  replicas: 3
  selector:
    matchLabels:
      app: overprovisioning
  template:
    metadata:
      labels:
        app: overprovisioning
    spec:
      priorityClassName: overprovisioning
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9
          resources:
            requests:
              cpu: "2"
              memory: 4Gi
```
When a real pod needs scheduling, Kubernetes preempts the pause pod, which immediately frees up capacity. The pause pod then goes Pending, triggering CA to scale up a new node—but the real pod is already running.
Spot Instance Strategy with ASGs
Spot instances can reduce compute costs by up to 90%, but they require careful ASG configuration. Based on our experience and AWS best practices, follow these rules:
Separate On-Demand and Spot ASGs
Never mix On-Demand and Spot instances in the same ASG. They have fundamentally different scheduling properties:
- Spot nodes should carry a taint (e.g., `spotInstance=true:PreferNoSchedule`) so workloads must explicitly tolerate interruption risk.
- On-Demand nodes should run critical, interruption-sensitive workloads.
Use the Priority Expander for Spot Preference
Configure CA to prefer Spot node groups, falling back to On-Demand when Spot capacity is unavailable:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-autoscaler-priority-expander
  namespace: kube-system
data:
  priorities: |-
    10:
      - .*on-demand.*
    50:
      - .*spot.*
```
CA will attempt to scale the Spot node group first—higher numbers mean higher priority, and the priority expander only reads this ConfigMap when CA runs with `--expander=priority`. If no Spot capacity is available within `--max-node-provision-time` (default 15 minutes), it falls back to the On-Demand group.
Maximize Instance Diversity
Use MixedInstancePolicies with 10+ instance types across multiple families to maximize Spot pool access. More instance type diversity means lower interruption rates and better capacity availability.
Karpenter: The Modern Alternative That Bypasses ASGs
Karpenter takes a fundamentally different approach. Instead of managing ASGs, Karpenter provisions EC2 instances directly using the EC2 Fleet API. This eliminates the ASG layer entirely and brings several advantages:
- Faster provisioning — Karpenter launches instances in under 60 seconds (vs. 2-5 minutes with CA + ASG).
- No node group management — You define `NodePool` and `EC2NodeClass` resources instead of pre-configured ASGs.
- Workload-aware instance selection — Karpenter selects the optimal instance type per pod, not per node group.
- Automatic consolidation — Karpenter continuously right-sizes the cluster by replacing underutilized nodes with smaller, better-fitting instances.
Karpenter NodePool Example
```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand", "spot"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ["4"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
  limits:
    cpu: "1000"
    memory: 2000Gi
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
---
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiSelectorTerms:
    - alias: al2023@latest
  role: KarpenterNodeRole-my-cluster
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
  tags:
    Environment: production
```
With this configuration, Karpenter handles everything: instance type selection, AZ placement, Spot vs. On-Demand decisions, and node lifecycle. No ASGs, no node groups, no launch templates.
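Once Karpenter is running, you can inspect what it has provisioned directly with kubectl. A sketch assuming the Karpenter v1 CRDs are installed (the controller namespace may be `karpenter` rather than `kube-system`, depending on how you installed it):

```shell
# NodePools and their current resource usage vs. limits
kubectl get nodepools

# Individual machines Karpenter has launched (instance type, zone, capacity type)
kubectl get nodeclaims -o wide

# Karpenter's provisioning and consolidation decisions
kubectl -n kube-system logs deployment/karpenter --tail=50
```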
When to Use Each Approach
Not every scenario calls for the same solution. Here is our recommendation based on the workloads we have seen:
ASG-Only Scaling (No CA, No Karpenter)
Use this when:
- You are running non-Kubernetes workloads (traditional EC2 applications)
- You have a fixed, predictable number of nodes
- You use ASG scaling policies tied to application-level CloudWatch metrics (e.g., SQS queue depth)
- Kubernetes pod scheduling is not a factor
Cluster Autoscaler + ASG
Use this when:
- You have an existing EKS cluster with well-defined node groups
- Your workloads have predictable instance type requirements
- You need fine-grained control over node group composition
- Your organization has existing Terraform/CloudFormation managing ASGs
- You are running EKS managed node groups and want minimal migration effort
Karpenter (Recommended for New Clusters)
Use this when:
- You are building a new EKS cluster or ready to migrate
- You have diverse workloads with varying resource requirements
- You want faster scaling (sub-60-second node provisioning)
- You want automatic instance type selection and consolidation
- You are comfortable with Karpenter’s NodePool/EC2NodeClass model
For organizations evaluating these options, our guide on Kubernetes cost optimization strategies covers the financial implications of each approach in detail.
Protecting Critical Workloads During Scale-Down
Regardless of which autoscaler you use, protect critical workloads from eviction during scale-down:
```yaml
metadata:
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
```
This annotation tells Cluster Autoscaler never to evict the pod, which in turn prevents scale-down of whichever node it runs on. Use it for long-running batch jobs, ML training workloads, or stateful applications where interruption is costly.
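Note that the annotation belongs on the pod, not on the workload object. In a Deployment, that means `spec.template.metadata.annotations`. A minimal sketch with illustrative names:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-worker        # illustrative name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: batch-worker
  template:
    metadata:
      labels:
        app: batch-worker
      annotations:
        cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
    spec:
      containers:
        - name: worker
          image: busybox:1.36
          command: ["sleep", "infinity"]
```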
Also configure PodDisruptionBudgets to ensure minimum availability during voluntary disruptions:
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: critical-app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: critical-app
```
Common Mistakes to Avoid
Based on our experience across hundreds of EKS deployments, these are the most frequent errors we encounter:
- Assuming ASGs auto-scale based on pod demand. They do not. You need CA or Karpenter.
- Running ASG scaling policies alongside Cluster Autoscaler. This creates scaling conflicts. Remove ASG policies and let CA control `DesiredCapacity`.
- Creating too many small node groups. Each additional node group increases CA scan time and complexity. Consolidate where possible.
- Mismatched instance types in MixedInstancePolicies. All instance types must have similar CPU, memory, and GPU. Use the EC2 Instance Selector to verify compatibility.
- Mixing Spot and On-Demand in the same ASG. Use separate ASGs with taints on Spot nodes.
- Not setting resource requests on pods. CA makes scaling decisions based on resource requests, not actual usage. Without requests, CA cannot determine if a node is underutilized.
- Using a CA version that does not match the Kubernetes version. Cross-version compatibility is not tested. Always match the CA minor version to your cluster version.
Teams working with AWS managed services often benefit from our guidance on avoiding these pitfalls early in their EKS journey.
Summary
The relationship between Cluster Autoscaler and Auto Scaling Groups is not either/or—it is a layered architecture where each component has a specific role. The ASG manages EC2 instance lifecycle. The Cluster Autoscaler provides the Kubernetes-aware intelligence that tells the ASG when and how to scale. Without CA (or Karpenter), your ASG is just a static pool of instances that has no idea your pods are stuck in Pending.
For teams building new clusters, Karpenter offers a simpler, faster alternative that eliminates the ASG layer entirely. For existing clusters with well-established ASG infrastructure, Cluster Autoscaler remains a proven, reliable choice—as long as you configure it correctly.
If you are evaluating your EKS autoscaling strategy, our Kubernetes consulting team can help you design the right approach for your workloads, whether that means tuning your existing CA configuration or migrating to Karpenter.
Get Expert Help with EKS Autoscaling
Configuring Cluster Autoscaler and Auto Scaling Groups correctly on EKS requires deep understanding of both Kubernetes scheduling and AWS infrastructure.
Our team provides expert EKS consulting services to help you:
- Configure Cluster Autoscaler and ASGs for optimal scaling on EKS
- Migrate from CA+ASG to Karpenter for faster, more cost-efficient scaling
- Implement Spot instance strategies with proper ASG configuration for up to 90% savings
We have architected and optimized EKS autoscaling for 100+ production clusters.