Understanding Kubernetes taints and tolerations is fundamental to controlling where pods run in your cluster. These mechanisms work together to ensure pods land on the right nodes, prevent resource contention, and maintain workload isolation. According to the CNCF Annual Survey 2023, 96% of organizations are now using or evaluating Kubernetes, making these scheduling primitives essential knowledge for platform teams.
Taints and tolerations act as a repel-and-attract system. Nodes can be “tainted” to repel pods, while pods can declare “tolerations” to override those taints and schedule anyway. This gives you fine-grained control over pod placement without manually assigning nodes.
What Are Kubernetes Taints?
Taints are properties applied to nodes that repel pods unless those pods explicitly tolerate the taint. Think of a taint as a “keep out” sign that only certain pods can ignore.
A taint consists of three components:
- Key: An identifier for the taint (e.g., gpu, dedicated, environment)
- Value: Optional data associated with the key (e.g., true, production, team-a)
- Effect: What happens to pods that don’t tolerate the taint
The three taint effects determine scheduling behavior:
- NoSchedule: Prevents new pods from scheduling on the node unless they tolerate the taint. Existing pods remain unaffected.
- PreferNoSchedule: Soft version of NoSchedule. Kubernetes tries to avoid scheduling pods here but will do so if no other options exist.
- NoExecute: Prevents new pods AND evicts existing pods that don’t tolerate the taint. This is the most aggressive effect.
Here’s how to apply a taint to a node:
kubectl taint nodes node1 gpu=true:NoSchedule
This command taints node1 with key gpu, value true, and effect NoSchedule. Now only pods with a matching toleration can schedule on this node.
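To confirm the taint is in place, inspect the node:
kubectl describe node node1 | grep Taints
The Taints line in the output should list gpu=true:NoSchedule.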
Common use cases for taints include:
- Dedicating nodes to specific workloads (GPU nodes, high-memory nodes)
- Isolating production from non-production workloads
- Preventing pods from scheduling on nodes undergoing maintenance
- Creating multi-tenant clusters with team-specific nodes
What Are Kubernetes Tolerations?
Tolerations are properties added to pod specifications that allow (but don’t require) pods to schedule on nodes with matching taints. A toleration says “I can tolerate this taint and schedule on nodes that have it.”
A toleration must match the taint’s key, value, and effect to work. Here’s a pod with a toleration:
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  containers:
  - name: cuda-container
    image: nvidia/cuda:11.0-base
  tolerations:
  - key: "gpu"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
This pod tolerates the gpu=true:NoSchedule taint, allowing it to schedule on nodes with that taint.
Tolerations support two operators:
- Equal: The toleration’s key and value must exactly match the taint
- Exists: Only the key must match; value is ignored
The Exists operator is useful for tolerating any value:
tolerations:
- key: "dedicated"
  operator: "Exists"
  effect: "NoSchedule"
This tolerates any taint with key dedicated, regardless of value.
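Going one step further, a toleration with an empty key and the Exists operator matches every taint; this is the pattern some cluster-wide DaemonSets use to run on all nodes. A minimal sketch:
tolerations:
- operator: "Exists"
Use this sparingly, since it bypasses every taint in the cluster.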
How Do Taints and Tolerations Work Together?
The taint-toleration system follows these rules:
- No taint on node: Any pod can schedule there
- Taint on node, no toleration on pod: Pod cannot schedule (NoSchedule), is scheduled there only as a last resort (PreferNoSchedule), or gets evicted (NoExecute)
- Taint on node, matching toleration on pod: Pod can schedule
- Multiple taints on node: Pod must tolerate ALL taints to schedule
Let’s walk through a practical example. Suppose you have GPU nodes that should only run machine learning workloads:
Step 1: Taint the GPU nodes
kubectl taint nodes gpu-node-1 workload=ml:NoSchedule
kubectl taint nodes gpu-node-2 workload=ml:NoSchedule
Step 2: Add tolerations to ML pods
apiVersion: v1
kind: Pod
metadata:
  name: tensorflow-training
spec:
  containers:
  - name: tf-container
    image: tensorflow/tensorflow:latest-gpu
  tolerations:
  - key: "workload"
    operator: "Equal"
    value: "ml"
    effect: "NoSchedule"
Step 3: Regular pods without tolerations won’t schedule on GPU nodes, keeping resources available for ML workloads.
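You can confirm the placement by listing pods with their assigned nodes; the ML pod should appear on one of the GPU nodes, while untolerated pods stay on other nodes or remain Pending if no other capacity exists:
kubectl get pods -o wide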
For advanced scheduling strategies, our Kubernetes consulting services can help design node pools and taints that match your workload requirements.
When Should You Use Taints and Tolerations?
Taints and tolerations solve specific scheduling challenges:
Dedicated Hardware Nodes
When you have specialized hardware (GPUs, FPGAs, high-memory nodes), taints prevent general workloads from consuming these expensive resources:
kubectl taint nodes gpu-node hardware=gpu:NoSchedule
kubectl taint nodes highmem-node hardware=highmem:NoSchedule
Multi-Tenant Isolation
In shared clusters, taints create logical boundaries between teams or customers:
kubectl taint nodes team-a-node tenant=team-a:NoExecute
kubectl taint nodes team-b-node tenant=team-b:NoExecute
Each team’s pods tolerate only their own taint, ensuring workload isolation.
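On the pod side, each team's workloads carry only their own toleration. A sketch for team-a, matching the taint above:
tolerations:
- key: "tenant"
  operator: "Equal"
  value: "team-a"
  effect: "NoExecute"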
Environment Separation
Separate production from staging workloads within the same cluster:
kubectl taint nodes prod-node-1 environment=production:NoSchedule
kubectl taint nodes staging-node-1 environment=staging:NoSchedule
Node Maintenance and Draining
Before maintenance, taint nodes with NoExecute to gracefully evict pods:
kubectl taint nodes node1 maintenance=true:NoExecute
Pods without a matching toleration are evicted and rescheduled elsewhere. Note that kubectl cordon and kubectl drain remain the standard workflow for planned maintenance; tainting with NoExecute is useful when you want toleration-aware eviction, so that selected pods can stay behind.
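For example, if a monitoring or logging agent should keep running during the maintenance window, give it a toleration for the taint above (a minimal sketch; adjust the key and value to match your own taint):
tolerations:
- key: "maintenance"
  operator: "Equal"
  value: "true"
  effect: "NoExecute"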
Common Taint and Toleration Patterns
Kubernetes itself uses taints for critical system functions:
Node Not Ready Taint
When a node becomes unhealthy, Kubernetes automatically taints it:
node.kubernetes.io/not-ready:NoExecute
This evicts pods from failing nodes. Most pods have a default toleration for this taint with a tolerationSeconds value, allowing temporary network hiccups without immediate eviction. Learn more about handling these scenarios in our guide on Kubernetes node not ready troubleshooting.
Master Node Taint
Control plane nodes have a built-in taint to prevent user workloads:
node-role.kubernetes.io/control-plane:NoSchedule
On clusters created before Kubernetes 1.24, you may still see the older node-role.kubernetes.io/master:NoSchedule taint. Only system pods (kube-proxy, CoreDNS, and similar add-ons) tolerate this taint by default.
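For reference, the matching toleration follows the same pattern as any other; the exact tolerations shipped with system components vary by distribution, but a sketch looks like this:
tolerations:
- key: "node-role.kubernetes.io/control-plane"
  operator: "Exists"
  effect: "NoSchedule"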
Custom Application Taints
You can create application-specific taints for canary deployments:
apiVersion: v1
kind: Pod
metadata:
  name: canary-pod
spec:
  containers:
  - name: app
    image: myapp:canary
  tolerations:
  - key: "deployment"
    value: "canary"
    effect: "NoSchedule"
Taint canary nodes with deployment=canary:NoSchedule so that only canary pods schedule there.
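The matching command on the node side would look like this (canary-node-1 is a placeholder name):
kubectl taint nodes canary-node-1 deployment=canary:NoSchedule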
Taints vs Node Affinity: What’s the Difference?
Both taints/tolerations and node affinity control pod placement, but they work differently:
Taints and Tolerations:
- Push model: Nodes repel pods by default
- Negative selection: “Don’t schedule here unless…”
- Node-centric: Nodes control what can run on them
- Best for: Keeping pods OFF nodes
Node Affinity:
- Pull model: Pods attract themselves to nodes
- Positive selection: “I want to run here”
- Pod-centric: Pods choose where to run
- Best for: Putting pods ON specific nodes
Often you’ll use both together. Taint a GPU node to keep regular pods off, then use node affinity in ML pods to prefer that GPU node over others.
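A sketch of that combination, reusing the workload=ml:NoSchedule taint from earlier and assuming the GPU nodes carry an illustrative gpu=true label:
apiVersion: v1
kind: Pod
metadata:
  name: ml-training
spec:
  containers:
  - name: trainer
    image: tensorflow/tensorflow:latest-gpu
  tolerations:
  - key: "workload"
    operator: "Equal"
    value: "ml"
    effect: "NoSchedule"
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        preference:
          matchExpressions:
          - key: gpu
            operator: In
            values:
            - "true"
The toleration gets the pod past the taint, and the preferred node affinity nudges the scheduler toward the labeled GPU nodes without making them a hard requirement.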
For comprehensive cluster design patterns, explore our Kubernetes migration strategy guide.
How to Remove Taints from Nodes
Removing a taint uses the same command with a minus sign:
kubectl taint nodes node1 gpu=true:NoSchedule-
The trailing - removes the taint. To remove all taints with a specific key:
kubectl taint nodes node1 gpu-
This removes all taints with key gpu, regardless of value or effect.
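To double-check that the node is clean, print its taints field directly; the command returns nothing when no taints remain:
kubectl get node node1 -o jsonpath='{.spec.taints}'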
Toleration Seconds: Time-Based Evictions
For NoExecute taints, you can add tolerationSeconds to delay eviction:
tolerations:
- key: "node.kubernetes.io/unreachable"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 300
This pod tolerates the unreachable-node taint for 300 seconds (5 minutes) before being evicted, which prevents unnecessary rescheduling during brief network issues.
Kubernetes adds default tolerations for common node conditions:
- node.kubernetes.io/not-ready:NoExecute with a 300-second tolerationSeconds
- node.kubernetes.io/unreachable:NoExecute with a 300-second tolerationSeconds
You can override these defaults in your pod specs.
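For example, a latency-sensitive service might fail over faster by shortening the window to 60 seconds (a sketch; the trade-off is more eviction churn during brief network blips):
tolerations:
- key: "node.kubernetes.io/not-ready"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 60
- key: "node.kubernetes.io/unreachable"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 60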
Best Practices for Taints and Tolerations
Follow these guidelines for maintainable clusters:
Use Descriptive Taint Keys
Choose clear, consistent naming:
- Good: workload=database, hardware=gpu, environment=production
- Bad: special=true, x=y, node1=abc
Document Your Taints
Maintain a registry of taints used in your cluster. Include:
- Taint key/value/effect
- Purpose and use case
- Which teams or applications use it
- When it was added and by whom
Start with PreferNoSchedule
For new taints, start with PreferNoSchedule to test behavior before enforcing with NoSchedule. This prevents accidental pod scheduling failures.
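For example, rolling out a new database taint in two stages might look like this (db-node-1 and the workload=database key are illustrative):
# Stage 1: soft enforcement while tolerations are added to the intended pods
kubectl taint nodes db-node-1 workload=database:PreferNoSchedule
# Stage 2: enforce strictly, then drop the soft taint
kubectl taint nodes db-node-1 workload=database:NoSchedule
kubectl taint nodes db-node-1 workload=database:PreferNoSchedule-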
Combine with Resource Requests
Taints prevent scheduling on the wrong nodes, but they don’t guarantee resources. Always set resource requests, and note that extended resources such as nvidia.com/gpu must be specified in limits (Kubernetes uses the limit as the request for them):
resources:
  requests:
    memory: "16Gi"
  limits:
    nvidia.com/gpu: 1
Audit Tolerations Regularly
Periodically review which pods tolerate which taints. Orphaned tolerations can lead to unexpected scheduling.
Test Eviction Behavior
Before using NoExecute in production, test eviction timing and pod disruption budgets in staging. According to a Datadog report on Kubernetes adoption, misconfigured eviction policies are a leading cause of service disruptions.
For production-ready Kubernetes configurations, our Kubernetes production support team can audit your taints and tolerations.
Troubleshooting Taint and Toleration Issues
Common problems and solutions:
Pod Stuck in Pending State
Symptom: Pod shows Pending status indefinitely.
Diagnosis:
kubectl describe pod <pod-name>
Look for events like “0/5 nodes are available: 5 node(s) had taint {key: value}, that the pod didn’t tolerate.”
Solution: Add matching toleration to pod spec or remove taint from nodes.
Pods Evicted Unexpectedly
Symptom: Running pods suddenly terminate.
Diagnosis: Check if a NoExecute taint was added:
kubectl describe node <node-name> | grep Taints
Solution: Either add tolerations with tolerationSeconds or remove the taint.
Toleration Not Working
Symptom: Pod has toleration but still won’t schedule.
Common causes:
- Typo in key, value, or effect
- Wrong operator (Equal vs Exists)
- Other scheduling constraints (affinity, resource limits)
Solution: Verify exact taint-toleration match:
kubectl get nodes -o json | jq '.items[].spec.taints'
kubectl get pod <pod-name> -o json | jq '.spec.tolerations'
Frequently Asked Questions
What is the difference between taints and tolerations in Kubernetes?
Taints are applied to nodes to repel pods, while tolerations are added to pods to allow them to schedule on tainted nodes. Taints push pods away; tolerations let specific pods ignore that push.
Can a pod have multiple tolerations?
Yes, pods can have unlimited tolerations. If a node has multiple taints, the pod must tolerate all of them to schedule there.
Do tolerations guarantee pod placement?
No. Tolerations only allow scheduling on tainted nodes. They don’t force it. Use node affinity or node selectors to prefer specific nodes.
What happens if I taint a node with NoExecute and pods are already running?
Pods without matching tolerations are immediately evicted. Pods with tolerations and tolerationSeconds are evicted after that duration expires.
Can I use taints for autoscaling?
Yes. Taint new nodes during scale-up to control which pods land there first. Cluster autoscaler respects taints when determining if a node can satisfy pending pods. For cost optimization strategies, see our Kubernetes cost optimization guide.
Conclusion
Understanding Kubernetes taints and tolerations empowers you to build sophisticated scheduling policies that match your operational needs. Taints let nodes declare what they won’t run, while tolerations let pods override those restrictions. Together, they enable dedicated hardware pools, multi-tenant isolation, environment separation, and graceful maintenance workflows.
Start with simple use cases—dedicating GPU nodes or separating environments—then expand to more complex patterns as your cluster matures. Always document your taints, test eviction behavior, and combine with other scheduling primitives like affinity and resource requests for robust workload placement.
Need expert help designing Kubernetes scheduling strategies for your production clusters? Our Kubernetes consulting team specializes in cluster optimization, cost reduction, and operational excellence. We’ve helped organizations save over $250K through intelligent resource allocation and scheduling policies.