
How to Check Node CPU and Memory Utilization in Kubernetes

Engineering Team

Monitoring resource utilization is fundamental to maintaining healthy Kubernetes clusters. Whether you’re troubleshooting performance issues, planning capacity, or optimizing costs, understanding how to check node CPU and memory usage is essential for any Kubernetes administrator or developer.

Understanding Kubernetes Node Metrics

Before diving into commands, it’s important to understand what you’re measuring. Kubernetes nodes report two key resource metrics:

  • CPU utilization: Measured in millicores (m), where 1000m equals one full CPU core
  • Memory utilization: Measured in bytes, typically displayed as Mi (mebibytes) or Gi (gibibytes)

These metrics reflect actual resource consumption on your worker nodes, helping you identify bottlenecks and optimize workload placement. For comprehensive cluster management, consider exploring our Kubernetes consulting services for expert guidance.
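
To see what a node offers in these units, you can read its capacity and allocatable fields straight from the API (the node name below is illustrative):

kubectl get node node-1 -o jsonpath='{.status.capacity.cpu}{" cpu / "}{.status.capacity.memory}{" memory\n"}'

Capacity is the raw hardware total; allocatable (under .status.allocatable) is what remains for pods after system reservations.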

Method 1: Using kubectl top Command

The most straightforward way to check node resource utilization is using the kubectl top command. This requires the Metrics Server to be installed in your cluster.

Check All Nodes

kubectl top nodes

This command displays CPU and memory usage for all nodes in your cluster:

NAME           CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
node-1         850m         42%    3200Mi          40%
node-2         1200m        60%    4800Mi          60%
node-3         450m         22%    2100Mi          26%

Check Specific Node

kubectl top node node-1

This focused view helps when investigating specific node issues or validating changes after optimization work.
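
When you are comparing nodes rather than inspecting a single one, recent kubectl versions can also sort the output by usage, which makes the busiest nodes easy to spot:

kubectl top nodes --sort-by=cpu
kubectl top nodes --sort-by=memory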

Method 2: Installing Metrics Server

If kubectl top returns an error about metrics not being available, you need to install the Metrics Server. This lightweight component collects resource metrics from kubelets and exposes them through the Kubernetes API.

Quick Installation

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

For production environments, you may need to modify the deployment with additional flags:

kubectl patch deployment metrics-server -n kube-system --type='json' -p='[
  {
    "op": "add",
    "path": "/spec/template/spec/containers/0/args/-",
    "value": "--kubelet-insecure-tls"
  }
]'
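
Another flag that is sometimes needed, for example when nodes register with addresses that are not reachable from the metrics-server pod, is --kubelet-preferred-address-types; whether you need it depends on your environment:

kubectl patch deployment metrics-server -n kube-system --type='json' -p='[
  {
    "op": "add",
    "path": "/spec/template/spec/containers/0/args/-",
    "value": "--kubelet-preferred-address-types=InternalIP,Hostname,ExternalIP"
  }
]'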

Verify the installation:

kubectl get deployment metrics-server -n kube-system

The official Kubernetes documentation provides additional configuration options for different cluster setups.

Method 3: Using kubectl describe node

For detailed node information including capacity, allocatable resources, and current usage, use the describe command:

kubectl describe node node-1

This comprehensive output includes:

Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests      Limits
  --------           --------      ------
  cpu                1500m (75%)   3000m (150%)
  memory             6Gi (75%)     8Gi (100%)

This view helps you understand resource requests versus limits, which is crucial for Kubernetes cost optimization strategies.
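
If you only need the allocation summary, filtering the describe output keeps it readable (the number of lines printed after the match is approximate):

kubectl describe node node-1 | grep -A 8 "Allocated resources"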

Method 4: Querying the Metrics API Directly

For automation or custom monitoring solutions, you can query the Metrics API directly using kubectl proxy:

kubectl proxy &
curl http://localhost:8001/apis/metrics.k8s.io/v1beta1/nodes

The JSON response contains detailed metrics for all nodes:

{
  "kind": "NodeMetricsList",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "items": [
    {
      "metadata": {
        "name": "node-1"
      },
      "timestamp": "2024-01-15T10:30:00Z",
      "window": "30s",
      "usage": {
        "cpu": "850m",
        "memory": "3355443200"
      }
    }
  ]
}

This programmatic access enables integration with custom dashboards or alerting systems.
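
You can also skip the proxy and let kubectl issue the raw API request itself; the formatting step below assumes jq is installed on your workstation:

kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" | jq -r '.items[] | [.metadata.name, .usage.cpu, .usage.memory] | @tsv'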

Method 5: Production-Grade Monitoring with Prometheus

For production environments, relying solely on kubectl commands isn’t sufficient. Implementing comprehensive observability with Prometheus and Grafana provides historical data, alerting, and better visualization.

Installing Prometheus

If you haven’t set up Prometheus yet, follow our detailed guide on installing Prometheus on Kubernetes for a complete walkthrough.

Key Prometheus Queries for Node Metrics

Once Prometheus is running, use these PromQL queries:

CPU utilization percentage:

100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

Memory utilization percentage:

(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100

Available memory:

node_memory_MemAvailable_bytes / 1024 / 1024 / 1024
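
The same expressions can be run from scripts against the Prometheus HTTP API; the URL below is a placeholder for your own Prometheus endpoint:

curl -s 'http://prometheus.monitoring.svc:9090/api/v1/query' \
  --data-urlencode 'query=(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100'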

These queries provide real-time insights and can trigger alerts when thresholds are exceeded. For expert help implementing observability, check our Prometheus support services.

Method 6: Using Kubernetes Dashboard

The Kubernetes Dashboard offers a visual interface for monitoring cluster resources. After installation, access the dashboard to view:

  • Real-time CPU and memory graphs for each node
  • Pod distribution across nodes
  • Resource allocation and utilization trends

Install the dashboard:

kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml

Access it via kubectl proxy:

kubectl proxy

Then navigate to: http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/
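
Recent dashboard versions require a bearer token to log in. A minimal way to create one, assuming Kubernetes 1.24+ and a read-only role (the account name is illustrative):

kubectl -n kubernetes-dashboard create serviceaccount dashboard-viewer
kubectl create clusterrolebinding dashboard-viewer --clusterrole=view --serviceaccount=kubernetes-dashboard:dashboard-viewer
kubectl -n kubernetes-dashboard create token dashboard-viewer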

Method 7: Node-Level Monitoring with cAdvisor

The kubelet on every node embeds cAdvisor (Container Advisor), which collects and exposes container resource usage. Access cAdvisor metrics directly:

kubectl get --raw /api/v1/nodes/node-1/proxy/metrics/cadvisor

This raw metrics endpoint provides detailed container-level statistics that feed into higher-level monitoring systems.
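
The output is large, so filtering for a specific metric family is usually more practical; container_memory_working_set_bytes, for example, is the value the kubelet uses when making memory eviction decisions:

kubectl get --raw /api/v1/nodes/node-1/proxy/metrics/cadvisor | grep container_memory_working_set_bytes | head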

Interpreting Resource Utilization Data

Understanding the numbers is as important as collecting them. Here’s how to interpret common scenarios:

High CPU Utilization (>80%)

  • Immediate action: Check if pods are properly distributed using kubectl get pods -o wide
  • Investigation: Identify CPU-intensive pods with kubectl top pods --all-namespaces, as shown below
  • Resolution: Consider horizontal pod autoscaling or adding nodes
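
A quick way to surface the heaviest consumers during a CPU spike (sorting requires a reasonably recent kubectl):

kubectl top pods --all-namespaces --sort-by=cpu | head -15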

High Memory Utilization (>85%)

  • Risk: Memory pressure can trigger pod evictions (see the check after this list)
  • Check: Review memory requests and limits for all pods
  • Action: Adjust resource specifications or scale your cluster
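
To confirm whether the kubelet is already reporting pressure, you can read each node's MemoryPressure condition directly:

kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="MemoryPressure")].status}{"\n"}{end}'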

For teams struggling with resource management, our Kubernetes production support can help optimize your cluster configuration.

Best Practices for Node Monitoring

Implementing effective node monitoring requires following these proven practices:

Set Up Alerts

Configure alerts for critical thresholds:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: node-alerts
spec:
  groups:
  - name: node
    rules:
    - alert: NodeMemoryPressure
      expr: (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) > 0.85
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Node memory usage above 85%"

Track Trends Over Time

Spot checks are useful, but trend analysis prevents issues before they impact users. Historical data helps with:

  • Capacity planning decisions
  • Identifying gradual resource leaks
  • Optimizing cost by rightsizing nodes

Correlate with Application Metrics

Node metrics alone don’t tell the complete story. Combine them with application-level metrics to understand:

  • Which applications consume the most resources
  • Whether resource usage correlates with traffic patterns
  • If resource limits are appropriately configured

Learn more about comprehensive monitoring in our article on observability best practices.

Troubleshooting Common Issues

Metrics Server Not Available

If kubectl top fails with “Metrics API not available”:

  1. Verify metrics-server is running: kubectl get pods -n kube-system | grep metrics-server
  2. Check logs: kubectl logs -n kube-system deployment/metrics-server
  3. Ensure kubelet ports are accessible (10250 for metrics)
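
If the pod looks healthy but the API still reports unavailable, check the APIService registration that metrics-server creates; its Available condition usually points at the underlying problem:

kubectl get apiservice v1beta1.metrics.k8s.io
kubectl describe apiservice v1beta1.metrics.k8s.io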

Inaccurate or Missing Metrics

When metrics seem incorrect:

  • Verify node clock synchronization (NTP)
  • Check kubelet is running: systemctl status kubelet
  • Review kubelet configuration for metric collection settings

Performance Impact of Monitoring

Excessive monitoring can consume resources. Balance observability with overhead:

  • Use appropriate scrape intervals (30-60 seconds for most cases)
  • Limit metric retention periods
  • Aggregate metrics at the edge before sending to central systems

The CNCF’s monitoring guidance offers additional recommendations for production environments.

Advanced Monitoring Scenarios

Multi-Cluster Monitoring

Managing multiple clusters requires centralized monitoring:

# Switch the current kubeconfig context to the monitoring namespace
kubectl config set-context --current --namespace=monitoring

# Deploy Thanos Query to aggregate metrics across clusters (manifest name is illustrative)
kubectl apply -f thanos-query-deployment.yaml

For multi-cluster setups, explore our Thanos support services for scalable monitoring architecture.

Custom Resource Metrics

Extend monitoring beyond CPU and memory:

apiVersion: v1
kind: Service
metadata:
  name: custom-metrics
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
spec:
  selector:
    app: custom-metrics   # illustrative; must match your exporter's pod labels
  ports:
  - port: 8080
    targetPort: 8080
With annotations like these, a Prometheus server configured for annotation-based service discovery scrapes the service's metrics endpoint, letting you track disk I/O, network bandwidth, or application-specific metrics alongside CPU and memory.

Automated Remediation

Combine monitoring with automation for self-healing clusters:

# Example: Auto-cordon and drain a node when CPU utilization exceeds 90%
NODE="node-1"
CPU_PCT=$(kubectl top node "$NODE" --no-headers | awk '{print $3}' | tr -d '%')
if [ "$CPU_PCT" -gt 90 ]; then
  kubectl cordon "$NODE"
  kubectl drain "$NODE" --ignore-daemonsets --delete-emptydir-data
fi

For sophisticated automation strategies, review our DevOps consulting services.

Cost Optimization Through Monitoring

Effective monitoring directly impacts your cloud spending. By analyzing node utilization patterns:

  • Identify overprovisioned nodes: Nodes consistently below 40% utilization can be downsized
  • Optimize pod placement: Use node affinity and taints to maximize density
  • Implement autoscaling: Scale nodes based on actual demand rather than peak capacity

Our Kubernetes cost optimization guide provides detailed strategies for reducing cluster expenses while maintaining performance.

Conclusion

Monitoring node CPU and memory utilization in Kubernetes is straightforward with the right tools and approaches. Start with kubectl top for quick checks, implement Metrics Server for API access, and graduate to Prometheus for production-grade observability. Regular monitoring prevents resource exhaustion, enables proactive capacity planning, and helps maintain optimal cluster performance.

Remember that monitoring is not a one-time setup but an ongoing practice. As your applications evolve and traffic patterns change, continuously refine your monitoring strategy to maintain visibility into your cluster’s health. Whether you’re managing a small development cluster or a large-scale production environment, these techniques provide the foundation for reliable Kubernetes operations.

For teams looking to enhance their Kubernetes operations with expert guidance, our Kubernetes consulting team can help implement robust monitoring and optimization strategies tailored to your specific needs.
