Monitoring resource utilization is fundamental to maintaining healthy Kubernetes clusters. Whether you’re troubleshooting performance issues, planning capacity, or optimizing costs, understanding how to check node CPU and memory usage is essential for any Kubernetes administrator or developer.
Understanding Kubernetes Node Metrics
Before diving into commands, it’s important to understand what you’re measuring. Kubernetes nodes report two key resource metrics:
- CPU utilization: Measured in millicores (m), where 1000m equals one full CPU core
- Memory utilization: Measured in bytes, typically displayed as Mi (mebibytes) or Gi (gibibytes)
These metrics reflect actual resource consumption on your worker nodes, helping you identify bottlenecks and optimize workload placement. For comprehensive cluster management, consider exploring our Kubernetes consulting services for expert guidance.
Method 1: Using kubectl top Command
The most straightforward way to check node resource utilization is using the kubectl top command. This requires the Metrics Server to be installed in your cluster.
Check All Nodes
kubectl top nodes
This command displays CPU and memory usage for all nodes in your cluster:
NAME     CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
node-1   850m         42%    3200Mi          40%
node-2   1200m        60%    4800Mi          60%
node-3   450m         22%    2100Mi          26%
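On larger clusters it helps to surface only the hot nodes. A minimal sketch, assuming the `flag_hot_nodes` helper name and the 50% threshold are illustrative (not part of kubectl); sample data matching the output above is piped in so the snippet runs without a cluster:

```shell
# Hypothetical helper: print nodes whose CPU% column exceeds a threshold.
# With a live cluster you would pipe real data: kubectl top nodes | flag_hot_nodes 50
flag_hot_nodes() {
  awk -v limit="$1" 'NR > 1 { gsub(/%/, "", $3); if ($3 + 0 > limit) print $1 }'
}

# Sample data matching the output shown above:
printf 'NAME CPU(cores) CPU%% MEMORY(bytes) MEMORY%%\nnode-1 850m 42%% 3200Mi 40%%\nnode-2 1200m 60%% 4800Mi 60%%\nnode-3 450m 22%% 2100Mi 26%%\n' \
  | flag_hot_nodes 50
```

With the sample data above, only node-2 (60%) exceeds the threshold and is printed.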
Check Specific Node
kubectl top node node-1
This focused view helps when investigating specific node issues or validating changes after optimization work.
Method 2: Installing Metrics Server
If kubectl top returns an error about metrics not being available, you need to install the Metrics Server. This lightweight component collects resource metrics from kubelets and exposes them through the Kubernetes API.
Quick Installation
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
In clusters where kubelets use self-signed certificates (common in test labs and some on-prem setups), you may need to patch the deployment with the --kubelet-insecure-tls flag. Note that this disables TLS verification; for production, configure proper kubelet serving certificates instead:
kubectl patch deployment metrics-server -n kube-system --type='json' -p='[
  {
    "op": "add",
    "path": "/spec/template/spec/containers/0/args/-",
    "value": "--kubelet-insecure-tls"
  }
]'
Verify the installation:
kubectl get deployment metrics-server -n kube-system
The official Kubernetes documentation provides additional configuration options for different cluster setups.
Method 3: Using kubectl describe node
For detailed node information including capacity, allocatable resources, and current usage, use the describe command:
kubectl describe node node-1
This comprehensive output includes:
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource   Requests      Limits
  --------   --------      ------
  cpu        1500m (75%)   3000m (150%)
  memory     6Gi (75%)     8Gi (100%)
This view helps understand resource requests versus limits, which is crucial for Kubernetes cost optimization strategies.
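Comparing requests against node capacity means normalizing Kubernetes quantity strings first, since CPU may appear as either millicores ("1500m") or whole cores ("2"). A minimal sketch, where the `to_millicores` helper is an illustrative name (it handles only integer core counts, not fractional values like "1.5"):

```shell
# Hypothetical helper: normalize Kubernetes CPU quantities to millicores,
# so values like "1500m" and "2" (whole cores) can be compared directly.
to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;            # already millicores: strip the suffix
    *)  echo "$(( $1 * 1000 ))" ;;  # whole cores: multiply by 1000
  esac
}

to_millicores 1500m   # -> 1500
to_millicores 2       # -> 2000
```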
Method 4: Querying the Metrics API Directly
For automation or custom monitoring solutions, you can query the Metrics API directly using kubectl proxy:
kubectl proxy &
curl http://localhost:8001/apis/metrics.k8s.io/v1beta1/nodes
The JSON response contains detailed metrics for all nodes:
{
  "kind": "NodeMetricsList",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "items": [
    {
      "metadata": {
        "name": "node-1"
      },
      "timestamp": "2024-01-15T10:30:00Z",
      "window": "30s",
      "usage": {
        "cpu": "850m",
        "memory": "3355443200"
      }
    }
  ]
}
This programmatic access enables integration with custom dashboards or alerting systems.
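For scripting, kubectl get --raw reaches the same endpoint without running a proxy, and the response can be parsed with standard tools. A sketch assuming python3 is available; a trimmed sample of the response is used here so the snippet runs without a cluster:

```shell
# With a live cluster: response=$(kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes)
# Here we use a trimmed sample of the JSON response shown above.
response='{"kind":"NodeMetricsList","items":[{"metadata":{"name":"node-1"},"usage":{"cpu":"850m","memory":"3355443200"}}]}'

# Extract name, CPU, and memory per node with python3's json module.
echo "$response" | python3 -c '
import json, sys
for item in json.load(sys.stdin)["items"]:
    u = item["usage"]
    print(item["metadata"]["name"], u["cpu"], u["memory"])
'
```

With the sample response, this prints one line per node: `node-1 850m 3355443200`.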
Method 5: Production-Grade Monitoring with Prometheus
For production environments, relying solely on kubectl commands isn’t sufficient. Implementing comprehensive observability with Prometheus and Grafana provides historical data, alerting, and better visualization.
Installing Prometheus
If you haven’t set up Prometheus yet, follow our detailed guide on installing Prometheus on Kubernetes for a complete walkthrough.
Key Prometheus Queries for Node Metrics
Once Prometheus is running, use these PromQL queries:
CPU utilization percentage:
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
Memory utilization percentage:
(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100
Available memory:
node_memory_MemAvailable_bytes / 1024 / 1024 / 1024
These queries provide real-time insights and can trigger alerts when thresholds are exceeded. For expert help implementing observability, check our Prometheus support services.
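These queries can also be run programmatically against Prometheus's HTTP API. A sketch, where the service address is an assumption (substitute your own endpoint); outside a cluster the request simply falls through to the fallback message:

```shell
# PROM_URL is an assumption; substitute your Prometheus service address.
PROM_URL="http://prometheus.monitoring.svc:9090"
QUERY='100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)'

# --data-urlencode handles the spaces and braces in PromQL safely.
curl -sG "$PROM_URL/api/v1/query" --data-urlencode "query=$QUERY" \
  || echo "Prometheus not reachable at $PROM_URL"
```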
Method 6: Using Kubernetes Dashboard
The Kubernetes Dashboard offers a visual interface for monitoring cluster resources. After installation, access the dashboard to view:
- Real-time CPU and memory graphs for each node
- Pod distribution across nodes
- Resource allocation and utilization trends
Install the dashboard:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml
Access it via kubectl proxy:
kubectl proxy
Then navigate to: http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/
Method 7: Node-Level Monitoring with cAdvisor
Kubernetes includes cAdvisor (Container Advisor) by default, which collects and exposes container resource usage. Access cAdvisor metrics directly:
kubectl get --raw /api/v1/nodes/node-1/proxy/metrics/cadvisor
This raw metrics endpoint provides detailed container-level statistics that feed into higher-level monitoring systems.
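The cAdvisor endpoint returns Prometheus text format, so individual metric families can be pulled out with grep. A sketch using sample lines so it runs without a cluster (the live command is shown in the comment; `container_memory_working_set_bytes` is a standard cAdvisor metric):

```shell
# With a live cluster:
#   kubectl get --raw /api/v1/nodes/node-1/proxy/metrics/cadvisor \
#     | grep '^container_memory_working_set_bytes'
# Sample lines for illustration:
printf '# HELP container_memory_working_set_bytes Current working set in bytes.\ncontainer_memory_working_set_bytes{pod="api-7f9c"} 8.5e+08\ncontainer_cpu_usage_seconds_total{pod="api-7f9c"} 123.4\n' \
  | grep '^container_memory_working_set_bytes'
```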
Interpreting Resource Utilization Data
Understanding the numbers is as important as collecting them. Here’s how to interpret common scenarios:
High CPU Utilization (>80%)
- Immediate action: Check if pods are properly distributed using kubectl get pods -o wide
- Investigation: Identify CPU-intensive pods with kubectl top pods --all-namespaces
- Resolution: Consider horizontal pod autoscaling or adding nodes
High Memory Utilization (>85%)
- Risk: Memory pressure can trigger pod evictions
- Check: Review memory requests and limits for all pods
- Action: Adjust resource specifications or scale your cluster
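To find the pods driving memory usage, `kubectl top pods --all-namespaces` output can be filtered the same way. A minimal sketch, where the helper name and the 500Mi threshold are illustrative and Mi-formatted values are assumed; sample data is piped in so it runs without a cluster:

```shell
# Hypothetical helper: print namespace/pod for pods above a memory threshold (Mi),
# reading `kubectl top pods --all-namespaces` output on stdin.
heavy_pods() {
  awk -v limit="$1" 'NR > 1 { gsub(/Mi/, "", $4); if ($4 + 0 > limit) print $1 "/" $2 }'
}

# With a live cluster: kubectl top pods --all-namespaces | heavy_pods 500
printf 'NAMESPACE NAME CPU(cores) MEMORY(bytes)\ndefault api-7f9c 250m 812Mi\nkube-system coredns-abc 5m 70Mi\n' \
  | heavy_pods 500
```

With the sample data, only default/api-7f9c (812Mi) is printed.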
For teams struggling with resource management, our Kubernetes production support can help optimize your cluster configuration.
Best Practices for Node Monitoring
Implementing effective node monitoring requires following these proven practices:
Set Up Alerts
Configure alerts for critical thresholds:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: node-alerts
spec:
  groups:
  - name: node
    rules:
    - alert: NodeMemoryPressure
      expr: (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) > 0.85
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Node memory usage above 85%"
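It's worth sanity-checking what the alert expression actually computes before deploying it. A sketch with assumed sample values (3 GiB available of 16 GiB total):

```shell
# Sanity-check the alert expression with sample values:
# 3 GiB available of 16 GiB total -> utilization 0.8125, below the 0.85 threshold.
avail=$((3 * 1024 * 1024 * 1024))
total=$((16 * 1024 * 1024 * 1024))
awk -v a="$avail" -v t="$total" 'BEGIN { printf "%.4f\n", 1 - a / t }'   # -> 0.8125
```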
Monitor Trends Over Time
Spot checks are useful, but trend analysis prevents issues before they impact users. Historical data helps with:
- Capacity planning decisions
- Identifying gradual resource leaks
- Optimizing cost by rightsizing nodes
Correlate with Application Metrics
Node metrics alone don’t tell the complete story. Combine them with application-level metrics to understand:
- Which applications consume the most resources
- Whether resource usage correlates with traffic patterns
- If resource limits are appropriately configured
Learn more about comprehensive monitoring in our article on observability best practices.
Troubleshooting Common Issues
Metrics Server Not Available
If kubectl top fails with “Metrics API not available”:
- Verify metrics-server is running: kubectl get pods -n kube-system | grep metrics-server
- Check the logs: kubectl logs -n kube-system deployment/metrics-server
- Ensure kubelet ports are accessible (port 10250 for metrics)
Inaccurate or Missing Metrics
When metrics seem incorrect:
- Verify node clock synchronization (NTP)
- Check that the kubelet is running: systemctl status kubelet
- Review kubelet configuration for metric collection settings
Performance Impact of Monitoring
Excessive monitoring can consume resources. Balance observability with overhead:
- Use appropriate scrape intervals (30-60 seconds for most cases)
- Limit metric retention periods
- Aggregate metrics at the edge before sending to central systems
The CNCF’s monitoring guidance offers additional recommendations for production environments.
Advanced Monitoring Scenarios
Multi-Cluster Monitoring
Managing multiple clusters requires centralized monitoring:
# Work in the monitoring namespace of the current context
kubectl config set-context --current --namespace=monitoring
# Deploy Thanos Query to aggregate metrics across clusters
# (thanos-query-deployment.yaml is your own manifest)
kubectl apply -f thanos-query-deployment.yaml
For multi-cluster setups, explore our Thanos support services for scalable monitoring architecture.
Custom Resource Metrics
Extend monitoring beyond CPU and memory:
apiVersion: v1
kind: Service
metadata:
  name: custom-metrics
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
This enables tracking of disk I/O, network bandwidth, or application-specific metrics.
Automated Remediation
Combine monitoring with automation for self-healing clusters:
# Example: auto-cordon a node when its CPU utilization exceeds 90%
# (NR==2 selects the data row of `kubectl top node`; $3 is the CPU% column)
cpu_pct=$(kubectl top node node-1 | awk 'NR==2 {print $3}' | sed 's/%//')
if [ "$cpu_pct" -gt 90 ]; then
  kubectl cordon node-1
  kubectl drain node-1 --ignore-daemonsets
fi
For sophisticated automation strategies, review our DevOps consulting services.
Cost Optimization Through Monitoring
Effective monitoring directly impacts your cloud spending. By analyzing node utilization patterns:
- Identify overprovisioned nodes: Nodes consistently below 40% utilization can be downsized
- Optimize pod placement: Use node affinity and taints to maximize density
- Implement autoscaling: Scale nodes based on actual demand rather than peak capacity
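A quick way to spot cluster-wide overprovisioning is to average the CPU% column across nodes. A sketch (the `avg_cpu_pct` helper name is illustrative); sample data is piped in so it runs without a cluster:

```shell
# Average the CPU% column of `kubectl top nodes` output.
avg_cpu_pct() {
  awk 'NR > 1 { gsub(/%/, "", $3); sum += $3; n++ } END { if (n) printf "%.1f\n", sum / n }'
}

# With a live cluster: kubectl top nodes | avg_cpu_pct
printf 'NAME CPU(cores) CPU%% MEMORY(bytes) MEMORY%%\nnode-1 850m 42%% 3200Mi 40%%\nnode-2 1200m 60%% 4800Mi 60%%\nnode-3 450m 22%% 2100Mi 26%%\n' \
  | avg_cpu_pct
```

With the sample data, the average works out to 41.3%, close to the 40% downsizing guideline above.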
Our Kubernetes cost optimization guide provides detailed strategies for reducing cluster expenses while maintaining performance.
Conclusion
Monitoring node CPU and memory utilization in Kubernetes is straightforward with the right tools and approaches. Start with kubectl top for quick checks, implement Metrics Server for API access, and graduate to Prometheus for production-grade observability. Regular monitoring prevents resource exhaustion, enables proactive capacity planning, and helps maintain optimal cluster performance.
Remember that monitoring is not a one-time setup but an ongoing practice. As your applications evolve and traffic patterns change, continuously refine your monitoring strategy to maintain visibility into your cluster’s health. Whether you’re managing a small development cluster or a large-scale production environment, these techniques provide the foundation for reliable Kubernetes operations.
For teams looking to enhance their Kubernetes operations with expert guidance, our Kubernetes consulting team can help implement robust monitoring and optimization strategies tailored to your specific needs.