The problem
The EKS cluster was running significantly more EC2 capacity than the workloads needed. Node groups were sized for peak load assumptions that never materialised, and most pods had no resource requests set — which meant the scheduler couldn't bin-pack effectively and nodes ran at 20–30% utilisation.
What I changed
- Audited actual resource usage using `kubectl top` and CloudWatch Container Insights to get real CPU and memory baselines per workload.
- Set resource requests on all workloads. This alone changed the scheduler's bin-packing behaviour significantly.
- Resized node groups from m5.xlarge to m5.large for the majority of workloads, keeping xlarge only for the two services that genuinely needed it.
- Enabled Cluster Autoscaler with tighter scale-down thresholds so idle nodes were removed more aggressively during off-peak hours.
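The first two steps above (measure real usage, then set requests from it) can be sketched roughly as follows. This is a minimal illustration, not the exact procedure used here: the function name `recommend_request`, the 95th-percentile choice, the 20% headroom, and the sample values are all assumptions made for the example.

```python
import math


def recommend_request(samples_millicores, percentile=95, headroom=1.2, step=10):
    """Suggest a CPU request (in millicores) from observed usage samples.

    Hypothetical helper: percentile, headroom and step are illustrative defaults.
    """
    if not samples_millicores:
        raise ValueError("need at least one usage sample")
    ordered = sorted(samples_millicores)
    # Nearest-rank percentile: the smallest sample with at least
    # `percentile` percent of all samples at or below it.
    rank = max(1, math.ceil(percentile / 100 * len(ordered)))
    baseline = ordered[rank - 1]
    # Add headroom, then round up to a multiple of `step` millicores.
    return math.ceil(baseline * headroom / step) * step


# Hypothetical CPU samples for one workload, in millicores
# (for example, values noted from repeated `kubectl top pod` runs).
usage = [120, 135, 150, 140, 210, 160, 155, 145, 130, 180]
print(f'cpu: "{recommend_request(usage)}m"')
```

With these made-up samples the suggestion comes out at 260m. The same approach works for memory, although memory requests usually deserve more conservative headroom because exceeding a memory limit gets the container killed rather than throttled.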
Why it worked
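The core issue was that without resource requests, the Kubernetes scheduler has nothing to account for: a pod with no requests counts as zero CPU and zero memory when the scheduler checks whether it fits on a node, so pods end up spread loosely with no relation to real usage. Setting accurate requests lets the scheduler pack pods onto fewer nodes, which in turn allows the autoscaler to actually scale down.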
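To make the packing argument concrete: once every pod declares a request, the number of nodes a set of pods needs becomes something you can estimate. The sketch below uses first-fit-decreasing on CPU only, which is a simple stand-in heuristic and not the kube-scheduler's actual algorithm. The function name `nodes_needed`, the pod mix, and the 1900m allocatable figure (an m5.large has 2 vCPUs, minus an assumed system reservation) are illustrative assumptions.

```python
def nodes_needed(pod_requests_m, node_allocatable_m):
    """Estimate the node count for the given CPU requests (millicores)
    using first-fit-decreasing. Illustrative only: the real scheduler
    also weighs memory, affinity, taints and other constraints."""
    free = []  # remaining millicores on each node opened so far
    for request in sorted(pod_requests_m, reverse=True):
        if request > node_allocatable_m:
            raise ValueError(f"a pod requesting {request}m fits on no node")
        for i, remaining in enumerate(free):
            if request <= remaining:
                free[i] = remaining - request  # place on the first node with room
                break
        else:
            # No existing node has room, so open a new one.
            free.append(node_allocatable_m - request)
    return len(free)


# Hypothetical workload: twelve pods requesting 250m and four requesting 500m.
pods = [250] * 12 + [500] * 4
print(nodes_needed(pods, 1900))
```

Without requests there is no such number to compute, which is why neither the scheduler nor the autoscaler can tell a node that is actually needed from one that is not.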
    resources:
      requests:
        cpu: "250m"
        memory: "256Mi"
      limits:
        cpu: "500m"
        memory: "512Mi"

Result
34% reduction in EC2 spend with no change to application behaviour, capacity, or SLAs. The freed budget funded two new environments.
Reusable lesson
EKS cost problems are usually scheduling problems in disguise. Fix the resource requests first — everything else follows from that.
