According to new research, Kubernetes over-provisioning is among the leading causes of rising cloud costs.
The analysis found that in Kubernetes clusters larger than 50 CPUs, DevOps teams used only 13% of provisioned CPUs and 20% of provisioned memory, according to CAST AI's 2024 Kubernetes Cost Benchmark report.
In larger clusters with 1,000 or more CPUs, the CPU utilization rate was only marginally higher, at 17%.
However, “megaclusters” with 30,000 CPUs or more were utilized at a rate of around 44%, likely due to the “attention they receive from the large DevOps teams that manage them.”
The report attributes this over-provisioning to anxiety on the part of DevOps teams: fearing they will run out of memory, they deliberately oversize their Kubernetes clusters.
These oversized clusters largely sit “idle in the background,” driving up cloud costs, the study noted. A similar effect occurs when CPU and memory requests are set “higher than what Kubernetes applications actually require,” resulting in more wasted capacity.
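To illustrate the mechanism the report describes: Kubernetes reserves node capacity based on each container's declared requests, so requests set well above real usage translate directly into idle but billed capacity. A minimal sketch of where those values live (the deployment name, image, and figures below are hypothetical, not from the report):

```yaml
# Illustrative only: request values should come from observed usage, not guesswork.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app        # hypothetical workload
spec:
  replicas: 2
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
      - name: example-app
        image: example-app:latest
        resources:
          requests:
            cpu: "250m"      # capacity the scheduler reserves, used or not
            memory: "256Mi"
          limits:
            cpu: "500m"      # hard ceiling at runtime
            memory: "512Mi"
```

If the container typically consumes, say, 50m of CPU, the remaining 200m of every replica's request is reserved and paid for but never used.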
Rising prices compound the waste: the report's analysis of “spot instance” pricing found that costs had increased by 25% in US AWS regions.
The problem spans general Kubernetes practice, the report noted, with DevOps teams on all major platforms reporting cloud waste as a result of over-provisioning.
Cloud waste is slightly lower on Google Cloud Platform (GCP), where utilization stands at 17%, while utilization on Amazon Web Services (AWS) and Microsoft Azure sits at just 11%.
The report predicts that this trend will continue, as the gap between provisioned and requested CPUs has widened in recent years from 37% to 43%.
“As more enterprises adopt Kubernetes, cloud waste is likely to continue to grow,” CAST said.
The report was compiled from an analysis of 4,000 clusters of 50 or more CPUs running on AWS, GCP, and Microsoft Azure.
Kubernetes costs have increased steadily over the past few years.
A 2023 survey by Civo found that 10% of developers had experienced a 50% increase in annual spending, while the majority saw increases of up to 25%.
Organizations and enterprise Kubernetes users should move toward custom instance sizing, the CAST report suggests.
Many DevOps teams start by defining workloads and then scour cloud inventories for the best tools, but most large cloud providers offer an overwhelming “scale” of options.
This means that most teams simply choose options they know, which often leads to underutilization of other resources they have paid for.
However, dynamic, automated instance provisioning can ensure that clusters are continuously optimized.
It is also important for teams to set precise resource requests, which requires the ability to constantly monitor resource consumption; that visibility in turn allows them to adjust requests as necessary.
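One common way to ground requests in observed consumption (assuming the cluster has the Kubernetes Vertical Pod Autoscaler add-on installed; the workload name below is hypothetical) is to run the VPA in recommendation-only mode, which continuously measures usage and suggests request values without resizing anything:

```yaml
# Requires the Vertical Pod Autoscaler add-on to be installed in the cluster.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: example-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-app      # hypothetical workload to observe
  updatePolicy:
    updateMode: "Off"      # recommend only; never evict or resize pods
```

Running `kubectl describe vpa example-app-vpa` then surfaces recommended CPU and memory requests derived from actual usage, which teams can apply to their manifests as needed.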