Those of you who use Google Kubernetes Engine (GKE) have likely received a notice recently about GCP adding a $0.10 per hour management fee for all GKE clusters. While this is not a significant amount of money for most deployments, it has triggered some discussion around what it really costs to run a Kubernetes cluster on the popular cloud providers.
With the recent release of our multi-cloud Kubernetes platform, we have been running many clusters on many clouds and thinking a lot about how to measure and manage costs.
In theory the set of services required to run a web app on Kubernetes is relatively simple. You will need:
Putting storage aside because it is so variable, and ultimately not that expensive, let's take a look at the "list" cost of these other three items on the various clouds.
A few notes:
We tried to select the most comparable instances across clouds, but there is some variation.
The load balancer costs are approximate because they depend on rule count, data processed, and so on, and that pricing varies by cloud.
AWS: total $6.41 per day
GCP: total $6.42 per day
* GCP will not be charging its hourly cluster management fee until June 6, 2020. If you are running a cluster before then, or you are running a cluster in only a single zone, you will not incur this fee.
Azure: total $5.79 per day
DigitalOcean: total $2.49 per day
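The GKE management fee is quoted per hour, while the totals above are per day, so it helps to convert it to the same unit. A minimal sketch of that conversion, using the $0.10 per hour figure from the GCP notice. The 30-day month is an assumption made for illustration, and the function names are hypothetical:

```python
from decimal import Decimal

# GKE cluster management fee from the GCP notice, in USD per hour.
HOURLY_FEE = Decimal("0.10")


def daily_cost(hourly: Decimal) -> Decimal:
    """Convert an hourly charge to a 24-hour daily charge."""
    return hourly * 24


def monthly_cost(hourly: Decimal, days: int = 30) -> Decimal:
    """Convert an hourly charge to a monthly charge (30-day month assumed)."""
    return daily_cost(hourly) * days


print(daily_cost(HOURLY_FEE))    # 2.40
print(monthly_cost(HOURLY_FEE))  # 72.00
```

`Decimal` is used rather than `float` so that the currency amounts print exactly.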
Most likely you will want to run larger instances, with memory being the major need for most applications. If we normalize the instance cost around available memory, the costs break down as follows:
So this all seems pretty straightforward. AWS and GCP are rather close in price, Azure is a little cheaper and DigitalOcean is a real bargain, but the story isn't quite that simple.
You can use a pricing calculator from each cloud to do the math above, but somehow that never seems to match your actual bill once you are in production. The first trick is understanding what combination of services you are using to host your applications. In the early days of cloud hosting it was quite simple, as most people hosted on AWS and you really only had to pay attention to the price of your EC2 instances and perhaps some EBS, S3, and ELB services. Running a production-grade Kubernetes cluster is a bit more complex.
To look at a concrete example, let’s see what it takes to run a default Kubernetes-based Convox Rack that’s ready to host a simple web application on AWS. In our case, Convox will automatically provision for you:
By default we use t3.small instances which have 2 vCPUs and 2GB of memory. As of this writing, the cost of running this cluster in us-east-1 is $5.92 per day with the top three charges being:
Once you start scaling or need services like RDS, these costs will change significantly, but this is a good baseline for a simple production-ready cluster.
So now let's look at the cost of running the same Convox cluster on the other clouds:
On GCP, using preemptible n1-standard-1 instances (1 vCPU and 3.75GB of memory), the cluster runs $4.13 per day, with the top three charges being:
We found that on GCP we can get away with running preemptible instances, which are much cheaper than regular instances. Non-preemptible n1-standard-1 instances would bring the total cost to $6.67 per day for the same three-instance cluster.
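To put the preemptible discount in perspective, the two daily totals quoted above can be compared directly. This is a small sketch using only those two figures; the `savings` helper is a hypothetical name, not part of any Convox or GCP tooling:

```python
# Daily cost of the same three-instance GCP cluster, as quoted in the post (USD).
PREEMPTIBLE_PER_DAY = 4.13
STANDARD_PER_DAY = 6.67


def savings(standard: float, discounted: float) -> tuple[float, float]:
    """Return (absolute saving per day, saving as a percentage of standard)."""
    diff = standard - discounted
    return diff, diff / standard * 100


abs_saving, pct_saving = savings(STANDARD_PER_DAY, PREEMPTIBLE_PER_DAY)
print(f"${abs_saving:.2f} per day ({pct_saving:.0f}% lower)")
# $2.54 per day (38% lower)
```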
For Azure, to achieve a stable and reliable cluster we found we needed to run Standard_D2_v3 instances (2 vCPU and 8GB of memory), which brings the cost of a cluster to $9.97 per day, with the top three charges being:
On DigitalOcean, using the s-2vcpu-4gb droplet (2 vCPU and 4GB of memory), the cluster runs $2.22 per day, with the top three charges being:
As you can see, the costs for equivalent clusters can vary significantly across clouds. One thing you might notice right away is that the underlying instance specs (vCPU/RAM) differ between clouds. The default instance sizes we selected are the smallest instances we were able to use while still getting reliable performance on each provider, and we run three instances per cluster by default.
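Because the default instance specs differ, one way to compare these clusters is to divide each daily total by the memory and vCPUs it buys. The sketch below does that with the figures quoted above and the default of three instances per cluster. Note that it divides the whole cluster bill (including non-instance charges), so it is only a rough comparison, and the `Cluster` class is a hypothetical helper written for this example:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    provider: str
    daily_cost: float        # total cluster cost in USD per day, from the post
    vcpus_per_node: int
    mem_gb_per_node: float
    nodes: int = 3           # Convox runs three instances per cluster by default

    @property
    def cost_per_gb(self) -> float:
        """Daily cluster cost divided by total cluster memory in GB."""
        return self.daily_cost / (self.mem_gb_per_node * self.nodes)

    @property
    def cost_per_vcpu(self) -> float:
        """Daily cluster cost divided by total cluster vCPUs."""
        return self.daily_cost / (self.vcpus_per_node * self.nodes)


CLUSTERS = [
    Cluster("AWS", 5.92, 2, 2),            # t3.small
    Cluster("GCP", 4.13, 1, 3.75),         # preemptible n1-standard-1
    Cluster("Azure", 9.97, 2, 8),          # Standard_D2_v3
    Cluster("DigitalOcean", 2.22, 2, 4),   # s-2vcpu-4gb
]

# Rank providers from cheapest to most expensive per GB of memory.
for c in sorted(CLUSTERS, key=lambda c: c.cost_per_gb):
    print(f"{c.provider:<13} ${c.cost_per_gb:.2f}/GB/day  "
          f"${c.cost_per_vcpu:.2f}/vCPU/day")
```

Ranked this way, the order changes: the Azure cluster has the highest daily total but costs less per GB of memory than the AWS cluster, because each Azure node carries four times the memory.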
This highlights one of the reasons we built our multi-cloud Racks: different workloads can have noticeably different costs across clouds depending on the specific resources or underlying services they require. If your workload is more CPU intensive than RAM intensive, or if you require large quantities of block storage, your costs could vary considerably across providers. These differences are magnified as you scale. Depending on the specific needs of your application, choosing the cost-optimized cloud provider for your requirements can mean big savings.
We have learned a number of cost management and savings strategies for each of the cloud providers, and have built several of them into our platform, but I will save those for an upcoming post.
If you would like to save yourself the headache of provisioning, managing, and cost-optimizing all this infrastructure yourself, I encourage you to check out our free and open source Convox multi-cloud Racks. With support for AWS, GCP, DigitalOcean, and Azure, we allow you to run on the cloud that is the best fit for you!