“Cloud Wisdom Weekly: for tech companies and startups” is a new blog series we’re running this fall to answer common questions our tech and startup customers ask us about how to build apps faster, smarter, and cheaper. In this installment, Google Cloud Product Manager Rachel Tsao explores how to save on compute costs with modern container platforms. 

Many tech companies and startups are built to operate under a certain degree of pressure and to manage costs and resources efficiently. These pressures have only increased with inflation, geopolitical shifts, and supply chain concerns, creating urgency for companies to find ways to preserve capital while increasing flexibility. The right approach to containers can be crucial to navigating these challenges.

In the last few years, development teams have shifted from virtual machines (VMs) to containers, drawn to the latter because they are faster, more lightweight, and easier to manage and automate. Containers also consume fewer resources than VMs because they share the host operating system rather than each running their own. Perhaps most importantly, containers enable portability, letting developers put an application and all its dependencies into a single package that can run almost anywhere.

Containers are central to an organization’s agility, and in our conversations with customers about why they choose Google Cloud, we hear frequently that services like Google Kubernetes Engine (GKE) and Cloud Run help tech companies and startups to not only go to market quickly, but also save money. In this article, we’ll explore five ways to help your business quickly and easily reduce compute costs with containers. 

5 ways to control compute costs with containers 

Whether your company is an established player that is modernizing its business or a startup building its first product, managed containerized products can help you reduce costs, optimize development, and innovate. The following tips will help you to evaluate core features you should expect of container services and include specific advice for GKE and Cloud Run.

1. Identify opportunities to reduce cluster administration 

Most companies want to dedicate resources to innovation, not infrastructure curation. If your team has existing Kubernetes knowledge or runs workloads that need specific machine types or graphics processing units (GPUs), you may be able to simplify provisioning with GKE Autopilot. GKE Autopilot provisions and manages the cluster’s underlying infrastructure, and you pay only for the workload, not for 24/7 access to the underlying node-pool compute VMs. In this way, it can reduce cluster administration while saving you money and applying hardened security best practices by default.

2. Consider serverless to maximize developer productivity 

Serverless platforms continue the theme of empowering your technical talent to focus on the most impactful work. Such platforms can promote productivity by abstracting away aspects of infrastructure creation, letting developers work on projects that drive the business while the platform provider oversees hardware and scalability, aspects of security, and more.  

For a broad range of workloads that don’t need specific machine types or GPUs, going serverless with Cloud Run is a great option for building applications, APIs, internal services, and even real-time data pipelines. Analyst research indicates that Cloud Run customers achieve faster deployments with less time spent monitoring services, freeing up productivity that lets these customers do more with fewer resources.

Designed for high scalability and with an emphasis on the portability of containers, Cloud Run also supports a wide range of stateless workloads, including jobs that run to completion. Moreover, it lets you maximize the skills of your existing team, as it does not require cluster management, a Kubernetes skillset, or prior infrastructure experience. Additionally, Cloud Run leverages the Knative spec and uses a container image as its deployment artifact, enabling an easy migration to GKE if your workload needs change.

With Cloud Run, gone are the days of infrastructure overprovisioning! The platform scales automatically with traffic, including down to zero, so your services have the capacity to meet demand but do not incur costs when there is no traffic.

3. Save with committed use discounts

Committed use discounts provide discounted pricing in exchange for committing to a minimum level of usage in a region for a specified term. If you are able to reliably predict your resource needs, for instance, you can get a 17% discount for Cloud Run (for either one year or three years), and either a 20% discount (for one year) or a 45% discount (for three years) on GKE Autopilot.
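To get a feel for what these percentages mean in practice, here is a minimal Python sketch that applies the discount rates quoted above to a baseline bill. The function name, the product keys, and the $1,000 monthly baseline are hypothetical, and the sketch assumes that all of your usage is covered by the commitment, which may not match your actual bill.

```python
# Discount rates as quoted in this article, keyed by (product, term in years).
# The product keys are illustrative labels, not official identifiers.
CUD_RATES = {
    ("cloud_run", 1): 0.17,
    ("cloud_run", 3): 0.17,
    ("gke_autopilot", 1): 0.20,
    ("gke_autopilot", 3): 0.45,
}


def committed_monthly_cost(product: str, term_years: int, on_demand_monthly: float) -> float:
    """Return the monthly cost after the committed use discount.

    Assumes every dollar of on-demand spend is covered by the commitment.
    """
    rate = CUD_RATES[(product, term_years)]
    return round(on_demand_monthly * (1 - rate), 2)


if __name__ == "__main__":
    baseline = 1000.00  # hypothetical on-demand spend in USD per month
    for (product, term) in CUD_RATES:
        cost = committed_monthly_cost(product, term, baseline)
        print(f"{product}, {term}-year term: ${cost:.2f} per month")
```

Running the same calculation against your own billing data can help you decide whether a one-year or three-year term is worth the reduced flexibility.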

4. Leverage cost management features 

Minimum and maximum instances are useful for ensuring your services are ready to receive requests but do not cause cost overages. For Google Cloud customers, best practices for cost management include building your container with Cloud Build, which offers pay-for-use pricing and can be more cost efficient than steady-state build farms.

Relatedly, if you choose to leverage serverless containers with Cloud Run, you can set minimum instances to avoid the lag (i.e., the cold start) when a new container instance is starting up from zero. Minimum instances are billed at one-tenth of the general Cloud Run cost. Likewise, if you are testing and want to avoid costs spiraling, you can set a maximum number of instances to ensure your containers do not scale beyond a certain threshold. These settings can be turned off anytime, resulting in no costs when your service is not processing traffic. To have better oversight of costs, you can also view built-in billing reports and set budget alerts on Cloud Billing. 
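The effect of these two settings can be sketched in a few lines of Python. This is a simplified model rather than how Cloud Run is implemented: the function names and the hourly rate are hypothetical, and real billing is metered per resource rather than per instance-hour. The only figure taken from this article is the one-tenth ratio for idle minimum instances.

```python
def clamp_instances(desired: int, min_instances: int, max_instances: int) -> int:
    """Bound an instance count between the minimum and maximum settings."""
    if min_instances > max_instances:
        raise ValueError("min_instances must not exceed max_instances")
    return max(min_instances, min(desired, max_instances))


def idle_minimum_cost(min_instances: int, idle_hours: float, active_hourly_rate: float) -> float:
    """Estimate the cost of keeping idle minimum instances warm.

    Uses the one-tenth ratio quoted in the article; active_hourly_rate is
    a hypothetical per-instance figure used only for illustration.
    """
    return round(min_instances * idle_hours * active_hourly_rate / 10, 2)


if __name__ == "__main__":
    # No traffic, but two instances stay warm to avoid cold starts.
    print(clamp_instances(desired=0, min_instances=2, max_instances=10))
    # A traffic spike is capped at the maximum to limit spend.
    print(clamp_instances(desired=50, min_instances=2, max_instances=10))
    # Two warm instances idling for 100 hours at a hypothetical $0.10 per hour.
    print(idle_minimum_cost(min_instances=2, idle_hours=100, active_hourly_rate=0.10))
```

The trade-off is between paying a reduced rate for warm instances and accepting cold-start latency, so it is worth estimating the idle cost before raising the minimum.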

5. Match workload needs to pricing models

GKE Autopilot is great for running highly reliable workloads thanks to its Pod-level SLA. But if you have workloads that do not need a high level of reliability (e.g., fault-tolerant batch workloads, dev/test clusters), you can leverage spot pricing to receive a discount of 60% to 91% compared to regularly priced Pods. Spot Pods run on spare Google Cloud compute capacity as long as resources are available. GKE will evict your Spot Pod with a grace period of 25 seconds during times of high resource demand, but you can automatically redeploy as soon as there is available capacity. This can result in significant savings for workloads that are a fit.
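Because the spot discount is quoted as a range, it can help to bracket the expected bill rather than rely on a single number. The Python sketch below does that with the 60% and 91% figures from this article. The function name and the $500 example spend are hypothetical, and the sketch ignores the cost of rerunning work after an eviction.

```python
from typing import Tuple


def spot_cost_range(
    regular_cost: float,
    min_discount: float = 0.60,
    max_discount: float = 0.91,
) -> Tuple[float, float]:
    """Return (lowest, highest) expected cost for a workload on Spot Pods.

    The default discounts are the bounds quoted in the article; the actual
    discount you receive may fall anywhere in between.
    """
    lowest = round(regular_cost * (1 - max_discount), 2)
    highest = round(regular_cost * (1 - min_discount), 2)
    return lowest, highest


if __name__ == "__main__":
    # Hypothetical batch workload that costs $500 per month at regular prices.
    low, high = spot_cost_range(500.00)
    print(f"Estimated monthly cost on Spot Pods: ${low:.2f} to ${high:.2f}")
```

If the top of the range still beats your current spend after accounting for occasional retries, the workload is likely a good candidate for Spot Pods.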

Innovation requires balance 

Put into practice, these tips can help you and your business to get the most out of containers while controlling management and resource costs. That said, while managing cloud costs is important, the relationship between “cloud” and “cost” is often complex. If you adopt cloud computing with saving money as your only goal, you may soon run into other challenges. Cloud services can save your business money in many ways, but they can also help you get the most value for your money. This balance between cost efficiency and absolute cost is important to keep in mind so that even in challenging economic landscapes, your tech company or startup can continue growing and innovating.

Beyond cost savings, many tech and startup companies are seeking improved business agility, which is the ability to deploy new products and features frequently and with high quality. With deployment best practices built into GKE Autopilot and Cloud Run, you can transform the way your team operates while maximizing productivity with every new deployment. 

You can learn if your existing workloads are appropriate for containers with this fit assessment and these guides for migrating to containers. For new workloads, you can leverage these guides for GKE Autopilot and Cloud Run. And for more tips on cost optimization, check out our Architecture Framework for compute, containers, and serverless.