Why SaaS Companies Overspend on Cloud by 30-50%
After working with hundreds of SaaS companies across every stage from Series A to pre-IPO, I can state with confidence that the vast majority are overspending on cloud infrastructure by 30-50%. This is not a controversial claim among cloud economists -- it is a well-documented pattern supported by data from every major cloud cost management platform.
What makes this particularly consequential for PE investors is that cloud infrastructure is typically the largest component of Cost of Goods Sold for SaaS companies. Every dollar of cloud waste directly erodes gross margin -- the metric that most heavily influences SaaS valuation multiples.
Here are the five most common patterns I see, why growth-stage SaaS companies are especially vulnerable, and what it means for your portfolio.
Pattern 1: Development and Test Environments Running 24/7
This is the single most prevalent source of cloud waste in SaaS companies, and it is staggeringly simple to fix.
A typical SaaS company maintains multiple non-production environments: development, staging, QA, demo, performance testing, and sometimes individual developer sandboxes. These environments are engineered to mirror production -- same instance types, same database configurations, same supporting services -- so that testing accurately reflects production behavior.
The problem: these environments run 24 hours a day, 7 days a week, even though they are actively used only during business hours -- roughly 50 hours out of 168 in a week.
For a SaaS company with $150K/month in cloud spend, non-production environments typically represent 30-40% of the total -- $45K-$60K per month. Of that, 65-70% is waste from running outside active hours. That is $29K-$42K per month in pure waste -- $350K-$500K annually -- from environments that serve no purpose on evenings and weekends.
The fix is straightforward: implement automated scheduling using AWS Instance Scheduler, Lambda functions, or third-party tools. Most companies can implement this within 2-3 weeks. The only complexity is ensuring that scheduled shutdowns do not interfere with automated test suites or batch jobs that may run overnight.
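As a minimal sketch of the Lambda approach -- assuming non-production instances carry an Environment tag, a convention you would adapt to your own tagging scheme -- the shutdown half looks like this:

```python
# Minimal sketch: stop tagged non-production EC2 instances on a schedule.
# Assumes an Environment tag with values like "dev" or "staging" -- adjust
# to your own tagging scheme. Trigger with an EventBridge rule such as
# cron(0 19 ? * MON-FRI *), and pair it with a mirror-image start
# function on a morning schedule.
import boto3

ec2 = boto3.client("ec2")

NON_PROD_VALUES = ["dev", "staging", "qa", "demo"]

def lambda_handler(event, context):
    # Find running instances whose Environment tag marks them as non-prod.
    reservations = ec2.describe_instances(
        Filters=[
            {"Name": "tag:Environment", "Values": NON_PROD_VALUES},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )["Reservations"]

    instance_ids = [
        inst["InstanceId"]
        for res in reservations
        for inst in res["Instances"]
    ]

    if instance_ids:
        ec2.stop_instances(InstanceIds=instance_ids)
    return {"stopped": instance_ids}
```

Hosts that legitimately work overnight -- CI runners, batch jobs -- can be exempted with a dedicated tag rather than excluded by hand.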
Pattern 2: No Right-Sizing Culture
Cloud instances are provisioned at creation time and almost never re-evaluated. The engineer who spins up the initial infrastructure makes a sizing decision based on their best guess -- and invariably guesses high, because the cost of under-provisioning (outages, poor performance) is visible and immediate, while the cost of over-provisioning (higher monthly bills) is diffuse and invisible.
Over time, this bias compounds. A company that started with a handful of over-provisioned instances two years ago now has hundreds of them, each provisioned with 2-4x the capacity it actually uses. The aggregate waste is substantial.
Why it persists: In most SaaS companies, there is no organizational feedback loop between cloud consumption and the engineers making provisioning decisions. Engineers do not see the cost of the resources they provision. They are measured on feature velocity, uptime, and performance -- not on infrastructure efficiency. Without visibility and accountability, there is no incentive to right-size.
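One way to build that feedback loop without buying a platform is a scheduled spend-per-team report. Here is a minimal sketch against the Cost Explorer API, assuming resources carry a team tag that has been activated as a cost-allocation tag (the tag key, and where you deliver the report, are your choices):

```python
# Sketch: month-to-date spend grouped by a "team" cost-allocation tag,
# ready to post to Slack or email so engineers see what they provision.
# Assumes "team" has been activated as a cost-allocation tag in billing.
import boto3
from datetime import date

ce = boto3.client("ce")  # Cost Explorer

today = date.today()
result = ce.get_cost_and_usage(
    TimePeriod={
        "Start": today.replace(day=1).isoformat(),
        "End": today.isoformat(),  # End is exclusive
    },
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "TAG", "Key": "team"}],
)

for group in result["ResultsByTime"][0]["Groups"]:
    team = group["Keys"][0]  # e.g. "team$payments"
    cost = float(group["Metrics"]["UnblendedCost"]["Amount"])
    print(f"{team}: ${cost:,.2f}")
```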
The gross margin impact: I audited a growth-stage SaaS company with $8M ARR and $180K/month in cloud spend. Their average EC2 instance CPU utilization was 11%. After right-sizing -- moving instances to appropriate sizes based on actual utilization -- we cut their spend by 38%: $68K/month, or $816K annually, flowing directly to improved gross margin. Their gross margin improved from 67% to 77% -- a shift that meaningfully changed how prospective buyers valued the company.
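Finding right-sizing candidates does not require a commercial tool. Here is a rough sketch that flags running instances whose recent average CPU sits below a threshold -- the 20% cutoff and 14-day window are illustrative, not prescriptive:

```python
# Sketch: flag EC2 instances with low average CPU over the last 14 days
# as right-sizing candidates. Threshold and lookback are illustrative.
import boto3
from datetime import datetime, timedelta, timezone

ec2 = boto3.client("ec2")
cw = boto3.client("cloudwatch")

end = datetime.now(timezone.utc)
start = end - timedelta(days=14)

paginator = ec2.get_paginator("describe_instances")
for page in paginator.paginate(
    Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
):
    for reservation in page["Reservations"]:
        for inst in reservation["Instances"]:
            points = cw.get_metric_statistics(
                Namespace="AWS/EC2",
                MetricName="CPUUtilization",
                Dimensions=[
                    {"Name": "InstanceId", "Value": inst["InstanceId"]}
                ],
                StartTime=start,
                EndTime=end,
                Period=86400,  # one datapoint per day
                Statistics=["Average"],
            )["Datapoints"]
            if points:
                avg = sum(p["Average"] for p in points) / len(points)
                if avg < 20.0:  # illustrative threshold
                    print(f"{inst['InstanceId']} ({inst['InstanceType']}): "
                          f"{avg:.1f}% avg CPU")
```

AWS Compute Optimizer produces similar recommendations natively; the point is that the data already exists in every account.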
Pattern 3: Over-Provisioned Databases
If compute instances are routinely over-provisioned, databases are worse. Database sizing decisions are made conservatively because the consequences of running out of database capacity are severe: application errors, data loss, or corruption. So engineering teams pick large database instances and never revisit them.
I commonly see RDS instances running at 5-10% average CPU utilization on db.r5.2xlarge or db.r6g.4xlarge configurations. These are instances with 8-16 vCPUs and 64-128 GB of memory, costing $2,000-$6,000 per month, running workloads that could be served by instances one-third the size.
Worse, many companies run Multi-AZ configurations (which double the cost) for non-production databases where high availability is unnecessary. And they maintain read replicas that were created for a specific project or load test and never decommissioned.
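Both problems are easy to surface. A sketch that inventories Multi-AZ instances and read replicas for review -- the Environment tag convention is an assumption:

```python
# Sketch: list RDS instances that are Multi-AZ or are read replicas, so
# non-production ones can be reviewed for downgrade or decommissioning.
import boto3

rds = boto3.client("rds")

paginator = rds.get_paginator("describe_db_instances")
for page in paginator.paginate():
    for db in page["DBInstances"]:
        env = next(
            (t["Value"] for t in db.get("TagList", [])
             if t["Key"] == "Environment"),  # tag convention assumed
            "untagged",
        )
        if db["MultiAZ"]:
            print(f"Multi-AZ: {db['DBInstanceIdentifier']} (env={env})")
        if db.get("ReadReplicaSourceDBInstanceIdentifier"):
            print(f"Replica: {db['DBInstanceIdentifier']} (env={env})")
```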
A particularly expensive pattern: Some companies run Amazon Aurora with provisioned capacity sized for their peak historical load -- a load event that occurred once, months ago. Aurora Serverless v2 would automatically scale capacity to match actual demand, potentially reducing database costs by 40-60% for workloads with variable demand.
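For teams evaluating that move, the Serverless v2 capacity range is set at the cluster level; instances in the cluster then run on the db.serverless class. A sketch of the cluster-level change, with a placeholder identifier and illustrative capacity bounds:

```python
# Sketch: set an Aurora Serverless v2 capacity range on an existing
# cluster. The identifier and ACU bounds are placeholders -- size the
# range from observed load. Instances must also be moved to the
# db.serverless instance class for the scaling range to apply.
import boto3

rds = boto3.client("rds")

rds.modify_db_cluster(
    DBClusterIdentifier="my-aurora-cluster",  # placeholder
    ServerlessV2ScalingConfiguration={
        "MinCapacity": 0.5,   # ACUs held at idle
        "MaxCapacity": 16.0,  # ceiling for peak load
    },
    ApplyImmediately=True,
)
```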
Pattern 4: No Commitment-Based Pricing
AWS offers discounts of 30-60% compared to On-Demand pricing through Reserved Instances and Savings Plans. These are not complex financial instruments -- they are straightforward commitments to a certain level of usage in exchange for a lower rate.
Yet a remarkable number of SaaS companies -- including companies spending $100K+ per month on AWS -- run entirely on On-Demand pricing. The reasons are usually some combination of:
- "We are growing fast and do not want to commit to specific instance types."
- "We tried to look into it but it was confusing."
- "No one on the team has the expertise to manage a commitment portfolio."
- "We might change our architecture."
These are understandable concerns, but they are solvable. Compute Savings Plans provide flexibility across instance families and regions -- they accommodate growth and architectural changes far better than legacy Reserved Instances. And the financial impact of inaction is enormous.
A SaaS company spending $120K/month on EC2 On-Demand with stable workloads covering 70% of that spend could save approximately $25K-$30K/month by purchasing appropriate Savings Plans. That is $300K-$360K per year in savings that requires no engineering work, no architectural changes, and no operational disruption.
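You do not even have to build the analysis yourself: Cost Explorer will recommend a commitment sized from your own usage history. A sketch, with illustrative term and payment choices:

```python
# Sketch: pull AWS's own Compute Savings Plans recommendation, derived
# from the account's recent usage. Term, payment option, and lookback
# window here are illustrative choices.
import boto3

ce = boto3.client("ce")

rec = ce.get_savings_plans_purchase_recommendation(
    SavingsPlansType="COMPUTE_SP",
    TermInYears="ONE_YEAR",
    PaymentOption="NO_UPFRONT",
    LookbackPeriodInDays="THIRTY_DAYS",
)

summary = rec["SavingsPlansPurchaseRecommendation"][
    "SavingsPlansPurchaseRecommendationSummary"
]
print("Hourly commitment:", summary["HourlyCommitmentToPurchase"])
print("Est. monthly savings:", summary["EstimatedMonthlySavingsAmount"])
```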
Pattern 5: Lift-and-Shift Architecture
Many SaaS companies began their cloud journey by migrating existing on-premises applications to AWS -- the classic "lift and shift." They took their monolithic application, deployed it on EC2 instances that mirrored their physical server configurations, and called it a cloud migration.
The problem is that a lift-and-shift architecture gets the costs of cloud without the benefits. You pay On-Demand rates (higher than owning hardware over 3+ years) without gaining elasticity (the ability to scale up and down with demand), without leveraging managed services (which reduce operational overhead), and without benefiting from cloud-native pricing models.
These architectures are expensive, inflexible, and operationally burdensome. They are also common: I estimate that 40-50% of growth-stage SaaS companies have significant lift-and-shift components in their infrastructure.
The path to optimization is modernization -- containerization, managed services, serverless where appropriate -- but this requires meaningful engineering investment. Budget 6-18 months for significant architectural improvements, with cost savings materializing progressively over that timeline.
Why Growth-Stage SaaS Is Especially Vulnerable
Growth-stage SaaS companies (typically $5M-$50M ARR) are particularly prone to cloud overspending for several structural reasons:
Speed over efficiency: At this stage, the priority is shipping features and acquiring customers. Infrastructure efficiency is explicitly deprioritized -- and understandably so. But the technical debt compounds.
No dedicated infrastructure team: Companies at this stage rarely have a dedicated platform engineering or infrastructure team. Cloud infrastructure is managed part-time by application developers who lack the specialized knowledge to optimize it.
Rapid scaling without optimization: As the customer base grows, the company scales infrastructure horizontally (more instances, bigger databases) without optimizing the existing footprint. Each new customer adds incremental cloud cost at the unoptimized rate.
VC funding masks the problem: When a company has $20M in the bank from a recent funding round, no one is scrutinizing the AWS bill. The focus is on growth metrics -- ARR, NDR, logo count -- not on margin efficiency. The cloud bill is growing, but so is everything else, and it does not seem urgent.
No financial benchmarking: Without benchmarking against peers, companies do not realize their cloud cost as a percentage of revenue is 2-3x higher than optimized competitors. They assume their spending is normal.
The Gross Margin Imperative
For PE investors evaluating SaaS companies, cloud costs are fundamentally a gross margin issue. And gross margin is arguably the single most important metric in SaaS valuation.
Consider two otherwise identical SaaS companies with $20M ARR:
- Company A: 65% gross margin, with cloud COGS representing 20% of revenue
- Company B: 75% gross margin, with cloud COGS representing 10% of revenue (because they have optimized cloud spending)
A flat 10x ARR multiple would value both at $200M, but multiples are not margin-blind: growth equity investors and strategic acquirers modeling long-term margin potential award the higher-margin business a premium multiple. On identical revenue, Company B is worth significantly more -- and is the more attractive asset.
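To make the arithmetic concrete, here is the comparison under one illustrative assumption -- a hypothetical gross-profit multiple calibrated so that Company B lands at roughly 10x ARR:

```python
# Illustrative only: how a 10-point gross margin gap compounds into
# valuation. The 13.3x gross-profit multiple is a hypothetical chosen
# so that Company B prices at roughly 10x ARR.
ARR = 20_000_000

for name, gross_margin in [("Company A", 0.65), ("Company B", 0.75)]:
    gross_profit = ARR * gross_margin
    value = gross_profit * 13.3  # hypothetical gross-profit multiple
    print(f"{name}: gross profit ${gross_profit/1e6:.0f}M, "
          f"implied value ${value/1e6:.0f}M")

# Company A: $13M gross profit -> ~$173M implied value
# Company B: $15M gross profit -> ~$200M implied value (= 10x ARR)
```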
The path from Company A to Company B is exactly the cloud optimization work described in this article. It is achievable, it is measurable, and it directly impacts valuation.
Ready to evaluate cloud economics in your next deal? Book a free discovery call to discuss your specific situation.