FinOps in Practice: Cutting Cloud Spend Without Killing Velocity
Rightsizing, commitments, allocation, governance — a pragmatic FinOps playbook for AWS, GCP and Azure, with concrete tools and numbers.
Cloud bills rarely explode overnight. They drift: a forgotten m5.4xlarge here, an over-provisioned GKE node pool there, a Snowflake warehouse left on X-Large for a nightly job that finishes in 12 minutes. By the time finance escalates, you're staring at a waste ratio of roughly 30%, in line with the 27–32% of self-estimated waste that Flexera's State of the Cloud surveys have reported in recent years. The FinOps Foundation's State of FinOps 2024 report, for its part, lists reducing waste as practitioners' top priority.
Here's how we approach cost control at DCT — not as a quarterly cleanup, but as an engineering discipline.
1. Start with allocation, not optimization
You can't optimize what you can't attribute. Before touching a single instance type, enforce a tagging contract. On AWS, that means cost-center, environment, service, owner, and data-classification — propagated via Terraform default_tags and validated in CI.
```hcl
provider "aws" {
  region = "eu-west-3"

  default_tags {
    tags = {
      cost-center = var.cost_center
      environment = var.env
      service     = var.service_name
      owner       = var.team_email
      managed-by  = "terraform"
    }
  }
}
```
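One way to implement the "validated in CI" half of the tagging contract is a small check over the plan JSON. The sketch below is our assumption of what such a check could look like, not a prescribed tool. It expects the structure emitted by `terraform show -json <planfile>` and reads each resource's `tags_all` attribute, which the AWS provider fills with resource tags merged with default_tags. The function name `find_untagged` is hypothetical. Resources whose `tags_all` is missing from the planned values (untaggable types, destroys, or values only known at apply time) are skipped, so treat it as a first filter rather than a guarantee.

```python
# Tag keys taken from the tagging contract described above.
REQUIRED_TAGS = {"cost-center", "environment", "service", "owner", "data-classification"}


def find_untagged(plan: dict, required: set[str] = REQUIRED_TAGS) -> dict[str, list[str]]:
    """Return {resource address: sorted missing tag keys} for a Terraform plan.

    `plan` is the parsed output of `terraform show -json <planfile>`.
    """
    violations = {}
    for rc in plan.get("resource_changes", []):
        # "after" is null for resources that are being destroyed.
        after = (rc.get("change") or {}).get("after") or {}
        if "tags_all" not in after:
            continue  # no tag attribute in the planned values: skip
        present = set(after["tags_all"] or {})
        missing = required - present
        if missing:
            violations[rc["address"]] = sorted(missing)
    return violations
```

In a pipeline, this would typically run after `terraform plan -out=tfplan` and `terraform show -json tfplan > plan.json`, with the job failing whenever the returned dictionary is non-empty.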
For untaggable resources (data transfer, NAT Gateway, S3 requests), use AWS Cost Categories or GCP Billing labels with allocation rules to split shared costs by usage proxy (e.g., VPC flow logs for egress, request counts for S3). Azure offers similar logic via Cost Management scopes and tag inheritance (GA since 2023).
Rule of thumb: if more than 15% of your bill is "unallocated", your showback reports are fiction.
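The allocation arithmetic is simple enough to sketch. The example below assumes a proportional split of a shared line item by a usage proxy and computes the unallocated share that the rule of thumb refers to. The function names and figures are illustrative, not output from any billing API.

```python
def split_shared_cost(shared_cost: float, usage_by_team: dict[str, float]) -> dict[str, float]:
    """Split a shared cost in proportion to a usage proxy (bytes, requests, ...)."""
    total = sum(usage_by_team.values())
    if total <= 0:
        raise ValueError("usage proxy is empty; cannot allocate")
    return {team: shared_cost * usage / total for team, usage in usage_by_team.items()}


def unallocated_ratio(total_bill: float, allocated: float) -> float:
    """Fraction of the bill that no cost center owns."""
    return (total_bill - allocated) / total_bill


# Hypothetical month: $1,000 of NAT Gateway charges split by egress GB per team.
nat_split = split_shared_cost(1000.0, {"checkout": 600, "search": 300, "batch": 100})

# Hypothetical bill: $50,000 total, $41,000 attributed to a cost center.
ratio = unallocated_ratio(50_000, 41_000)
```

Here `ratio` works out to 18%, above the 15% threshold, so the showback report built on this bill would not yet be trustworthy.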
2. Rightsizing — automate the boring part
Manual rightsizing reviews don't scale. Wire up the native recommenders and treat their output as a backlog:
| Provider | Tool | What it catches |
|---|---|---|
| AWS | Compute Optimizer + Trusted Advisor | EC2, EBS, Lambda memory, ASG |
| GCP | Active Assist Recommender | GCE, GKE, Cloud SQL, IAM over-grants |
| Azure | Advisor + VM rightsizing insights | VMs, App Service, SQL DB |
| Multi-cloud | OpenCost / Kubecost | Pod-level Kubernetes attribution |
For Kubernetes, Kubecost (or its upstream OpenCost, now a CNCF incubating project) gives you per-namespace, per-deployment cost and idle metrics. Pair it with VPA in recommendation mode to size requests against actual P95 usage:
```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  updatePolicy:
    updateMode: "Off"  # recommendations only
```
Feed the recommendations into Jira or Backstage scorecards. Teams own their numbers; platform owns the tooling.
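To make "size requests against actual P95 usage" concrete, here is a minimal sketch of the arithmetic. It is not the VPA recommender's algorithm (VPA maintains its own decaying usage histograms). It only illustrates the idea with a nearest-rank percentile plus a headroom factor, and the 15% default headroom is an assumption to tune per service.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of the data at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def suggest_cpu_request(usage_millicores: list[float], headroom: float = 0.15) -> int:
    """Suggest a CPU request in millicores: P95 of observed usage plus a safety margin."""
    return math.ceil(percentile(usage_millicores, 95) * (1 + headroom))
```

A pod that sits at 100m for 19 samples out of 20 and spikes to 400m once gets a suggestion of 115m. The single spike falls outside P95, which is the point of sizing on a percentile rather than on the peak.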
3. Commitments: buy slowly, buy late
Savings Plans, Reserved Instances and Committed Use Discounts can shave 20–55%, but commitments are debt. Three principles:
- Cover the baseline, not the peak. Look at 90-day minimum utilization, not average. Compute Savings Plans (AWS) are flexible across instance family/region — prefer them over EC2 Instance SP unless you have stable workloads.
- Stagger expirations. Don't sign three years of commitments on the same day; ladder 1-year terms quarterly to avoid lock-in cliffs.
- Re-evaluate quarterly. A migration to Arm instances (Graviton on AWS, Axion or Tau T2A on GCP) or a Snowflake-to-DuckDB shift can invalidate your commitment math, especially for instance-family-specific commitments.
GCP's Flexible CUDs (introduced in 2022) are the closest equivalent to AWS Compute Savings Plans: spend-based commitments that apply across machine families and regions, at 28% off for a 1-year term or 46% off for 3 years. Worth modeling before renewing rigid resource-based CUDs.
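The "cover the baseline, not the peak" rule can be checked with a few lines of arithmetic. The sketch below uses a simplified model of our own: the commitment is expressed in on-demand-equivalent dollars per hour, the committed slice is billed at the discounted rate whether used or not, and anything above it is billed on demand. Real Savings Plans and CUDs have more pricing detail, and the usage series is made up.

```python
def cost_with_commitment(hourly_on_demand: list[float], commit: float, discount: float) -> float:
    """Total cost when `commit` $/hour of on-demand-equivalent usage is pre-purchased."""
    # The committed slice is paid every hour at the discounted rate, used or not.
    committed_fee = commit * (1 - discount) * len(hourly_on_demand)
    # Usage above the commitment falls back to the on-demand rate.
    overflow = sum(max(u - commit, 0.0) for u in hourly_on_demand)
    return committed_fee + overflow


def break_even_utilization(discount: float) -> float:
    """Average utilization above which a commitment beats pure on-demand."""
    return 1 - discount


usage = [10.0, 10.0, 40.0, 10.0]   # hypothetical hourly spend with one spike
on_demand_total = sum(usage)        # 70.0 without any commitment
at_baseline = cost_with_commitment(usage, commit=min(usage), discount=0.28)
at_peak = cost_with_commitment(usage, commit=max(usage), discount=0.28)
```

Committing at the minimum costs about 58.8 against 70.0 on demand, while committing at the peak costs about 115.2, well above the uncommitted bill. With a 28% discount the commitment has to stay more than 72% utilized to pay off, which is why the 90-day minimum is a safer sizing input than the average.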
4. Governance without bureaucracy
The failure mode of FinOps programs is a monthly steering committee that produces slides nobody reads. Replace it with engineering guardrails:
- Budget alerts in code. AWS Budgets, GCP Budget API, Azure Budgets — provisioned via Terraform, per service, with anomaly detection enabled.
- Policy-as-code. Use OPA/Conftest or Checkov to block PRs that provision a db.r6i.16xlarge in a dev account, public S3 buckets, or untagged resources.
- Showback dashboards in Grafana or the native console, refreshed daily, visible to engineers, not just to FP&A.
- A weekly cost standup (15 minutes) for the platform team to triage anomalies. That's it.
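For the weekly triage, a simple statistical filter over daily cost exports is often enough to decide what deserves attention. The sketch below is a stand-in, not how the providers' managed anomaly detection works: it flags a day when its cost exceeds the trailing window's mean by a configurable number of standard deviations. The 7-day window, the threshold of 3, and the 1% floor on the deviation are all assumptions.

```python
from statistics import mean, pstdev


def find_anomalies(daily_costs: list[float], window: int = 7, threshold: float = 3.0) -> list[int]:
    """Return the indexes of days whose cost spikes above the trailing window."""
    flagged = []
    for i in range(window, len(daily_costs)):
        history = daily_costs[i - window:i]
        mu, sigma = mean(history), pstdev(history)
        # Floor sigma at 1% of the mean so a flat history does not flag tiny changes.
        sigma = max(sigma, 0.01 * mu)
        if daily_costs[i] > mu + threshold * sigma:
            flagged.append(i)
    return flagged


# Hypothetical daily spend for one service, with a spike on the eighth day.
costs = [100, 102, 98, 101, 99, 100, 103, 180, 101]
spikes = find_anomalies(costs)
```

Only the 180 spike (index 7) is flagged here. The following day is judged against a window that already contains the spike, so it passes, which is a known limitation of this kind of rolling baseline.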
5. The unit economics layer
The real maturity signal isn't "$X per month" — it's cost per business unit: cost per tenant, per transaction, per million tokens, per active user. This is where FinOps stops being IT cleanup and becomes a product KPI.
For GenAI workloads specifically, track cost per million tokens per model and per feature. We've seen teams cut LLM bills 60% by routing 80% of traffic to smaller models (Haiku, Gemini Flash, GPT-4o mini) and reserving frontier models for the queries that actually need them, a pattern formalized in libraries like LiteLLM and RouteLLM.
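The routing math is worth writing down, because it shows what price gap the savings depend on. The sketch below computes a traffic-weighted price per million tokens. The model labels and prices are hypothetical placeholders, not vendor list prices, and the model assumes every route consumes the same number of tokens per request.

```python
def blended_cost_per_million(prices: dict[str, float], traffic_share: dict[str, float]) -> float:
    """Traffic-weighted price per million tokens across routed models."""
    if abs(sum(traffic_share.values()) - 1.0) > 1e-9:
        raise ValueError("traffic shares must sum to 1")
    return sum(prices[model] * share for model, share in traffic_share.items())


def savings_vs_single_model(prices: dict[str, float], traffic_share: dict[str, float],
                            baseline_model: str) -> float:
    """Relative saving compared with sending all traffic to `baseline_model`."""
    blended = blended_cost_per_million(prices, traffic_share)
    return 1 - blended / prices[baseline_model]


# Hypothetical prices in $ per million tokens.
prices = {"frontier": 10.0, "small": 2.5}
share = {"small": 0.8, "frontier": 0.2}
saving = savings_vs_single_model(prices, share, baseline_model="frontier")
```

With the small model priced at a quarter of the frontier model, routing 80% of traffic to it yields a 60% saving. A wider price gap pushes the figure higher, so the per-model unit cost is the number to track first.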
FinOps maturity checklist
- [ ] 95%+ of spend is tagged and allocated to a cost center
- [ ] Rightsizing recommendations reviewed monthly, tracked as tickets
- [ ] Commitment coverage between 60% and 80% of stable baseline
- [ ] Budget alerts and anomaly detection on every production account
- [ ] Policy-as-code blocks oversized or untagged resources at PR time
- [ ] Unit cost metric defined and reported alongside reliability SLOs
Key takeaways
- Allocation precedes optimization. Fix tagging before chasing instance types.
- Automate rightsizing with Compute Optimizer, Active Assist or Kubecost — humans only arbitrate.
- Commit conservatively, ladder expirations, re-evaluate every quarter.
- Govern through code, not committees: OPA, Checkov, Terraform-managed budgets.
- Measure unit economics, not just totals — it's the only number a CFO and a tech lead can agree on.
