FinOps & optimisation | Cloud | May 21, 2026

FinOps in Practice: Cutting Cloud Spend Without Killing Velocity

Rightsizing, commitments, allocation, governance — a pragmatic FinOps playbook for AWS, GCP and Azure, with concrete tools and numbers.

Cloud bills rarely explode overnight. They drift — a forgotten m5.4xlarge here, an over-provisioned GKE node pool there, a Snowflake warehouse left on X-LARGE for a nightly job that finishes in 12 minutes. By the time finance escalates, you're staring at a 30% waste ratio that the FinOps Foundation's State of FinOps 2024 report confirms is the industry norm.

Here's how we approach cost control at DCT — not as a quarterly cleanup, but as an engineering discipline.

1. Start with allocation, not optimization

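You can't optimize what you can't attribute. Before touching a single instance type, enforce a tagging contract. On AWS, that means cost-center, environment, service, owner, and data-classification, validated in CI. Stack-wide keys are propagated via Terraform default_tags; keys that vary per resource, such as data-classification, can be set on the resource itself and are merged with the defaults.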

provider "aws" {
  region = "eu-west-3"
  default_tags {
    tags = {
      cost-center = var.cost_center
      environment = var.env
      service     = var.service_name
      owner       = var.team_email
      managed-by  = "terraform"
    }
  }
}
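
The "validated in CI" half of the contract can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline we run: the function name `missing_tags` and the sample tag values are hypothetical, and the required keys are simply the five named in the contract above.

```python
# Required keys taken from the tagging contract described in the text.
REQUIRED_TAGS = ("cost-center", "environment", "service", "owner", "data-classification")


def missing_tags(tags, required=REQUIRED_TAGS):
    """Return the required tag keys that are absent or blank, in contract order."""
    return [key for key in required if not str(tags.get(key) or "").strip()]


# Hypothetical resource tags: "owner" is blank and "data-classification" is absent.
example = {"cost-center": "cc-123", "environment": "prod", "service": "api", "owner": ""}
print(missing_tags(example))  # -> ['owner', 'data-classification']
```

A CI job would call this for every taggable resource in the plan and fail the build when the returned list is non-empty.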

For untaggable resources (data transfer, NAT Gateway, S3 requests), use AWS Cost Categories or GCP Billing labels with allocation rules to split shared costs by usage proxy (e.g., VPC flow logs for egress, request counts for S3). Azure offers similar logic via Cost Management scopes and tag inheritance (GA since 2023).
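
As a rough illustration of splitting a shared cost by a usage proxy, the sketch below distributes one shared line item in proportion to each service's measured usage. The function name, the service names and the figures are assumptions made up for the example; in practice the proxy would come from flow logs or request metrics.

```python
def split_shared_cost(total_cost, usage_by_service):
    """Allocate a shared cost proportionally to a usage proxy (e.g. egress GB)."""
    total_usage = sum(usage_by_service.values())
    if total_usage <= 0:
        raise ValueError("usage proxy must sum to a positive value")
    return {
        service: total_cost * usage / total_usage
        for service, usage in usage_by_service.items()
    }


# Hypothetical month: a $900 NAT Gateway bill split by egress volume in GB.
shares = split_shared_cost(900.0, {"api": 600, "batch": 300, "web": 100})
print(shares)
```

Each service receives the same fraction of the bill as its fraction of the proxy, so the allocated amounts always add back up to the original cost.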

Rule of thumb: if more than 15% of your bill is "unallocated", your showback reports are fiction.
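
The 15% threshold is easy to monitor once billing line items carry their tags. The sketch below assumes each line item is a dict with a `cost` and an optional `tags` mapping; that shape, like the function name, is an assumption for illustration and not the schema of any provider's billing export.

```python
UNALLOCATED_THRESHOLD = 0.15  # rule of thumb from the text


def unallocated_share(line_items, key="cost-center"):
    """Fraction of total cost whose line items lack a non-empty allocation tag."""
    total = sum(item["cost"] for item in line_items)
    if total == 0:
        return 0.0
    unallocated = sum(
        item["cost"] for item in line_items if not (item.get("tags") or {}).get(key)
    )
    return unallocated / total


# Hypothetical bill: $200 out of $1,000 has no cost-center tag.
bill = [
    {"cost": 700.0, "tags": {"cost-center": "cc-1"}},
    {"cost": 200.0, "tags": {}},
    {"cost": 100.0, "tags": {"cost-center": "cc-2"}},
]
share = unallocated_share(bill)
print(f"unallocated: {share:.0%}, over threshold: {share > UNALLOCATED_THRESHOLD}")
```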

2. Rightsizing — automate the boring part

Manual rightsizing reviews don't scale. Wire up the native recommenders and treat their output as a backlog:

| Provider | Tool | What it catches |
|---|---|---|
| AWS | Compute Optimizer + Trusted Advisor | EC2, EBS, Lambda memory, ASG |
| GCP | Active Assist Recommender | GCE, GKE, Cloud SQL, IAM over-grants |
| Azure | Advisor + VM rightsizing insights | VMs, App Service, SQL DB |
| Multi-cloud | OpenCost / Kubecost | Pod-level Kubernetes attribution |

For Kubernetes, Kubecost (or its upstream OpenCost, now a CNCF incubating project) gives you per-namespace, per-deployment cost and idle metrics. Pair it with VPA in recommendation mode to size requests against actual P95 usage:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  updatePolicy:
    updateMode: "Off"  # recommendations only

Feed the recommendations into Jira or Backstage scorecards. Teams own their numbers; platform owns the tooling.
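
To sanity-check a recommendation before it becomes a ticket, the same "size against P95" idea can be reproduced from raw usage samples. The sketch below is not how VPA computes its recommendation; it is a plain nearest-rank percentile plus a headroom factor, and both the 15% default headroom and the function names are assumptions.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample covering pct% of the data."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def recommended_request(samples, pct=95, headroom=0.15):
    """Suggested resource request: the chosen percentile plus a safety margin."""
    return percentile(samples, pct) * (1 + headroom)


# Hypothetical CPU usage in millicores, one sample per scrape interval.
cpu_millicores = [120, 135, 128, 410, 140, 150, 133, 129, 145, 138]
print(percentile(cpu_millicores, 95), recommended_request(cpu_millicores))
```

With only ten samples the single spike sets the P95, which is a reminder to judge recommendations over a window long enough to contain real traffic cycles.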

3. Commitments: buy slowly, buy late

Savings Plans, Reserved Instances and Committed Use Discounts can shave 20–55%, but commitments are debt. Three principles:

  • Cover the baseline, not the peak. Look at 90-day minimum utilization, not average. Compute Savings Plans (AWS) apply across instance families and regions; prefer them over EC2 Instance Savings Plans unless the workload is stable enough to stay on one instance family in one region.
  • Stagger expirations. Don't sign three years of commitments on the same day; ladder 1-year terms quarterly to avoid lock-in cliffs.
  • Re-evaluate quarterly. Graviton migration, ARM-based Cloud Run, or a Snowflake-to-DuckDB shift can invalidate your commitment math.
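
The first two principles can be put into numbers with a short sketch. It assumes you already have hourly on-demand spend for the look-back window, commits to a fraction of the minimum (the 0.7 default sits inside the 60–80% coverage band from the checklist further down), and splits the result into equal tranches bought a quarter apart. Function names and figures are illustrative.

```python
def baseline_commitment(hourly_spend, coverage=0.7):
    """Commit to a fraction of the lowest hourly spend seen in the window."""
    if not hourly_spend:
        raise ValueError("need at least one hourly sample")
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    return min(hourly_spend) * coverage


def ladder(commitment, tranches=4):
    """Split one commitment into equal tranches, e.g. one purchase per quarter."""
    if tranches < 1:
        raise ValueError("need at least one tranche")
    return [commitment / tranches] * tranches


# Hypothetical hourly on-demand spend in dollars (stand-in for 90 days of samples).
window = [40.0, 55.0, 48.0, 90.0, 42.0]
commit = baseline_commitment(window)
print(commit, ladder(commit))
```

Because each tranche expires in a different quarter, a workload change only strands a quarter of the commitment at a time.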

GCP's Flex CUDs (introduced 2024) are now the closest equivalent to AWS Compute Savings Plans — 28% off with 1-year spend commitments, instance-family agnostic. Worth modeling before renewing rigid resource-based CUDs.
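
One way to model a spend commitment before signing is to compare its net savings against pure on-demand for a given usage profile. The sketch below makes a simplifying assumption that should be checked against the provider's billing rules: the commitment is expressed in on-demand-equivalent dollars per hour, you pay the discounted rate on it whether or not it is used, and usage above it is billed on demand. Under that model the break-even point is an average utilization of the commitment equal to one minus the discount, so 72% for a 28% discount.

```python
def net_savings(hourly_usage, commit, discount):
    """Savings versus pure on-demand for a fixed hourly spend commitment.

    hourly_usage: on-demand cost of actual usage, one value per hour.
    commit: committed on-demand-equivalent spend per hour (modeling assumption).
    discount: fractional discount on committed spend, e.g. 0.28.
    """
    on_demand_total = sum(hourly_usage)
    committed_cost = commit * (1 - discount) * len(hourly_usage)
    overflow = sum(max(usage - commit, 0.0) for usage in hourly_usage)
    return on_demand_total - (committed_cost + overflow)


def break_even_utilization(discount):
    """Average utilization of the commitment below which it loses money."""
    return 1 - discount


# Hypothetical profiles over 10 hours with a $100/hour commitment at 28% off.
print(net_savings([100.0] * 10, 100.0, 0.28))  # fully used: positive savings
print(net_savings([50.0] * 10, 100.0, 0.28))   # half used: negative savings
```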

4. Governance without bureaucracy

The failure mode of FinOps programs is a monthly steering committee that produces slides nobody reads. Replace it with engineering guardrails:

  • Budget alerts in code. AWS Budgets, GCP Budget API, Azure Budgets — provisioned via Terraform, per service, with anomaly detection enabled.
  • Policy-as-code. Use OPA/Conftest or Checkov to block PRs that provision a db.r6i.16xlarge in a dev account, a public S3 bucket, or untagged resources.
  • Showback dashboards in Grafana or the native console, refreshed daily, visible to engineers — not just to FP&A.
  • A weekly cost standup (15 minutes) for the platform team to triage anomalies. That's it.
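
To make the policy-as-code guardrail concrete, here is a minimal sketch of the kind of rule it enforces, written as plain Python over the JSON that `terraform show -json` produces for a plan. It is not a replacement for OPA or Checkov and does not use their APIs. The size suffixes, the required tag list and the function name are assumptions, and tag values that are only known after apply are not handled.

```python
import json

OVERSIZED_SUFFIXES = ("16xlarge", "24xlarge", "32xlarge")  # assumed policy
REQUIRED_TAGS = ("cost-center", "environment", "service", "owner")  # assumed policy


def find_violations(plan, environment):
    """Return human-readable findings for oversized or under-tagged resources."""
    findings = []
    for change in plan.get("resource_changes", []):
        after = (change.get("change") or {}).get("after") or {}  # None on delete
        address = change.get("address", "<unknown>")
        size = after.get("instance_class") or after.get("instance_type") or ""
        if environment == "dev" and size.endswith(OVERSIZED_SUFFIXES):
            findings.append(f"{address}: {size} is oversized for dev")
        if "tags" in after or "tags_all" in after:  # only check taggable resources
            tags = after.get("tags_all") or after.get("tags") or {}
            missing = [key for key in REQUIRED_TAGS if not tags.get(key)]
            if missing:
                findings.append(f"{address}: missing tags {', '.join(missing)}")
    return findings


# Hypothetical plan fragment; a real one would be loaded with json.load(open(path)).
plan = json.loads("""{"resource_changes": [{"address": "aws_db_instance.main",
  "change": {"actions": ["create"],
             "after": {"instance_class": "db.r6i.16xlarge", "tags_all": {"owner": "team-a"}}}}]}""")
for finding in find_violations(plan, "dev"):
    print(finding)
```

In CI the job would exit non-zero when the list is non-empty, which is what turns the policy into a blocked PR instead of a slide.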

5. The unit economics layer

The real maturity signal isn't "$X per month" — it's cost per business unit: cost per tenant, per transaction, per million tokens, per active user. This is where FinOps stops being IT cleanup and becomes a product KPI.
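
A unit cost is just allocated spend divided by a business volume, but writing it down shows why it tells a different story than the total. In the sketch below the figures are invented: the bill grows 20% month over month while cost per 1,000 transactions falls.

```python
def cost_per_unit(total_cost, volume, per=1):
    """Cost per `per` business units (tenants, transactions, active users...)."""
    if volume <= 0:
        raise ValueError("volume must be positive")
    return total_cost * per / volume


# Hypothetical months: (allocated cloud cost in dollars, transactions served).
months = {"April": (50_000, 10_000_000), "May": (60_000, 15_000_000)}
for name, (cost, transactions) in months.items():
    unit = cost_per_unit(cost, transactions, per=1000)
    print(f"{name}: ${cost:,} total, ${unit:.2f} per 1,000 transactions")
```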

For GenAI workloads specifically, track cost per million tokens, per model and per feature. We've seen teams cut LLM bills by 60% by routing 80% of traffic to smaller models (Haiku, Gemini Flash, GPT-4o-mini) and reserving frontier models for the queries that actually need them, a pattern supported by libraries like LiteLLM and RouteLLM.
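
The arithmetic behind model routing is worth making explicit. Assuming every request consumes the same number of tokens regardless of which model serves it (a simplification), the saving relative to sending everything to the frontier model is the routed share multiplied by one minus the price ratio. The sketch below uses no real price list; `price_ratio` is a hypothetical input.

```python
def routing_savings(small_share, price_ratio):
    """Fractional cost reduction versus an all-frontier baseline.

    small_share: fraction of traffic routed to the smaller model (0..1).
    price_ratio: small-model price divided by frontier-model price (0..1).
    """
    return small_share * (1 - price_ratio)


def required_price_ratio(target_savings, small_share):
    """Highest price ratio that still reaches the target savings."""
    if not 0 < small_share <= 1:
        raise ValueError("small_share must be in (0, 1]")
    return 1 - target_savings / small_share


print(f"{routing_savings(0.8, 0.25):.0%}")      # saving at a 4x price gap
print(f"{required_price_ratio(0.6, 0.8):.2f}")  # ratio needed for a 60% cut
```

Under these assumptions, a 60% cut with 80% of traffic rerouted requires the smaller model to cost at most a quarter of the frontier model per token, and no routing policy can save more than the routed share itself.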

FinOps maturity checklist

  • [ ] 95%+ of spend is tagged and allocated to a cost center
  • [ ] Rightsizing recommendations reviewed monthly, tracked as tickets
  • [ ] Commitment coverage between 60% and 80% of stable baseline
  • [ ] Budget alerts and anomaly detection on every production account
  • [ ] Policy-as-code blocks oversized or untagged resources at PR time
  • [ ] Unit cost metric defined and reported alongside reliability SLOs

Key takeaways

  • Allocation precedes optimization. Fix tagging before chasing instance types.
  • Automate rightsizing with Compute Optimizer, Active Assist or Kubecost — humans only arbitrate.
  • Commit conservatively, ladder expirations, re-evaluate every quarter.
  • Govern through code, not committees: OPA, Checkov, Terraform-managed budgets.
  • Measure unit economics, not just totals — it's the only number a CFO and a tech lead can agree on.