OpenCost Reference: Kubernetes Cost Monitoring — Allocation API, Cloud Billing & Grafana

OpenCost is a vendor-neutral, open source Kubernetes cost monitoring tool and a CNCF incubating project. It allocates cloud infrastructure costs to namespaces, deployments, pods, and labels in real time, answering questions such as "how much does my staging namespace cost?" or "which team is burning the most on GPU nodes?"

1. Install & Architecture

Deploy OpenCost on Kubernetes — standalone or with Prometheus

	OpenCost	Kubecost
License	Apache 2.0 (fully open)	Free tier + paid Business/Enterprise
Data retention	15 days (in-memory)	Unlimited (paid)
Multi-cluster	Manual aggregation	Built-in (paid)
Alerts	Via Prometheus rules	Built-in (paid)
Good for	Single cluster, open source, budget visibility	Enterprise, multi-cluster, compliance

# Install OpenCost with Prometheus (recommended):
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/prometheus \
  --namespace monitoring --create-namespace

helm repo add opencost https://opencost.github.io/opencost-helm-chart
helm install opencost opencost/opencost \
  --namespace opencost --create-namespace \
  --set opencost.prometheus.internal.enabled=true \
  --set opencost.prometheus.internal.namespaceName=monitoring \
  --set opencost.prometheus.internal.serviceName=prometheus-server

# Access the UI and API (port-forward blocks, so run it in its own terminal):
kubectl port-forward -n opencost svc/opencost 9003 9090
# UI:  http://localhost:9090
# API: http://localhost:9003

# API (data without the UI); quote the URL so the shell does not try to glob the "?":
curl "http://localhost:9003/allocation/compute?window=24h" | jq .
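
The same request can be built from a script. The sketch below is a minimal illustration, assuming the API is reachable on localhost:9003 through the port-forward; `allocation_url` and `fetch_allocation` are hypothetical helper names, not part of OpenCost.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:9003"  # assumption: the port-forward is running


def allocation_url(window, aggregate=None, base_url=BASE_URL, **extra):
    """Build an /allocation/compute URL with percent-encoded query parameters."""
    params = {"window": window}
    if aggregate is not None:
        params["aggregate"] = aggregate
    params.update(extra)  # any further query parameters, e.g. step="1d"
    return f"{base_url}/allocation/compute?{urlencode(params)}"


def fetch_allocation(window, aggregate=None, **extra):
    """Fetch and decode one allocation response (needs a reachable OpenCost API)."""
    with urlopen(allocation_url(window, aggregate, **extra), timeout=30) as resp:
        return json.load(resp)


# Building URLs needs no network access:
print(allocation_url("24h"))
print(allocation_url("7d", aggregate="label:team"))
```

Encoding the parameters with `urlencode` avoids the shell-quoting problems that `?`, `&`, and `:` cause in hand-written URLs.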

2. Cost Allocation API

Query costs by namespace, label, deployment, pod, and time window

# All queries via: http://localhost:9003

# Cost by namespace (last 24 hours):
curl "http://localhost:9003/allocation/compute?window=24h&aggregate=namespace" |
  jq '.data[0] | to_entries | sort_by(.value.totalCost) | reverse | .[0:10]
      | map({namespace: .key, cost: (.value.totalCost | . * 100 | round / 100)})'

# Cost by label (e.g. team label):
curl "http://localhost:9003/allocation/compute?window=7d&aggregate=label:team" | jq .

# Cost by deployment:
curl "http://localhost:9003/allocation/compute?window=24h&aggregate=deployment" | jq .

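# Filter to a specific namespace. OpenCost documents the form filter=<field>:"<value>";
# the older Kubecost-style filterNamespaces parameter may not be honored.
# curl -G with --data-urlencode takes care of the quotes and the colon:
curl -G "http://localhost:9003/allocation/compute" \
  -d window=24h -d aggregate=pod \
  --data-urlencode 'filter=namespace:"production"' | jq .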

# Time windows: 1h, 24h, 7d, 30d, or ISO date range: 2026-03-01T00:00:00Z,2026-03-14T23:59:59Z
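
For scripts that post-process the response instead of piping it through jq, the "top N by cost" query above can be sketched as follows. This assumes the response shape used by the jq examples (a `data` list whose first element maps names to allocations with a `totalCost` field); the sample values are made up for illustration.

```python
def top_costs(response, n=10):
    """Return the n most expensive entries of the first allocation set."""
    first_set = response["data"][0]
    ranked = sorted(
        first_set.items(), key=lambda item: item[1]["totalCost"], reverse=True
    )
    return [
        {"name": name, "cost": round(alloc["totalCost"], 2)}
        for name, alloc in ranked[:n]
    ]


# Illustrative response, not real cluster data.
sample = {
    "code": 200,
    "data": [
        {
            "production": {"totalCost": 12.3456},
            "staging": {"totalCost": 4.5},
            "kube-system": {"totalCost": 7.001},
        }
    ],
}

for row in top_costs(sample, n=2):
    print(row)
```

The function works the same way for any `aggregate` value, since only the keys of the allocation set change.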

# Cost breakdown per allocation (abridged; what you get back):
# {
#   "cpuCost": 0.43,          # CPU allocated (the greater of request and usage) × node CPU price
#   "ramCost": 0.12,          # memory, priced the same way
#   "pvCost": 0.05,           # persistent volume cost
#   "networkCost": 0.02,
#   "gpuCost": 0.00,
#   "totalCost": 0.62,
#   "cpuEfficiency": 0.31,    # CPU used ÷ CPU requested
#   "ramEfficiency": 0.68     # memory used ÷ memory requested
# }

# Efficiency < 0.5 means over-provisioned (good candidate for requests tuning)
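
To put a dollar figure on low efficiency, one rough heuristic is to treat the unused share of a resource's cost as idle spend. This is an assumption made for illustration, not a number OpenCost reports; `estimated_idle_cost` is a hypothetical helper.

```python
def estimated_idle_cost(cost, efficiency):
    """Rough idle spend: the share of cost that was requested but not used.

    Efficiency can exceed 1.0 when usage is above the request, so the unused
    share is clamped at zero.
    """
    return cost * max(0.0, 1.0 - efficiency)


# Figures from the sample breakdown above.
cpu_idle = estimated_idle_cost(0.43, 0.31)
mem_idle = estimated_idle_cost(0.12, 0.68)
print(f"CPU idle: ${cpu_idle:.4f}, memory idle: ${mem_idle:.4f}")
```

With the sample figures, most of the CPU cost counts as idle while most of the memory cost is actually used, so CPU requests are the better tuning target.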

3. Configure Cloud Pricing

Connect to AWS/GCP/Azure billing for accurate spot and on-demand prices

# Without cloud billing: OpenCost uses on-demand list prices from public APIs
# With cloud billing: actual prices including discounts, spot rates, committed use

# AWS Cost and Usage Report (CUR) integration:
# 1. Create a CUR report in the AWS Billing Console, delivered to an S3 bucket
# 2. Configure OpenCost. Value key names differ between chart versions, so check them with
#    `helm show values opencost/opencost` first, and prefer a Kubernetes Secret over
#    passing credentials on the command line:
helm upgrade opencost opencost/opencost \
  --namespace opencost --reuse-values \
  --set opencost.cloudCost.enabled=true \
  --set opencost.cloudCost.provider=aws \
  --set opencost.cloudCost.aws.accessKeyID=... \
  --set opencost.cloudCost.aws.secretAccessKey=... \
  --set opencost.cloudCost.aws.reportName=hourly-cost-report \
  --set opencost.cloudCost.aws.reportPrefix=cost-reports \
  --set opencost.cloudCost.aws.reportBucket=my-cur-bucket \
  --set opencost.cloudCost.aws.region=us-east-1

# GCP Billing Export (BigQuery):
# Enable billing export to BigQuery in the GCP Console, then
# (as with AWS, verify the key names against your chart version):
helm upgrade opencost opencost/opencost \
  --namespace opencost --reuse-values \
  --set opencost.cloudCost.enabled=true \
  --set opencost.cloudCost.provider=gcp \
  --set opencost.cloudCost.gcp.projectID=my-project \
  --set opencost.cloudCost.gcp.billingDataDataset=billing_data \
  --set opencost.cloudCost.gcp.billingDataTable=gcp_billing_export_v1

# Spot instance detection (automatic for AWS):
# OpenCost checks instance metadata to detect spot vs on-demand
# Spot instances show real spot price instead of on-demand list price

# Custom pricing (on-prem / bare-metal):
# Create values file:
# opencost.customPricing:
#   CPU: "0.03"     # $/vCPU-hour
#   RAM: "0.004"    # $/GiB-hour
#   storage: "0.0001"  # $/GiB-hour
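
As a worked example of how such hourly rates turn into a bill, the sketch below prices a hypothetical workload (2 vCPU, 8 GiB RAM, 100 GiB storage) with the custom prices listed above. The 730 hours per month figure is the usual average-month approximation; the workload size is invented for illustration.

```python
# Custom prices from the values file above.
PRICES = {"CPU": 0.03, "RAM": 0.004, "storage": 0.0001}  # $/vCPU-h, $/GiB-h, $/GiB-h
HOURS_PER_MONTH = 730  # average month, used for projections


def hourly_cost(vcpus, ram_gib, storage_gib, prices=PRICES):
    """Hourly cost of a workload as the sum of its resource quantities × unit prices."""
    return (
        vcpus * prices["CPU"]
        + ram_gib * prices["RAM"]
        + storage_gib * prices["storage"]
    )


hourly = hourly_cost(vcpus=2, ram_gib=8, storage_gib=100)
print(f"${hourly:.3f}/hour, ${hourly * HOURS_PER_MONTH:.2f}/month")
# → $0.102/hour, $74.46/month
```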

4. Prometheus & Grafana Integration

Export cost metrics to Prometheus + visualize in Grafana

# OpenCost exposes Prometheus metrics on :9003/metrics:
# container_cpu_allocation           - CPU cores allocated per container (gauge)
# container_memory_allocation_bytes  - memory bytes allocated per container (gauge)
# kubecost_load_balancer_cost        - hourly cost per load balancer service
# node_total_hourly_cost             - total node cost/hr
# kubecost_cluster_management_cost   - hourly control plane cost

# Prometheus scrape config (add to prometheus.yml):
scrape_configs:
  - job_name: opencost
    static_configs:
      - targets: ['opencost.opencost.svc.cluster.local:9003']
    metrics_path: /metrics

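# Useful PromQL queries:
# CPU cost per namespace over the last 24h. The allocation metric is a gauge of cores,
# so average it over the window rather than taking rate():
sum by (namespace) (
  avg_over_time(container_cpu_allocation[24h]) * on(node) group_left()
  node_cpu_hourly_cost * 24
)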
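
A query like the one above can also be run from a script through the Prometheus HTTP API (`/api/v1/query`). The sketch below only builds the request URL and decodes an instant-vector response; the Prometheus address and the sample payload are assumptions for illustration.

```python
from urllib.parse import urlencode

# Assumption: in-cluster address of the Prometheus server installed earlier.
PROMETHEUS_URL = "http://prometheus-server.monitoring.svc"


def instant_query_url(promql, prometheus_url=PROMETHEUS_URL):
    """Build a URL for the Prometheus instant-query endpoint."""
    return f"{prometheus_url}/api/v1/query?{urlencode({'query': promql})}"


def vector_by_label(response, label):
    """Map one label's value to the sample value for each series in an instant vector."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error', 'unknown error')}")
    return {
        series["metric"].get(label, ""): float(series["value"][1])
        for series in response["data"]["result"]
    }


# Illustrative payload in the documented instant-vector shape (values are strings).
sample = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"namespace": "production"}, "value": [1700000000, "3.75"]},
            {"metric": {"namespace": "staging"}, "value": [1700000000, "0.5"]},
        ],
    },
}

print(vector_by_label(sample, "namespace"))
```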

# Grafana dashboards (import from grafana.com):
# Dashboard ID 15919 — OpenCost Cost Monitoring
# Import: Grafana → Dashboards → Import → 15919

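# Alerting: namespace over budget (projected monthly CPU cost).
# In Prometheus alerting rules:
groups:
  - name: costs
    rules:
      - alert: NamespaceOverBudget
        # average allocated cores × node CPU price/hr × 730 hr/month; threshold is $50/month
        expr: |
          sum by (namespace) (
            avg_over_time(container_cpu_allocation[1h])
              * on(node) group_left() node_cpu_hourly_cost
          ) * 730 > 50
        annotations:
          summary: "Namespace {{ $labels.namespace }} is projected to exceed $50/month in CPU cost"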

5. Common Cost-Saving Patterns

Find over-provisioned workloads, idle nodes, and waste

# 1. Find over-provisioned pods (CPU efficiency < 50% and cost > $1 over the window):
curl "http://localhost:9003/allocation/compute?window=7d&aggregate=pod" |
  jq '[.data[0] | to_entries[] |
       select(.value.cpuEfficiency < 0.5 and .value.totalCost > 1) |
       {pod: .key, totalCost: .value.totalCost,
        cpuEfficiency: .value.cpuEfficiency,
        memEfficiency: .value.ramEfficiency}] |
      sort_by(.totalCost) | reverse | .[0:20]'
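
The same filter can be expressed in a script, which makes the thresholds easy to tune per team. A minimal sketch, assuming each allocation carries `totalCost` and `cpuEfficiency` as in the query above; the pod names and numbers are invented.

```python
def over_provisioned(allocation_set, max_efficiency=0.5, min_cost=1.0, limit=20):
    """List allocations with low CPU efficiency and non-trivial cost, priciest first."""
    rows = [
        {
            "pod": name,
            "totalCost": alloc["totalCost"],
            "cpuEfficiency": alloc["cpuEfficiency"],
        }
        for name, alloc in allocation_set.items()
        # Entries missing either field are skipped rather than guessed at.
        if alloc.get("cpuEfficiency", 1.0) < max_efficiency
        and alloc.get("totalCost", 0.0) > min_cost
    ]
    return sorted(rows, key=lambda row: row["totalCost"], reverse=True)[:limit]


# Illustrative allocation set, not real cluster data.
pods = {
    "api-7d9f": {"totalCost": 8.2, "cpuEfficiency": 0.12},
    "worker-55c": {"totalCost": 3.1, "cpuEfficiency": 0.81},
    "cron-1a2b": {"totalCost": 0.4, "cpuEfficiency": 0.05},
    "cache-0": {"totalCost": 5.0, "cpuEfficiency": 0.33},
}

for row in over_provisioned(pods):
    print(row)
```

The `min_cost` floor keeps cheap, short-lived pods out of the report so the list stays focused on workloads where tuning requests saves real money.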

# 2. Namespace cost trend (is it growing?); returns one value per daily step:
curl "http://localhost:9003/allocation/compute?window=30d&step=1d&aggregate=namespace" |
  jq '[.data[] | to_entries[] | select(.key == "production")] | map(.value.totalCost)'
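
To answer "is it growing?" with a number, one option is a least-squares slope over the daily totals. This is a sketch under the assumption that `data` holds one allocation set per day, as the `step=1d` query returns; the function names and sample values are hypothetical.

```python
def daily_costs(response, name):
    """Extract one totalCost per daily set; days without the entry count as 0."""
    return [
        (day or {}).get(name, {}).get("totalCost", 0.0) for day in response["data"]
    ]


def slope_per_day(values):
    """Least-squares slope of the values against their day index (dollars per day)."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


# Illustrative four-day response, not real cluster data.
sample = {
    "data": [
        {"production": {"totalCost": 10.0}},
        {"production": {"totalCost": 11.0}},
        {"production": {"totalCost": 12.0}},
        {"production": {"totalCost": 13.0}},
    ]
}

series = daily_costs(sample, "production")
print(series, slope_per_day(series))
```

A positive slope means the namespace costs more each day on average; a slope fitted over all points is less sensitive to a single noisy day than comparing the first and last values.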

# 3. Find idle nodes (OpenCost + kubectl):
# Compare actual usage with allocatable capacity; nodes with a low ratio are candidates for removal:
kubectl top nodes    # actual usage (requires metrics-server)
kubectl get nodes -o custom-columns=NAME:.metadata.name,CPU:.status.allocatable.cpu,MEM:.status.allocatable.memory
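
The comparison between the two outputs can be sketched as below. It parses only the quantity suffixes that typically appear in this output (`m` for CPU; `Ki`, `Mi`, `Gi`, `Ti` for memory), not the full Kubernetes quantity grammar, and the 20% threshold and node figures are assumptions for illustration.

```python
_BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_cpu(quantity):
    """Convert a CPU quantity such as '250m' or '4' to cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory(quantity):
    """Convert a memory quantity such as '2048Mi' or '16Gi' to bytes."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: plain bytes


def idle_nodes(nodes, threshold=0.2):
    """Names of nodes whose CPU and memory usage are both below the threshold ratio."""
    idle = []
    for node in nodes:
        cpu_ratio = parse_cpu(node["cpu_used"]) / parse_cpu(node["cpu_allocatable"])
        mem_ratio = parse_memory(node["mem_used"]) / parse_memory(node["mem_allocatable"])
        if cpu_ratio < threshold and mem_ratio < threshold:
            idle.append(node["name"])
    return idle


# Illustrative values copied by hand from the two kubectl outputs.
nodes = [
    {"name": "node-a", "cpu_used": "250m", "cpu_allocatable": "3920m",
     "mem_used": "2048Mi", "mem_allocatable": "16Gi"},
    {"name": "node-b", "cpu_used": "3100m", "cpu_allocatable": "3920m",
     "mem_used": "12Gi", "mem_allocatable": "16Gi"},
]

print(idle_nodes(nodes))
```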

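# 4. PV cost audit. The allocation API aggregates by namespace, controller, pod, label and
# similar fields rather than by volume, so rank namespaces by their persistent volume cost:
curl "http://localhost:9003/allocation/compute?window=30d&aggregate=namespace" |
  jq '.data[0] | to_entries | sort_by(.value.pvCost) | reverse | .[0:10]
      | map({namespace: .key, cost: .value.pvCost})'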

# 5. Label untagged resources for accountability:
# Add labels to namespaces + deployments:
kubectl label namespace staging team=backend cost-center=eng
# Then query: curl ".../allocation?aggregate=label:team"
