v0.1 · pre-launch · Design partners wanted · Apply

Kubernetes efficiency, across every cloud you run.

KubeHero is a unified cost and efficiency plane for AKS, GKE, and EKS. Find idle CPU, forgotten namespaces, and underused GPUs — then enforce a Kubernetes-native hard spend ceiling.

Stage
Pre-launch
Release
Q2 2026
License
Open core
Clouds
AKS · GKE · EKS
Deploy
Helm / SaaS
Agent
eBPF, read-only
example data · anonymized
cluster-prod-us-east-1 · 18 nodes · 162 pods · live
Right-sized · Overcommit · Wasting
node-01 67% · node-02 52% · node-03 73% · node-04 63% · node-05 74% · node-06 52%
node-07 51% · node-08 34% · node-09 31% · node-10 25% · node-11 65% · node-12 88%
node-13 35% · node-14 74% · node-15 54% · node-16 53% · node-17 62% · node-18 79%
Recoverable · 36 pods requesting more than they use · $141,670 / mo
last scan · just now

Kubernetes is a scheduler, not an economist.

It does exactly what you ask. And what most teams ask for is 6× more capacity than they actually use. Here's what that looks like at the pod, node, and cluster layer.

01·DIAG-01

Requests are fiction.

Developers set CPU/memory requests once, to avoid the 3AM page. Industry studies report real utilization at ~13% of what pods request. The other 87% is paid-for air.

02·DIAG-02

Limits are scar tissue.

That 16 vCPU limit on a service that uses 0.4 vCPU on average? Someone set it during an incident six months ago. Nobody touches it because nobody knows why it's there.
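For illustration, here is roughly what both diagnoses look like in a manifest. The CPU numbers echo the vectordb-ingress example used elsewhere on this page; the fragment itself is hypothetical, not a customer config.

resources:
  requests:
    cpu: "16"      # what the scheduler reserves, and what the bill reflects
  limits:
    cpu: "16"      # the scar-tissue limit from the last incident
# observed usage (eBPF, 7-day window): ~0.4 cores, roughly 2.5% of the request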

03·DIAG-03

GPUs are the silent killer.

A single idle A100 burns ~$32/hour. H100s are worse. Most clusters have 30–60% GPU idle time that never shows up in a dashboard until finance opens the invoice.

04·DIAG-04

The autoscaler doesn't know your budget.

Karpenter and the cluster autoscaler optimize for scheduling, not spend. A bad deploy can spawn 400 nodes before anyone notices. By the time Slack lights up, you owe $18k.

One plane for every cluster.
Every dollar accounted for.

KubeHero runs a lightweight DaemonSet on every cluster and streams compressed telemetry to a control plane you host, or that we host for you. No invasive sidecars. No re-architecting. No vendor lock-in.
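A minimal sketch of what the agent's Helm values could look like; the keys below are illustrative, not the actual chart schema:

agent:
  kind: daemonset            # one read-only collector per node
  readOnly: true             # eBPF probes observe; nothing in the cluster is mutated
  telemetry:
    compress: true           # batched and compressed before leaving the cluster
    endpoint: https://control-plane.example.com   # your control plane, or the hosted one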

01·CAP-01

eBPF-accurate telemetry.

Kernel-level pod attribution, not the 30s-averaged guesswork you get from metrics-server. Per-pod CPU, memory pressure, syscalls, I/O — at second granularity.

02·CAP-02

Unified cloud pricing.

Live EC2 + Savings Plans + Spot for EKS, committed-use for GKE, Spot VMs and Reserved Instances for AKS — one mental model, one cost-per-second number per pod.
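As a rough illustration of what "one cost-per-second number per pod" means, assuming simple vCPU-share attribution on an on-demand m5.2xlarge (8 vCPU, $0.384/hr in us-east-1); the real number also folds in Savings Plans, committed use, and Spot:

node: $0.384 / hr ÷ 3,600 ≈ $0.000107 per second
pod: cpu.request=2 → 2/8 of the node ≈ $0.0000267 per second ≈ $70 / month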

03·CAP-03

GPU- and TPU-native.

DCGM-integrated GPU telemetry, tensor core utilization, per-process VRAM, MIG slice efficiency. TPU utilization via GCP SDK. Idle A100? Flagged in 60 seconds.

04·CAP-04

Policy engine + spend ceiling.

Budget CRDs. Automated rightsizing recommendations. A circuit-breaker that evicts runaway pods, caps HPA, or quarantines node pools before a bad deploy melts the card.

kubehero — scan cluster-prod-us-east-1
$ kubehero scan --cluster prod-us-east-1 --report waste
  ↳ connecting to control plane · ok
  ↳ querying 187 nodes · 2,341 pods · 7d window
WASTE REPORT cluster-prod-us-east-1
─────────────────────────────────────────────────────
● vectordb-ingress cpu.request=16 used=0.41 $8,640/mo recoverable
● model-server-a100 gpu=8 util=12% $18,200/mo recoverable
⚠ jobs-etl-nightly limit=32cpu burst=2.1 overcommit risk: HIGH
✓ frontend-gateway cpu.request=2 used=1.6 right-sized
─────────────────────────────────────────────────────
total 47 pods flagged · $38,940/mo recoverable · run `kubehero rightsize` to apply
$
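What `kubehero rightsize` applies is, in effect, a resources patch on the workload. A hypothetical sketch for the vectordb-ingress finding above, assuming a 16 → 4 vCPU recommendation like the one shown in the alert feed later on this page:

spec:
  template:
    spec:
      containers:
        - name: vectordb-ingress
          resources:
            requests:
              cpu: "4"   # was 16; observed use ≈ 0.41 cores, so 4 still leaves generous headroom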

One pane of glass.
AKS, GKE, EKS — side by side.

The dashboard is built for operators, not dashboards-as-art. Spend rolls up from pod to cluster to fleet. Drill down until you see the exact workload wasting the money, then ship the fix — or arm the spend ceiling.

kubehero / fleet · live
Monthly spend
$608,240
+4.2% · vs previous 30d
Recoverable
$184,320
30.3% · of fleet spend
GPU idle share
41.8%
40× A100 / 32× H100 · rolling 7d mean
Policies
12 armed
0 active · spend ceiling: armed
/// clusters — 6 · sort · cost · desc
Cluster · Cloud · Region · Nodes · GPU · Cost / day · Recoverable · State
aks-westeu-prod-01 · AKS · westeurope · 142 · 8× A100 · $4,820 · $1,920 · overcommit
aks-ne-staging · AKS · northeurope · 24 · – · $480 · $110 · healthy
gke-usc1-prod · GKE · us-central1 · 88 · – · $2,140 · $380 · healthy
gke-euw4-batch · GKE · europe-west4 · 62 · 16× L4 · $1,680 · $540 · overcommit
eks-use1-prod · EKS · us-east-1 · 210 · 32× H100 · $12,940 · $5,180 · overrun
eks-usw2-dev · EKS · us-west-2 · 38 · – · $620 · $180 · healthy
fleet / aks-westeu-prod-01 · AKS · westeurope · 142 nodes · 8× A100 · live
/// node pools — 3 pools · 52 nodes · 8 GPU
system × 4
Standard_D4s_v5
CPU
28%
MEM
62%
System + addons
app-burst × 40
Standard_D16as_v5 · Spot
CPU
74%
MEM
55%
Stateless workloads
gpu-inference × 8
Standard_NC24ads_A100_v4
CPU
22%
MEM
38%
GPU
18%
A100 · inference
insight · GPU pool running at 18% mean utilization · 6 of 8 A100s idle > 4h/day · open rightsizing plan
/// top waste — 7d · $28,960 / mo
01 · model-server-a100 · $18,200 / mo
ns: ml-inference · gpu=8 util=12%
apply
02 · vectordb-ingress · $8,640 / mo
ns: retrieval · cpu.req=16 used=0.41
apply
03 · jobs-etl-nightly · overcommit risk: high
ns: data · limit=32cpu burst=2.1
review
04 · frontend-gateway · right-sized
ns: edge · cpu.req=2 used=1.6
last scan · 12s ago · kubehero rightsize --apply →
/// workflow — 004.b

Watch it work — end to end.

Connect a cloud account, stream telemetry, evaluate policies, act — all in under five minutes on a real cluster. Pause any step to read.

demo · kubehero workflow · step 01 / 04

Connect any cluster in under five minutes.

Helm install the agent, paste an OIDC role ARN, and KubeHero discovers every AWS account, Azure subscription, and GCP project in scope.

AWS accounts · 4
Azure subscriptions · 2
GCP projects · 3
Discovered clusters · 6
AWS · ✓ connected
Account · 742190-prod
region · us-east-1 · us-west-2
clusters · eks-use1-prod · eks-usw2-dev
GPU · 32× H100
Azure · ✓ connected
Subscription · 81c7…fe9b
region · westeurope · northeurope
clusters · aks-westeu-prod-01 · aks-ne-staging
GPU · 8× A100 (NC24ads_v4)
GCP · ✓ connected
Project · kubehero-prod-euw4
region · us-central1 · europe-west4
clusters · gke-usc1-prod · gke-euw4-batch
GPU · 16× L4
3 clouds · 6 clusters · 502 nodes · 12,480 pods · mTLS · OIDC · read-only by default
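A sketch of what the connect step above might look like as Helm values; the keys and the role ARN are placeholders, not the real schema:

clouds:
  aws:
    roleArn: arn:aws:iam::111122223333:role/kubehero-readonly   # assumed via OIDC, read-only
  azure:
    subscriptions: ["<subscription-id>"]
  gcp:
    projects: ["<project-id>"]
discovery:
  clusters: auto           # enumerate every EKS / AKS / GKE cluster in scope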
/// operator console — 004.d

Live telemetry, not yesterday's PDF.

Panels update every second from a real ClickHouse feed. Hover the burn-rate chart to scrub back through the window.

kubehero / operator console · fleet: prod-* · last 2h · refresh 1s
live
Fleet burn rate · USD / hr · prod clusters · 120s window
$4,481 · avg $4,480 · healthy
−120s · −60s · now
Top workload waste · $k / mo recoverable · rolling 7d
model-server-a100 · $18.0k
ns: ml-inference · rank 1
vectordb-ingress · $9.3k
ns: retrieval · rank 2
etl-nightly · $5.7k
ns: data · rank 3
frontend-gateway · $4.3k
ns: edge · rank 4
api-ingress · $3.1k
ns: edge · rank 5
GPU utilization heatmap · 8 GPUs × 48s · darker = idle, bright = loaded
gpu-01 90% · gpu-02 58% · gpu-03 53% · gpu-04 33%
gpu-05 44% · gpu-06 31% · gpu-07 35% · gpu-08 0%
idle 0–25% · light 25–55% · mid 55–85% · loaded 85%+
Alert feed · live · signed · SIEM exportable
09:14:00 · burn rate 1.3× on prod-us-east-1
09:14:01 · new recommendation · vectordb-ingress · cpu 16→4
09:14:02 · rightsizing applied · model-server-a100
09:14:03 · ceiling crossed · prod-monthly · 82% of $100k
09:14:04 · cluster discovered · gke-euw4-batch · 62 nodes
source · eBPF + DCGM · via collector DaemonSet · query = ch.pod_cost_1s · WHERE cluster ~ "prod-*"
/// spend attribution — 004.e

Follow the money, from namespace to invoice.

Ribbon thickness is $/mo. Hover a node or a flow — everything else dims so you can see exactly which team's workload is running on which cloud, and what it costs.

kubehero / spend attribution · namespace → workload → cloud · rolling 30d
total $163k · hover or click to filter
model-server-a100 · ns ml-inference · AWS · $45.0k
model-server-a100 · ns ml-inference · Azure · $37.0k
vectordb-ingress · ns retrieval · AWS · $22.0k
etl-nightly · ns data · GCP · $16.0k
vectordb-ingress · ns retrieval · GCP · $12.0k
frontend-gateway · ns edge · AWS · $8.0k
api-ingress · ns edge · Azure · $7.0k
etl-nightly · ns data · AWS · $6.0k
frontend-gateway · ns edge · GCP · $3.0k
api-ingress · ns edge · AWS · $2.0k
click a column label or bar to pin · esc to clear · source · ch.pod_cost_1d · GROUP BY ns, workload, cloud
/// live edge — 004.c

Three things legacy FinOps tools can't do.

Sub-minute telemetry, retroactive Savings Plan re-attribution, and an enforcement layer with humanArm: true. Flexera, Cloudability, and their peers are structurally incapable of any of these.

/// 001 · flexera · 24h stale · showing 2026-04-22

Live burn rate

$4,820.40 / hr
–60s · now
delta vs 1h avg · +$380
last tick · 2s ago
resolution · 1s
/// 002 · flexera · no re-attribution · SP applies forward only

Savings Plan replay

–17.8% · retroactive · 22d window
09:14:02 · 1Y compute Savings Plan committed — $960,000
09:14:03 · re-attribution started · 28.4M rows
09:17:48 · cost restated back to 2026-04-01 · –17.8%
before · $0.1872 / pod-hr
after · $0.1538 / pod-hr
/// 003 · flexera · alerts only · no enforcement layer

Ceiling policies

3 armed · 0 active
prod-monthly-ceiling · armed
kind: Budget · scope: prod-* clusters · eval 4s ago
prod-burn-rate-2x · armed
kind: CeilingPolicy · scope: prod-us-east-1 · eval 4s ago
gpu-inference-cap · standby
kind: CeilingPolicy · scope: ns:ml-inference · eval 12s ago
human-arm required · kubehero cap --arm →

Declare what you refuse to spend.
KubeHero enforces it.

Most cost tools report yesterday's damage. KubeHero lets you define a hard ceiling as a Kubernetes CRD and acts in real time when a bad deploy, a runaway cron, or a forgotten dev namespace starts to overrun the budget.

01 · Scale HPAs down to safe minimum
02 · Evict non-SLO workloads
03 · Quarantine offending node pools
04 · Page on-call, post to Slack
apiVersion: kubehero.io/v1
kind: BudgetPolicy
spec:
  ceiling: $8400/hr                            # hard spend ceiling for the scoped clusters
  hardStop: true                               # enforce, don't just alert
  humanArm: true                               # a human must arm the policy before it can act
  escalation: [hpa, evict, quarantine, page]   # the four steps above, least disruptive first
Simulated budget breach
Demo · no real clusters harmed
Burn rate
$11,217/hr
Ceiling
$8,400/hr
Overage
+$2,817/hr
Step 01 of 02 · Arm the policy
Arming will expose the execute control. This demo will evict simulated workloads and cannot be undone mid-flight (cooldown applies).
> awaiting execution...

Free until it pays for itself.

Three ways to run KubeHero. Start free, move to Cloud when you want the hosted brain, or self-host with a commercial license when compliance demands it. No seat taxes. No surprise bills.

01·TIER-01
Free
OSS · self-hosted
$0
forever · Apache 2.0
  • eBPF agent (DaemonSet)
  • Basic dashboard & CLI
  • 3 clusters · 7-day retention
  • Community Discord
  • GitHub issues
Clone on GitHub
Recommended
02·TIER-02
Cloud
hosted control plane
$10
per node / month · first 25 nodes free
  • Everything in Free
  • Managed control plane
  • Unlimited clusters · 90-day retention
  • Slack / PagerDuty / OpsGenie integrations
  • Budget CRDs + spend ceiling
  • Email support · 24h SLA
Request access
03·TIER-03
Enterprise
self-hosted · BSL commercial
Custom
air-gap capable
  • Everything in Cloud
  • SSO (SAML, OIDC) + SCIM
  • Multi-tenant RBAC
  • Unlimited retention
  • On-prem / air-gapped deploy
  • Dedicated solutions engineer
  • 99.95% SLA
Talk to us

Onboarding design partners now.

We work directly with a small group of operators running real AKS, GKE, or EKS footprints — especially teams managing a GPU fleet. Design partners get hands-on setup, monthly roadmap input, and first-year pricing locked in.

01 · Production clusters across one or more of AKS / GKE / EKS
02 · Monthly cloud spend above $50K, or a meaningful GPU/TPU fleet
03 · A human who owns K8s cost end-to-end and can give us 30 min / week
No newsletter, no promo spam. Unsubscribe any time.