Most AI-spend platforms — Finout, Vantage, the new "AI FinOps Assistant" wave — focus on showing you the bill after the agent ran: cost allocation, anomaly detection, unit economics finance can trust. A few (Helicone's rate limits, Revenium's Economic Control) add coarse runtime enforcement keyed off $ thresholds or request counts. None of them stop the wasted tool call by understanding why it would have failed.
That's the layer ThumbGate occupies. Every PreToolUse gate that fires is a Claude or GPT call your agent did not make: input tokens you didn't spend, output tokens you didn't spend, a retry loop you didn't trigger. The savings are computable, conservative, and now surfaced as a number on your CLI.
"Measurable" is the operative word. A token-spend dashboard tells finance how much got burned; it doesn't tell the CIO or the board what was averted. The thumbgate cost command prints a single conservative dollar figure backed by the gate-block count from your machine, not "what enterprises like you saved." That's the artifact that survives a 2026 budget review.
Once ThumbGate is installed and gates have been firing, this is what an operator sees:
$ thumbgate cost
💰 ThumbGate cost-savings — cumulative
──────────────────────────────────────────────────
Tool calls blocked : 247
Tool calls warned : 12
Tool calls passed : 3,401
Top blocker : no-mocked-db (138 blocks)
Tokens you did NOT spend
Input : 494K
Output : 148K
Total : 642K
Estimated $ saved : $3.95
The methodology is intentionally conservative: 2,000 input + 600 output tokens per blocked call, a Sonnet-heavy model mix (80% Sonnet 4.5, 15% Opus 4.6, 5% Haiku 4.5), and Anthropic's published prices. The goal is "you almost certainly saved at least this much," not "let's flatter ourselves." Override the mix with --mix '{"claude-sonnet-4-5":1.0}' if your stack is different.
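The arithmetic behind that estimate can be sketched in a few lines of Python. This is a minimal reconstruction, not ThumbGate's actual implementation: the per-million-token prices and the two non-Sonnet model identifiers are assumptions chosen for illustration (only claude-sonnet-4-5 appears in the --mix example), so check them against Anthropic's current price list. Under those assumptions, 247 blocked calls work out to the $3.95 shown in the sample output.

```python
# Assumed USD prices per million tokens as (input, output).
# Illustrative values only; verify against Anthropic's published pricing.
PRICES_PER_MTOK = {
    "claude-sonnet-4-5": (3.00, 15.00),
    "claude-opus-4-6": (5.00, 25.00),   # identifier assumed by analogy
    "claude-haiku-4-5": (1.00, 5.00),   # identifier assumed by analogy
}

# Default model mix stated in the methodology: 80% Sonnet, 15% Opus, 5% Haiku.
DEFAULT_MIX = {
    "claude-sonnet-4-5": 0.80,
    "claude-opus-4-6": 0.15,
    "claude-haiku-4-5": 0.05,
}

# Per-blocked-call token assumptions stated in the methodology.
INPUT_TOKENS_PER_BLOCK = 2_000
OUTPUT_TOKENS_PER_BLOCK = 600


def estimate_savings(blocked_calls, mix=DEFAULT_MIX):
    """Estimate tokens and dollars avoided for a number of blocked tool calls."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("model mix shares must sum to 1.0")

    input_tokens = blocked_calls * INPUT_TOKENS_PER_BLOCK
    output_tokens = blocked_calls * OUTPUT_TOKENS_PER_BLOCK

    # Blend the per-model prices by each model's share of the mix.
    blended_in = sum(share * PRICES_PER_MTOK[m][0] for m, share in mix.items())
    blended_out = sum(share * PRICES_PER_MTOK[m][1] for m, share in mix.items())

    dollars = (input_tokens * blended_in + output_tokens * blended_out) / 1_000_000
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "dollars": round(dollars, 2),
    }


result = estimate_savings(247)
print(result)
```

Passing a different mix, for example `estimate_savings(247, {"claude-sonnet-4-5": 1.0})`, mirrors what the --mix override does conceptually: only the blended price changes, while the token counts stay tied to the block count.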
| Capability | Reporting-layer FinOps | ThumbGate (runtime gates) |
|---|---|---|
| See what agents spent last week | ✅ | Partial (via dashboard) |
| Allocate spend to teams / features | ✅ | Per-gate breakdown via byGate |
| Stop a known-bad tool call before it hits the model | ❌ | ✅ — PreToolUse gate fires, no API call made |
| Promote a one-off failure into a permanent gate | ❌ | ✅ — feedback loop + lesson DB |
| Print conservative $ saved per day | ❌ | ✅ — thumbgate cost |
| K8s pod-level allocation, finance-grade reporting | ✅ (that's their core) | ❌ (not our layer) |
The two layers compose. ThumbGate prevents the wasted spend; a reporting FinOps tool tells finance what the remaining spend was for. Picking ThumbGate doesn't remove the need for cost visibility; it means the number that visibility reports gets smaller.
The numbers come from your gate-stats.json. Not from a marketing model, not from "what enterprises like you saved." Your machine, your gates, your blocks.
npx thumbgate init # wire the PreToolUse hook
# ...let your agent run for a few hours...
npx thumbgate cost # see what the gates were worth
Or as JSON, if you want to ship it to a dashboard:
npx thumbgate cost --json | jq .savings.dollarsSaved
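If jq isn't available on the dashboard side, the same field can be pulled out with a short Python helper. This is a sketch under assumptions: the only part of the JSON shape taken from the text is the savings.dollarsSaved path; the helper names are hypothetical, and the value is assumed to be a plain number (or numeric string) rather than a formatted currency string.

```python
import json
import subprocess


def dollars_saved(report_json: str) -> float:
    """Extract savings.dollarsSaved from the JSON text of a cost report."""
    report = json.loads(report_json)
    # Assumes the value is numeric or a plain numeric string.
    return float(report["savings"]["dollarsSaved"])


def read_report() -> str:
    """Run the CLI and return its JSON output (requires npx and thumbgate)."""
    completed = subprocess.run(
        ["npx", "thumbgate", "cost", "--json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return completed.stdout


# Example with a hand-written payload; real output likely has more fields.
sample = '{"savings": {"dollarsSaved": 3.95}}'
print(dollars_saved(sample))
```

On a machine with ThumbGate installed, `dollars_saved(read_report())` would fetch the live figure instead of the sample payload.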
The free CLI is real. The paid tier is the hosted dashboard, the org-wide rule library, and an operator, so the Agent Manager doesn't have to be one themselves.
Start the Workflow Hardening Sprint, or start Pro at $19/mo →