Autoresearch Agent Safety for Self-Improving Coding Agents

Autoresearch-style loops can search for better code, but they need gates that enforce holdout tests and proof trails and that block reward hacking and unsafe self-improvement.

👍 Thumbs up reinforces good behavior
👎 Thumbs down blocks repeated mistakes

Why this page exists

  • Self-improving coding loops need a control plane before they promote their own wins.
  • ThumbGate turns failed experiment reviews into prevention rules and pre-action gates.
  • The sales wedge is concrete: let the agent search, but gate the evidence before it accepts a variant.

Why Autoresearch creates a new buying moment

Autoresearch-style systems run experiments, inspect results, and keep the variants that look better. That makes them powerful, but it also creates a trust gap for engineering teams.

If the loop can edit the benchmark, skip a holdout, hide a failed run, or promote without proof, the buyer needs enforcement before autonomy expands.

Where ThumbGate fits

  • Block promotion when required primary and holdout checks are missing.
  • Require commands, changed files, logs, and verification evidence before a claimed improvement lands.
  • Capture thumbs-down reviews when an experiment cheats the metric, then promote the pattern into a prevention rule.
  • Use ContextFS packs and Thompson Sampling so the gates on recurring research failures get stricter over time.
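
To make the first two bullets concrete, here is a minimal sketch of a promotion gate. It is an illustration only and not ThumbGate's actual API: the Evidence class, the check names "primary" and "holdout", and the functions promotion_blockers and may_promote are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Evidence:
    """Hypothetical proof trail attached to a claimed improvement."""
    commands: list[str] = field(default_factory=list)
    changed_files: list[str] = field(default_factory=list)
    logs: list[str] = field(default_factory=list)
    # Maps a check name to True (passed) or False (failed).
    # A name that is absent means the check was never run.
    checks: dict[str, bool] = field(default_factory=dict)


# Assumed check names; a real setup would define its own.
REQUIRED_CHECKS = ("primary", "holdout")


def promotion_blockers(evidence: Evidence) -> list[str]:
    """Return every reason the variant must not be promoted yet."""
    blockers = []
    for name in REQUIRED_CHECKS:
        if name not in evidence.checks:
            blockers.append(f"missing check: {name}")
        elif not evidence.checks[name]:
            blockers.append(f"failed check: {name}")
    required_evidence = (
        ("commands", evidence.commands),
        ("changed files", evidence.changed_files),
        ("logs", evidence.logs),
    )
    for label, items in required_evidence:
        if not items:
            blockers.append(f"missing evidence: {label}")
    return blockers


def may_promote(evidence: Evidence) -> bool:
    """Promotion is allowed only when there are no blockers at all."""
    return not promotion_blockers(evidence)
```

In this sketch a variant that passed its primary check but never ran the holdout is blocked with the reason "missing check: holdout", so a skipped check is treated the same way as a failed one rather than being silently ignored.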

Starter harnesses that make the value visible

The first pack should wrap checks buyers already understand: npm test, lint, Playwright run duration, bundle size, and CI status. Each one becomes a gate the buyer can see firing.
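
One way such a starter pack could be wired up is sketched below, assuming each check is a shell command whose exit code decides the gate. The names GateResult, run_gate, run_pack, and STARTER_PACK are hypothetical, and the "lint" entry assumes the project defines an npm script called lint; none of this is taken from ThumbGate itself.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class GateResult:
    name: str
    passed: bool
    output: str  # Captured stdout and stderr, kept as evidence.


def run_gate(name: str, command: list[str], timeout_s: float = 600) -> GateResult:
    """Run one command; the gate passes only when it exits with code 0."""
    try:
        proc = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_s
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # A missing binary or a timeout counts as a failed gate, not a skip.
        return GateResult(name, False, f"{type(exc).__name__}: {exc}")
    return GateResult(name, proc.returncode == 0, proc.stdout + proc.stderr)


# Hypothetical starter pack; the commands depend on the project.
STARTER_PACK = {
    "unit-tests": ["npm", "test"],
    "lint": ["npm", "run", "lint"],  # Assumes a "lint" script in package.json.
}


def run_pack(pack: dict[str, list[str]]) -> list[GateResult]:
    """Run every gate in the pack and return results in declaration order."""
    return [run_gate(name, command) for name, command in pack.items()]
```

Duration and bundle-size gates would need a threshold comparison on top of the exit code. They are left out here because the page does not say how those limits are set.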

FAQ

Why do Autoresearch-style agents need gates?

A self-improving loop can optimize the wrong signal, skip holdout tests, or promote a cherry-picked run. ThumbGate blocks known-bad promotion patterns before the agent accepts the variant.

What does ThumbGate add to an Autoresearch loop?

ThumbGate adds structured thumbs-up/down feedback, prevention rules, Thompson Sampling, ContextFS proof packs, and pre-action gates for risky experiment and promotion steps.
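
The page does not describe how ThumbGate applies Thompson Sampling, so the following is only a sketch of one plausible reading: each prevention rule keeps a Beta posterior over how often its pattern turns out to be a real failure, thumbs-down reviews raise that estimate, and the gate blocks when a draw from the posterior exceeds a threshold. RuleBelief, record, and should_block are hypothetical names, and the 0.5 threshold is an assumption.

```python
import random


class RuleBelief:
    """Beta posterior over the chance that a flagged pattern is a real failure."""

    def __init__(self) -> None:
        self.alpha = 1.0  # Uniform prior plus thumbs-down count (pattern was bad).
        self.beta = 1.0   # Uniform prior plus thumbs-up count (pattern was fine).

    def record(self, thumbs_down: bool) -> None:
        """Fold one review into the posterior."""
        if thumbs_down:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    def mean(self) -> float:
        """Posterior mean of the failure probability."""
        return self.alpha / (self.alpha + self.beta)

    def should_block(self, rng: random.Random, threshold: float = 0.5) -> bool:
        """Thompson-style step: sample the posterior, block if the draw is high."""
        return rng.betavariate(self.alpha, self.beta) > threshold
```

Under these assumptions a rule with no reviews blocks about half the time, while a rule that has collected many thumbs-down reviews blocks almost every time, which is one way a gate could get stricter as the same failure recurs.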