Spec-Driven Development for AI Agents

4 min read · For teams moving from vibe-coding to repeatable AI development

TL;DR: Spec-driven development gives agents three constitution files (mission.md, tech-stack.md, roadmap.md) plus per-feature plan, requirements, and validation docs. The spec only works if the agent cannot drift outside it. ThumbGate is the runtime enforcement layer: it blocks tool calls that violate the spec at the PreToolUse hook.

The constitution: three files that define your project

Spec-driven development replaces conversational LLM iteration with a small set of source-of-truth documents stored in the repo. The agent reads them, the team reads them, and they get updated together.

mission.md

The why. What this project is, who it serves, what the non-goals are.

tech-stack.md

Approved technical choices, deployment process, the rails the agent must stay on.

roadmap.md

Phases, planned features, current priorities. Updated as work lands.

Per-feature artifacts

Each feature gets its own dated directory (2026-05-14-feature-name/) with three files:

plan.md — numbered task groups
requirements.md — scope, decisions, context
validation.md — how the feature is considered done
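The layout above is easy to scaffold. What follows is a minimal sketch, not part of ThumbGate: the function name scaffold_feature and the template contents are assumptions made for illustration. It creates the dated directory and the three files, and leaves any file that already exists untouched.

```python
from datetime import date
from pathlib import Path
from typing import Optional

# Hypothetical starter templates; adjust to your team's conventions.
FEATURE_FILES = {
    "plan.md": "# Plan\n\n1. First task group\n",
    "requirements.md": "# Requirements\n\n## Scope\n\n## Decisions\n\n## Context\n",
    "validation.md": "# Validation\n\n- [ ] Done criterion\n",
}


def scaffold_feature(root: Path, name: str, day: Optional[date] = None) -> Path:
    """Create <root>/<YYYY-MM-DD>-<name>/ with the three per-feature files."""
    day = day or date.today()
    feature_dir = root / f"{day.isoformat()}-{name}"
    feature_dir.mkdir(parents=True, exist_ok=True)
    for filename, template in FEATURE_FILES.items():
        path = feature_dir / filename
        if not path.exists():  # never overwrite a human-edited spec file
            path.write_text(template, encoding="utf-8")
    return feature_dir
```

Calling scaffold_feature(Path("specs"), "feature-name", date(2026, 5, 14)) would produce specs/2026-05-14-feature-name/ with plan.md, requirements.md, and validation.md inside. The specs root is also an assumption; use whatever directory your repo already reserves for feature docs.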

Where spec-driven development breaks

The spec only matters if the agent stays inside it. In practice, the agent reads the constitution into context once, then drifts as the conversation grows. Context compaction evicts tech-stack.md before it evicts the last 200 lines of the chat. Long-running sessions touch files the spec never mentioned.

The hard part is not writing the spec. It is enforcing it.

Two layers: the spec, and the gate

Layer 1 — The spec

Human-authored markdown files in the repo
Read by the agent at session start
Stored as the source of truth
Updated by humans during planning

Layer 2 — The gate

ThumbGate hooks intercept tool calls before execution
Each call is checked against the spec's scope
Out-of-scope writes, destructive commands, and dependency drift are blocked
The agent gets a structured error and a path back to the spec
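To make "a structured error and a path back to the spec" concrete, here is one possible shape for such a message. This is an illustrative sketch only: the GateDecision type, its field names, and the rule id are assumptions, not ThumbGate's actual output format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class GateDecision:
    allowed: bool
    rule: str      # hypothetical rule identifier, e.g. "path-drift"
    reason: str    # human-readable explanation of the block
    spec_ref: str  # the spec file the agent should re-read


def to_agent_message(decision: GateDecision) -> str:
    """Serialize the decision as JSON so the agent can parse it reliably."""
    return json.dumps(asdict(decision), sort_keys=True)


blocked = GateDecision(
    allowed=False,
    rule="path-drift",
    reason="src/billing/invoice.ts is outside the current feature scope",
    spec_ref="2026-05-14-feature-name/requirements.md",
)
```

The spec_ref field is the "path back": it tells the agent which document to reload rather than leaving it to guess why the call failed.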

What ThumbGate actually checks against the spec

ThumbGate ingests the constitution files and uses them as the policy source for PreToolUse checks:

Scope drift: if tech-stack.md says Postgres but the agent runs npm install mongoose, the install is blocked
Path drift: if the current feature's requirements.md lists src/auth/* as in-scope, writes to src/billing/* require confirmation
Validation enforcement: if validation.md says "no merge without integration tests passing", merge tool calls are gated
Phase enforcement: if roadmap.md marks a feature as "Phase 3" and you are in Phase 1, related code paths are protected
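The first two checks can be sketched in a few lines. This is a simplified illustration under stated assumptions, not ThumbGate's implementation: the PACKAGE_TECH table, the function names, and the idea that the approved stack and scope globs have already been parsed out of tech-stack.md and requirements.md are all hypothetical.

```python
import shlex
from fnmatch import fnmatchcase

# Hypothetical mapping from npm packages to the technology they imply.
PACKAGE_TECH = {
    "mongoose": "mongodb",
    "mongodb": "mongodb",
    "pg": "postgres",
    "mysql2": "mysql",
}


def check_install(command: str, approved_tech: set[str]) -> list[str]:
    """Return packages in an `npm install` command that imply unapproved tech."""
    tokens = shlex.split(command)
    if tokens[:2] not in (["npm", "install"], ["npm", "i"]):
        return []  # not an install command, nothing to check here
    packages = [t for t in tokens[2:] if not t.startswith("-")]
    return [
        p for p in packages
        if p in PACKAGE_TECH and PACKAGE_TECH[p] not in approved_tech
    ]


def path_in_scope(path: str, scope_globs: list[str]) -> bool:
    """True if the path matches any in-scope glob from requirements.md."""
    return any(fnmatchcase(path, glob) for glob in scope_globs)
```

With an approved stack of {"postgres"}, check_install("npm install mongoose", {"postgres"}) flags mongoose, while path_in_scope("src/billing/invoice.ts", ["src/auth/*"]) is False, so that write would need confirmation. A real gate would need a far more complete package table and would have to handle other package managers.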

Why prompt rules alone do not work

Spec-driven dev usually starts with CLAUDE.md or .cursorrules referencing the constitution. Those files live inside the agent's context, where they compete with the live conversation for attention. When the context window fills up, prompt rules are the first thing to lose weight.

Pre-action checks live outside the agent. They run in a separate process at the hook boundary. The agent cannot reason its way around a closed file handle.
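A hook that runs as its own process might look like the sketch below. It assumes a hook contract in which the agent pipes the pending tool call to the hook as JSON with tool_name and tool_input fields, and treats exit code 2 as "block this call" while relaying stderr back to the agent. Check your agent's hook documentation before relying on either detail. The tool name "Bash" and the list of destructive prefixes are illustrative, and none of this is ThumbGate's own code.

```python
import json
import sys
from typing import TextIO

BLOCK_EXIT_CODE = 2  # assumption: the agent treats this exit code as a block

# Illustrative only; a real policy would be derived from the spec files.
DESTRUCTIVE_PREFIXES = ("rm -rf", "git push --force", "git reset --hard")


def evaluate(payload: dict) -> tuple[int, str]:
    """Decide on one tool call; returns (exit_code, message_for_agent)."""
    tool = payload.get("tool_name", "")
    tool_input = payload.get("tool_input", {})
    if tool == "Bash":  # assumed name of the shell tool
        command = tool_input.get("command", "").strip()
        if command.startswith(DESTRUCTIVE_PREFIXES):
            return BLOCK_EXIT_CODE, f"Blocked destructive command: {command}"
    return 0, ""


def main(stdin: TextIO = sys.stdin, stderr: TextIO = sys.stderr) -> int:
    """Read one tool call from stdin and report the decision via exit code."""
    code, message = evaluate(json.load(stdin))
    if message:
        print(message, file=stderr)
    return code
```

Saved as a script, the entry point would be sys.exit(main()). Nothing in the agent's context window can change the return value, which is the point: the decision is made by a process the agent does not control.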

Mental model: The spec is the law. ThumbGate is the bailiff.

Adoption in two steps

Write the constitution. Three files: mission.md, tech-stack.md, roadmap.md. Keep them short.
Install the gate. Run npx thumbgate init, which auto-detects the agent and wires the PreToolUse hooks, then point ThumbGate at the constitution path.

From there, the agent reads the spec into context every session, and ThumbGate enforces it on every tool call.

Stop hoping your agent reads the spec

Spec-driven development is real only when the spec is enforced. Install ThumbGate and let the gate do the policing.

$ npx thumbgate init
