§ SPEC-DRIVEN EVAL FOR CODING AGENTS · public release · 2026.04

From a markdown spec to a verified merge, autonomously.

Agents write faster than your team can review. You write the intent. Your coding agent writes the code. Sigil handles the rest.

Install

curl -fsSL https://runsigil.com/install.sh | sh

§ 01 · THE LOOP

Five steps from spec to verdict.

Sigil compiles your markdown spec into end-to-end test scenarios, runs them in CI against every PR (including scenarios the agent never sees), and emits a verdict gated by a trust ledger the service has to earn. You show up to audit, not to operate.

01 SPEC

you · human

Write the intent

A plain markdown spec: acceptance criteria, edge cases, invariants. One source of truth for what "done" means.

02 GENERATE

sigil

Compile to scenarios

Sigil turns the spec into end-to-end test scenarios — a visible bundle the agent iterates against, and an age-encrypted holdout it never sees.
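
As a rough illustration of the visible/holdout split described above, here is a minimal Python sketch. It is not Sigil's implementation: the `Scenario` record, the `split_scenarios` function, and the `holdout_percent` parameter are all hypothetical names chosen for this example, and the encryption step is left out entirely (the text says the real holdout is age-encrypted). The sketch only shows one way to partition scenarios deterministically, so the same scenario always lands in the same bundle.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """Hypothetical scenario record; not Sigil's actual schema."""
    scenario_id: str
    description: str


def split_scenarios(scenarios, holdout_percent=30):
    """Partition scenarios into (visible, holdout) lists.

    Assumption: membership is decided by hashing the scenario id, so the
    split is stable across runs and does not depend on list order.
    """
    visible, holdout = [], []
    for scenario in scenarios:
        digest = hashlib.sha256(scenario.scenario_id.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % 100  # value in 0..99
        if bucket < holdout_percent:
            holdout.append(scenario)
        else:
            visible.append(scenario)
    return visible, holdout


if __name__ == "__main__":
    demo = [Scenario(f"checkout-{i}", "illustrative scenario") for i in range(10)]
    visible, holdout = split_scenarios(demo)
    print(len(visible), "visible,", len(holdout), "held out")
```

Hashing the id rather than sampling at random keeps the partition reproducible, which matters if the agent iterates against the visible bundle over several runs.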

03 IMPLEMENT

agent · any

Write the code

Claude Code, Cursor, Devin, or your own. The agent writes the implementation and runs the visible scenarios locally to verify its work before pushing.

04 EVAL

sigil · ci

Run the full suite

CI runs every scenario — visible and hidden — against the PR and a baseline. Every score lands in a signed, append-only ledger.
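
To make "signed, append-only ledger" concrete, the sketch below shows one common way to build such a log: each entry carries the signature of the previous entry, so editing or removing an earlier score breaks the chain. This is an assumption about the general technique, not a description of Sigil's ledger format. The `Ledger` class and its field names are hypothetical, and the example uses a symmetric HMAC-SHA256 key for brevity where a production system would more likely use asymmetric signatures.

```python
import hashlib
import hmac
import json


class Ledger:
    """Hypothetical hash-chained, HMAC-signed score log (illustrative only)."""

    GENESIS = "0" * 64  # placeholder "previous signature" for the first entry

    def __init__(self, key: bytes):
        self._key = key
        self._entries = []

    def _sign(self, payload: dict) -> str:
        # Canonical JSON so the same payload always produces the same bytes.
        message = json.dumps(payload, sort_keys=True).encode("utf-8")
        return hmac.new(self._key, message, hashlib.sha256).hexdigest()

    def append(self, scenario_id: str, score: float) -> dict:
        """Add a score, chaining it to the previous entry's signature."""
        prev = self._entries[-1]["signature"] if self._entries else self.GENESIS
        payload = {"scenario_id": scenario_id, "score": score, "prev": prev}
        entry = dict(payload, signature=self._sign(payload))
        self._entries.append(entry)
        return dict(entry)  # return a copy so callers cannot mutate the log

    def verify(self) -> bool:
        """Recompute every signature and check that the chain is unbroken."""
        prev = self.GENESIS
        for entry in self._entries:
            payload = {k: v for k, v in entry.items() if k != "signature"}
            if payload["prev"] != prev:
                return False
            if not hmac.compare_digest(self._sign(payload), entry["signature"]):
                return False
            prev = entry["signature"]
        return True


if __name__ == "__main__":
    ledger = Ledger(key=b"demo-key-not-for-production")
    ledger.append("checkout-1", 1.0)
    ledger.append("checkout-2", 0.5)
    print("ledger intact:", ledger.verify())
```

Because each signature covers the previous one, rewriting any earlier score invalidates every later entry, which is what makes the log effectively append-only.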

05 VERDICT

sigil

Allow, review, or block

A signed verdict, gated by a trust ledger the service has to earn. At the top trust tier, the merge happens on its own.
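
The allow / review / block decision can be pictured as a small function over the eval results and the current trust tier. The sketch below is a guess at the shape of such a rule, not Sigil's actual policy: the `decide` function, the pass-rate inputs, the numeric tiers, and every threshold are hypothetical. The only behaviour taken from the text is the three verdicts, the comparison against a baseline, and the automatic merge at the top trust tier.

```python
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    REVIEW = "review"
    BLOCK = "block"


def decide(pr_pass_rate, baseline_pass_rate, trust_tier, top_tier=3):
    """Return (verdict, auto_merge) under assumed, illustrative rules.

    Assumptions (not from the source):
      - a PR that scores below the baseline is blocked;
      - any failing scenario, or a trust tier of 0, sends it to human review;
      - otherwise the PR is allowed, and merges automatically only at top_tier.
    """
    if pr_pass_rate < baseline_pass_rate:
        return Verdict.BLOCK, False
    if pr_pass_rate < 1.0 or trust_tier == 0:
        return Verdict.REVIEW, False
    return Verdict.ALLOW, trust_tier >= top_tier


if __name__ == "__main__":
    verdict, auto_merge = decide(pr_pass_rate=1.0, baseline_pass_rate=0.9, trust_tier=3)
    print(verdict.value, "auto-merge" if auto_merge else "manual merge")
```

Keeping the rule a pure function of scores and tier means the same inputs always produce the same verdict, which makes each decision easy to audit after the fact.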