§ SPEC-DRIVEN EVAL FOR CODING AGENTS · public release · 2026.04

From a markdown spec to a verified merge, autonomously.

Agents write faster than your team can review. You write the intent. Your coding agent writes the code. Sigil handles the rest.

Install
curl -fsSL https://runsigil.com/install.sh | sh
§ 01 · THE LOOP

Five steps from spec to verdict.

Sigil compiles your markdown spec into end-to-end test scenarios, runs them in CI against every PR (including scenarios the agent never sees), and emits a verdict gated by a ledger of trust the service has to earn. You show up to audit, not to operate.

  1. SPEC
    you · human
    Write the intent
    A plain markdown spec: acceptance criteria, edge cases, invariants. One source of truth for what "done" means.
  2. GENERATE
    sigil
    Compile to scenarios
    Sigil turns the spec into end-to-end test scenarios: a visible bundle the agent iterates against, and an age-encrypted holdout it never sees.
  3. IMPLEMENT
    agent · any
    Write the code
    Claude Code, Cursor, Devin, or your own. The agent writes the implementation and runs the visible scenarios locally to verify its work before pushing.
  4. EVAL
    sigil · ci
    Run the full suite
    CI runs every scenario, visible and hidden, against the PR and a baseline. Every score lands in a signed, append-only ledger.
  5. VERDICT
    sigil
    Allow, review, or block
    A signed verdict, gated by a ledger of trust the service has to earn. At the top trust tier, the merge happens on its own.
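To make steps 04 and 05 concrete, here is a minimal sketch of an append-only score ledger and a verdict gate. It is an illustration under stated assumptions, not Sigil's implementation or API: the names `Ledger`, `Entry`, `verdict`, `trust_tier`, and `top_tier` are hypothetical, the "block on any regression against the baseline" rule is assumed, and a SHA-256 hash chain stands in for the real signing scheme, which this page does not describe.

```python
import hashlib
import json
from dataclasses import dataclass

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


@dataclass(frozen=True)
class Entry:
    scenario: str          # scenario name (hypothetical field)
    hidden: bool           # True for holdout scenarios the agent never sees
    pr_score: float        # score of the PR under test
    baseline_score: float  # score of the baseline on the same scenario
    prev_hash: str         # hash of the previous entry (the chain link)
    entry_hash: str        # hash over this entry's fields plus prev_hash


def _digest(prev_hash, scenario, hidden, pr_score, baseline_score):
    # Hash a canonical JSON encoding of the entry together with its predecessor.
    payload = json.dumps([prev_hash, scenario, hidden, pr_score, baseline_score])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class Ledger:
    """Append-only list of scores; each entry commits to the one before it."""

    def __init__(self):
        self._entries = []

    def append(self, scenario, hidden, pr_score, baseline_score):
        prev = self._entries[-1].entry_hash if self._entries else GENESIS
        entry_hash = _digest(prev, scenario, hidden, pr_score, baseline_score)
        entry = Entry(scenario, hidden, pr_score, baseline_score, prev, entry_hash)
        self._entries.append(entry)
        return entry

    def entries(self):
        return tuple(self._entries)

    def verify(self):
        # Recompute the chain from the start; any edited entry breaks it.
        prev = GENESIS
        for e in self._entries:
            expected = _digest(prev, e.scenario, e.hidden, e.pr_score, e.baseline_score)
            if e.prev_hash != prev or e.entry_hash != expected:
                return False
            prev = e.entry_hash
        return True


def verdict(entries, trust_tier, top_tier=3):
    """Return "allow", "review", or "block" (assumed gating rules)."""
    # Assumption: any scenario scoring below its baseline blocks the merge.
    if any(e.pr_score < e.baseline_score for e in entries):
        return "block"
    # Only the top trust tier merges without a human in the loop.
    if trust_tier >= top_tier:
        return "allow"
    return "review"


ledger = Ledger()
ledger.append("login-happy-path", hidden=False, pr_score=1.0, baseline_score=1.0)
ledger.append("login-rate-limit", hidden=True, pr_score=1.0, baseline_score=0.5)
print(ledger.verify())                           # True
print(verdict(ledger.entries(), trust_tier=1))   # review
print(verdict(ledger.entries(), trust_tier=3))   # allow
```

The chain only makes tampering detectable. A real ledger would also sign each entry so that a reader can check who wrote it.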