aaione/loop-engineering

Governor

A governor for autonomous coding loops — speed limiter, circuit breaker, and done-as-proof. It never provides the power.

Status: enterprise-grade deterministic shell; GATE-B true-bill reconciliation remains. The deterministic runtime is implemented and tested (47 tests): the L1 verifier (secret-stripped env), the FSM, four circuit breakers (iteration / no-progress / per-run budget / global-daily ceiling), atomic STATE and stop_proof writeback, the capability gate, the contract_ref pin, runtime forbidden_paths enforcement with node_modules tamper detection, the run lock (concurrent-run refusal), the live L2 heterogeneous judge, and claude and codex makers (--maker) with pause-and-review (govern ack reject|approve). Remaining: true-bill invoice reconciliation (GATE-B). Cost is labeled true_bill/estimate/none on every proof but is not yet reconciled against vendor invoices.


What it is (and isn't)

Governor is a deterministic shell around a model core.

A coding-agent loop is an engine (it provides power). Governor is not an engine and does not compete with schedulers (/loop, Codex Automations, cron). It is the feedback controller that makes an autonomous loop safe to leave unattended: it limits speed, breaks the circuit on runaway, and turns the loop's done from a claim into a proof.

The reliability of a loop is proportional to how many lying-prone decisions you take out of the model's hands. Governor's entire design takes each unreliable decision point — when to stop, how much was spent, whether progress was real, whether it truly passed — and returns it to deterministic code.

Governor is not: another loop runner, a daemon, a VS Code extension, a SaaS, a Cursor integration, or an attempt to auto-loop tasks with no objective failure signal (those are refused and downgraded to candidate-generation + human review by design).

Architecture, in one line

Trigger → State → Maker/Checker → Verifier (L1 deterministic > L2 heterogeneous judge > L3 maker self-report, which can never stop) → Aggregator (decide) → atomic state writeback, with three circuit breakers (iteration cap / no-progress window / true-bill dollar budget) wrapping every iteration. Scheduling, state serialization, breaker tripping, and stop decisions are pure deterministic code the model may never touch.
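
As a rough illustration of how the three breakers named above could wrap an iteration, here is a minimal TypeScript sketch. Every name in it (LoopSnapshot, BreakerLimits, checkBreakers) is hypothetical and is not taken from @governor/guards. The no-progress rule used here (the last N progress hashes are identical) is also an assumption about how such a window might be evaluated.

```typescript
// Hypothetical sketch; these names are NOT the @governor/guards API.
type BreakerTrip =
  | { tripped: false }
  | { tripped: true; breaker: "iteration" | "no-progress" | "budget"; reason: string };

interface LoopSnapshot {
  iteration: number;        // iterations completed so far
  progressHashes: string[]; // one hash per completed iteration, oldest first
  spentUSD: number;         // cost accumulated so far
}

interface BreakerLimits {
  maxIterations: number;
  noProgressWindow: number; // assumed rule: trip when the last N hashes are identical (N >= 2)
  budgetUSD: number;
}

// Pure function: same snapshot and limits always give the same answer.
function checkBreakers(s: LoopSnapshot, l: BreakerLimits): BreakerTrip {
  if (s.iteration >= l.maxIterations) {
    return { tripped: true, breaker: "iteration", reason: `reached ${l.maxIterations} iterations` };
  }
  const recent = s.progressHashes.slice(-l.noProgressWindow);
  const stalled =
    l.noProgressWindow >= 2 &&
    recent.length === l.noProgressWindow &&
    recent.every((h) => h === recent[0]);
  if (stalled) {
    return { tripped: true, breaker: "no-progress", reason: `no change in ${l.noProgressWindow} iterations` };
  }
  if (s.spentUSD >= l.budgetUSD) {
    return { tripped: true, breaker: "budget", reason: `spent ${s.spentUSD} of ${l.budgetUSD} USD` };
  }
  return { tripped: false };
}
```

In a sketch like this, the runner would call checkBreakers before every iteration and halt on any trip. The limits live in deterministic code, so the model has no way to raise them.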

See ./docs/governor-final-review.md for the full architecture, and ./docs/governor-decisions.md for the decision pack. ./docs/ also holds the broader Loop-Engineering research this project grew out of.

Packages

Package                     Responsibility
@governor/contract          loop-contract.yaml schema + static validation
@governor/state             state.toml schema + Status state machine + StateStore (atomic write)
@governor/guards            Three deterministic circuit breakers (iteration / no-progress / budget)
@governor/verifier          stop_proof v1 schema, trust-gradient branded types, Verifier/Aggregator
@governor/adapter           RuntimeAdapter interface + CapabilityDescriptor
@governor/adapter-claude    Claude Code adapter (headless -p, Stop hook, usage parse)
@governor/adapter-codex     Codex adapter (exec, reuse verifier.toml/critic.toml)
@governor/testkit           Contract fixtures, exit-code simulator, progress_hash golden values
@governor/cli               The govern binary: init / audit / run
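
To make the "trust-gradient branded types" idea more concrete, the sketch below shows one way TypeScript branding could keep a maker's self-report from ever being accepted where a deterministic verdict is required. All names (Brand, L1Verdict, L2Verdict, L3Claim, decide) are hypothetical. The rule that an L2 pass may stop the loop when no L1 verdict exists is an assumption, not something this README states.

```typescript
// Hypothetical sketch; not the @governor/verifier API.
// A brand is a compile-time tag that makes structurally identical values distinct types.
type Brand<T, B extends string> = T & { readonly __brand: B };

interface Verdict {
  pass: boolean;
  evidence: string;
}

type L1Verdict = Brand<Verdict, "L1-deterministic">;
type L2Verdict = Brand<Verdict, "L2-judge">;
type L3Claim = Brand<Verdict, "L3-self-report">;

// Only the code path that actually ran the check should call the matching constructor.
const asL1 = (v: Verdict): L1Verdict => v as L1Verdict;
const asL2 = (v: Verdict): L2Verdict => v as L2Verdict;
const asL3 = (v: Verdict): L3Claim => v as L3Claim;

type Decision = "stop" | "continue";

// Trust order L1 > L2 > L3: the highest-trust verdict present decides,
// and a self-report (L3) on its own can never produce "stop".
function decide(l1: L1Verdict | null, l2: L2Verdict | null, _l3: L3Claim | null): Decision {
  if (l1 !== null) return l1.pass ? "stop" : "continue";
  if (l2 !== null) return l2.pass ? "stop" : "continue"; // assumption: L2 alone may stop
  return "continue";
}
```

Because the three types are not assignable to one another, passing an L3Claim in the first argument position is a compile error rather than a runtime surprise.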

Contract schemas pinned in M0

These are the standard-defining surfaces; they are frozen at v1 to claim the definition:

  • @governor/contract: loop-contract.yaml
  • @governor/state: state.toml + Status FSM
  • @governor/verifier: stop_proof v1
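
For readers unfamiliar with a status FSM enforced in code, here is a minimal sketch of the pattern. The status names and allowed transitions below are invented for illustration and are not the frozen v1 schema; only the idea of a fixed transition table that rejects anything else is the point.

```typescript
// Hypothetical statuses and transitions; NOT the @governor/state v1 schema.
type Status = "idle" | "running" | "paused_review" | "done" | "tripped";

// The full transition table is data, so it can be audited at a glance.
const ALLOWED: Record<Status, readonly Status[]> = {
  idle: ["running"],
  running: ["paused_review", "done", "tripped"],
  paused_review: ["running", "tripped"],
  done: [],    // terminal
  tripped: [], // terminal
};

// Returns the new status, or throws if the move is not in the table.
function transition(from: Status, to: Status): Status {
  if (ALLOWED[from].indexOf(to) === -1) {
    throw new Error(`illegal transition ${from} -> ${to}`);
  }
  return to;
}
```

Keeping terminal states with empty transition lists means a finished or tripped run cannot be silently resumed by whatever writes the state file.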

M0 gate (go / no-go)

This project ships as scaffolding for two falsifiable experiments, not a product. Proceed to implement runtime logic only if:

  1. You have personally been bitten by a runaway loop (cost / state drift / same-family verifier misjudgement) in the last 30 days — or user interviews say the pain is "loops run away" not "we have no loop tool".
  2. govern audit finds ≥3 real governance problems across 10 real open-source repos.
  3. True-bill calibration (costUSD vs vendor invoice) is reachable; else govern run is cut and only govern audit ships.

If any gate fails, Governor collapses to a ~50-line audit CI gate, or is abandoned in favour of while :; do cat PROMPT.md | claude; done + native /loop + a budget env var.

License

MIT.
