Ouroforge

Ouroforge is a local-first, evidence-native prototype for game-authoring loops. It turns a declared goal into a local run, captures evidence from the runtime, records what happened, and proposes the next change without giving agents or browser surfaces trusted write authority.

The name combines Ouroboros (the serpent that feeds on its own tail) and Forge. The loop is intentionally inspectable:

Seed → Run → Evidence → Evaluation → Journal → Mutation → (back to Seed)

Status: pre-release private MVP with public-readiness and public-alpha launch-governance evidence recorded for a future manual visibility review. It runs one reproducible local demo today. Ouroforge is not a Godot replacement and makes no compatibility promises — treat it as an inspectable prototype, not a public launch or support commitment.

What is Ouroforge?

A Seed describes intent and acceptance criteria. Ouroforge runs that intent locally, captures runtime evidence, renders a deterministic verdict, journals the result, and records mutation proposals when the run falls short. The long-term ambition is agentic game authoring where AI can suggest changes, but evidence and review decide which changes become real.

The current MVP is useful as a reproducible local demo and contract suite for the loop. It is not a hosted service, production editor, release pipeline, or broad engine replacement.
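
To make "deterministic verdict" concrete, here is a minimal sketch of an evaluator that checks captured evidence against acceptance criteria. The `Criterion`, `Verdict`, and `evaluate` names and the metric/minimum shape are hypothetical illustrations; the real Seed schema and evaluator in `crates/ouroforge-core` are not shown in this README and may differ.

```rust
use std::collections::BTreeMap;

/// One acceptance criterion: a named metric must reach a minimum value.
/// Hypothetical shape, assumed for illustration only.
pub struct Criterion {
    pub metric: String,
    pub min: i64,
}

/// Outcome of an evaluation. `Fail` lists the unmet metrics.
#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Pass,
    Fail { unmet: Vec<String> },
}

/// Compare evidence against criteria.
/// Criteria are checked in declaration order and a missing metric counts
/// as unmet, so identical inputs always produce an identical verdict.
pub fn evaluate(criteria: &[Criterion], evidence: &BTreeMap<String, i64>) -> Verdict {
    let unmet: Vec<String> = criteria
        .iter()
        .filter(|c| evidence.get(&c.metric).map_or(true, |v| *v < c.min))
        .map(|c| c.metric.clone())
        .collect();
    if unmet.is_empty() {
        Verdict::Pass
    } else {
        Verdict::Fail { unmet }
    }
}
```

In this sketch a failing verdict carries the list of unmet metrics, which is the kind of data a journal entry or mutation proposal could reference instead of an agent's own assertion.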

What works today

Current checked-in behavior includes:

Seed validation and local run execution for seeds/platformer.yaml.
Local project validation and minimal 2D project scaffolding.
Generated run evidence under runs/ with ledger, journal, evaluator, mutation, comparison, and dashboard read models.
A minimal browser runtime/probe path driven through local Chrome/Chromium.
Read-only static dashboard and authoring cockpit surfaces over exported JSON.
Fixture-backed contracts for scene, asset, tilemap, source-preview, sandbox, review, and public-readiness documentation boundaries.
Source Mutation Preview v1 is complete as inert preview/review/sandbox evidence only; source patch apply to the trusted maintainer worktree remains unimplemented and forbidden until a separate later governance gate authorizes it.
3D Capability Gate v1 is complete as bounded local 3D evidence: scene graph, camera/projection, mesh/material refs, render smoke, collision/trigger, animation, probe/evaluator compatibility, deterministic demo/regression fixtures, normalized dashboard read models, and escaped read-only Studio inspection. It is not production 3D readiness, broad 3D compatibility, native export, plugin runtime, hosted/cloud behavior, or a Godot replacement claim.

Generated run, dashboard, screenshot, sandbox, and local tool artifacts are local state and stay untracked unless a future issue explicitly scopes a deterministic fixture.

Quickstart

Prerequisites

Install Rust + Cargo, Node.js, Python 3, and Chrome/Chromium at a standard path (or set OUROFORGE_CHROME=/path/to/chrome). No Playwright, database, cloud service, account system, or hosted runtime is required.
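
The browser lookup described above (an explicit `OUROFORGE_CHROME` override, otherwise a standard install path) could be sketched as follows. This is not the project's actual resolution code: the function names, the candidate paths, and the choice to trust a non-empty override without checking it are all assumptions made for illustration.

```rust
use std::path::Path;

/// Common install locations, assumed for illustration only.
const DEFAULT_CANDIDATES: &[&str] = &[
    "/usr/bin/google-chrome",
    "/usr/bin/chromium",
    "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome",
];

/// Pick a Chrome/Chromium binary: a non-empty override wins, otherwise the
/// first candidate for which `exists` returns true.
pub fn resolve_chrome<F>(
    override_path: Option<&str>,
    candidates: &[&str],
    exists: F,
) -> Option<String>
where
    F: Fn(&str) -> bool,
{
    if let Some(path) = override_path {
        if !path.is_empty() {
            return Some(path.to_string());
        }
    }
    candidates
        .iter()
        .copied()
        .find(|&p| exists(p))
        .map(str::to_string)
}

/// Wire the pure function to the real environment and filesystem.
pub fn resolve_chrome_from_env() -> Option<String> {
    let from_env = std::env::var("OUROFORGE_CHROME").ok();
    resolve_chrome(from_env.as_deref(), DEFAULT_CANDIDATES, |p| {
        Path::new(p).exists()
    })
}
```

Passing the existence check in as a closure keeps the lookup order testable without depending on what is installed on the machine.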

Run the local checks and demo

cargo fmt --check
cargo test
cargo run -p ouroforge-cli -- seed validate seeds/platformer.yaml
cargo run -p ouroforge-cli -- project validate examples/project-workspace-fixtures/valid
cargo run -p ouroforge-cli -- project init .omx/tmp/project-scaffold-smoke --template minimal-2d
cargo run -p ouroforge-cli -- run .omx/tmp/project-scaffold-smoke/seeds/platformer.yaml \
    --project .omx/tmp/project-scaffold-smoke --scenario-pack smoke --workers 1
rm -rf .omx/tmp/project-scaffold-smoke
cargo run -p ouroforge-cli -- run seeds/platformer.yaml --workers 4

The run command prints a run directory such as runs/run-.... Project-bound runs add optional project context to run.json, ledger, journal, and dashboard export; legacy runs without --project stay compatible. Generated run artifacts are intentionally git-ignored.

Inspect evidence from a run

cargo run -p ouroforge-cli -- evidence list runs/<run-id>
cargo run -p ouroforge-cli -- journal show runs/<run-id>
cargo run -p ouroforge-cli -- mutation list runs/<run-id>
cargo run -p ouroforge-cli -- compare runs/<run-id-a> runs/<run-id-b>

Open the read-only demo surfaces

cargo run -p ouroforge-cli -- dashboard export \
    --runs-root runs --output examples/evidence-dashboard/dashboard-data.json
python3 -m http.server 8000 --bind 127.0.0.1 --directory .

Evidence dashboard: http://127.0.0.1:8000/examples/evidence-dashboard/
Authoring cockpit: http://127.0.0.1:8000/examples/authoring-cockpit/
Runtime demo: http://127.0.0.1:8000/examples/game-runtime/

The current quickstart command audit is recorded in docs/fresh-clone-onboarding-command-audit-v1.md. For an isolated fresh-clone-style smoke, run scripts/fresh-clone-smoke.sh as documented in docs/fresh-clone-smoke-v1.md. Troubleshooting and cleanup guidance lives in docs/fresh-clone-troubleshooting-cleanup-v1.md. These notes clarify expected generated state and cleanup boundaries without changing repository visibility, release status, or trusted-write authority.

Core loop

Ouroforge's loop is built around evidence over assertion:

Seed — declare intent and acceptance criteria.
Run — execute a local runtime/demo path and collect generated artifacts.
Evidence — capture bounded runtime, browser, project, scenario, and probe outputs as inspectable files.
Evaluation — produce a deterministic verdict from the evidence.
Journal — summarize what actually happened with evidence references.
Mutation proposal — record proposed next changes as reviewable data, not trusted source writes.
Repeat — a later reviewed change can become the next seed/run cycle.

The Rust core and local filesystem own trusted state. Agents, browser workers, and Chrome DevTools Protocol observations are evidence inputs only.
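
The stage order above can be written down as a small state machine. This is only a sketch of the cycle's shape; the `Stage` enum and `one_cycle` helper are hypothetical names, not types from `ouroforge-core`.

```rust
/// Stages of the loop, in the order listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stage {
    Seed,
    Run,
    Evidence,
    Evaluation,
    Journal,
    Mutation,
}

impl Stage {
    /// The stage that follows this one; Mutation wraps back to Seed.
    pub fn next(self) -> Stage {
        match self {
            Stage::Seed => Stage::Run,
            Stage::Run => Stage::Evidence,
            Stage::Evidence => Stage::Evaluation,
            Stage::Evaluation => Stage::Journal,
            Stage::Journal => Stage::Mutation,
            Stage::Mutation => Stage::Seed,
        }
    }
}

/// Walk one full cycle from `start` and return the stages visited,
/// stopping just before the walk would return to `start`.
pub fn one_cycle(start: Stage) -> Vec<Stage> {
    let mut stages = vec![start];
    let mut current = start.next();
    while current != start {
        stages.push(current);
        current = current.next();
    }
    stages
}
```

Calling `one_cycle(Stage::Seed)` visits all six stages once. In the sketch the wrap from Mutation back to Seed is unconditional; in the loop described above that step happens only after a proposed change has been reviewed.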

Demos and examples

seeds/platformer.yaml — the MVP seed used by the local demo.
examples/game-runtime/ and examples/runtime-probe/ — minimal local runtime and probe pages.
examples/evidence-dashboard/ — read-only evidence dashboard over exported dashboard JSON.
examples/authoring-cockpit/ — read-only authoring cockpit over generated evidence and proposal data.
examples/*-v1, examples/*-v2, and examples/*-regression — milestone fixtures, scenario packs, and evidence smokes.

Safety model

Ouroforge's current safety boundary is conservative:

Trusted authority: Rust CLI/core code and the local filesystem.
Evidence only: agents, browser workers, and CDP observations can inform proposals but cannot apply them.
Read-only browser surfaces: dashboard and cockpit pages render exported JSON and copyable commands; they do not write files, run commands, or accept source mutations.
No command bridge: browser/UI surfaces do not invoke local commands or local server command bridges.
No source apply authority: source-preview, sandbox, stale-target, rollback, and review artifacts are evidence/governance boundaries unless a later explicit issue authorizes trusted apply.
Generated-state isolation: runs/, target/, dashboard exports, .omx/, .omc/, .openchrome/, .claude/, and sandbox outputs remain local ignored state.
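
One way to picture the boundary above is to model the origin of an input as data and keep proposal recording separate from any write authority. The types and functions below are hypothetical and exist only to illustrate the rule; they are not Ouroforge's API.

```rust
/// Where an input came from. Illustrative names, not Ouroforge's types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Origin {
    RustCore,
    Agent,
    BrowserWorker,
    CdpObservation,
}

/// A proposed change, kept as inert reviewable data.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Proposal {
    pub origin: Origin,
    pub summary: String,
}

/// Only the Rust core may write trusted state; every other origin is
/// treated as an evidence input.
pub fn may_write_trusted_state(origin: Origin) -> bool {
    matches!(origin, Origin::RustCore)
}

/// Append a proposal to the review queue. Nothing is applied here,
/// whatever the origin: recording and applying are separate steps.
pub fn record_proposal(queue: &mut Vec<Proposal>, origin: Origin, summary: &str) {
    queue.push(Proposal {
        origin,
        summary: summary.to_string(),
    });
}
```

The sketch deliberately has no `apply` function, mirroring the statement that proposals can be informed by agents and browser observations but cannot be applied by them.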

Security and trust-boundary references: SECURITY.md and docs/README.md#safetytrust-boundaries.

Non-goals and maturity boundaries

Ouroforge does not currently provide:

hosted/cloud execution, accounts, authentication, authorization, or multi-tenant behavior;
production readiness, support/security SLA, compatibility stability, or secure sandboxing for arbitrary untrusted content;
native export, packaging, signing, publishing, deployment, or release automation;
plugin runtime, marketplace, visual scripting, or third-party code-loading ecosystem;
browser trusted writes, local command bridges, auto-apply, auto-merge, or reviewer bypass;
source patch apply to the trusted maintainer worktree.

Public release still requires fresh evidence gates in docs/public-readiness-audit.md, docs/public-launch-checklist.md, and the manual visibility-decision process. The launch-governance and communication-pack docs are preparation artifacts, not a visibility toggle or publication event.

Roadmap

The roadmap and per-milestone completion records — with each milestone's evidence chain and non-goals — live in docs/roadmap.md. Cross-cutting boundaries are in Non-goals and maturity boundaries; they are not repeated per milestone here. Earlier completed milestones — including Safe Source Mutation Apply, the GDD-to-Playable Prototype v1 prototype lane, the Plugin / Extension System v1 lane, the Full Studio Editor lane, the Godot-Plus Demo lane, and the Autonomous QA / Playtest Swarm v1 lane — keep their full evidence chain and per-issue records in docs/roadmap.md; only the current Era's snapshot is summarised below.

Current state. Era H (Milestones 42–46) is recorded complete on merged evidence in docs/roadmap.md: Multi-Agent Production Pipeline v1 (M42), Autonomous Producer and Whole-Game Orchestration v1 (M43), Scaled Trust Gradient / Release Provenance / Compliance v1 (M44), the Shipping and LiveOps Layer-3 Re-evaluation Design Gate (M45), and the Era H closing autonomy assessment (M46). The descriptive autonomy posture is in docs/era-h-autonomy-assessment.md: agents and local Rust contracts can carry proposal, evidence, orchestration, QA, provenance, and release-candidate preparation work, but vision, taste/fun, legal compliance acceptance, and release go/no-go remain human decisions.

Earlier foundations remain recorded. Era E established bounded local trust and Layer-3 DEFER in docs/layer3-reevaluation-v1.md; Era F/G added genre/function evidence and specialized production gates. The #1 and #23 anchors are deliberately kept open as ongoing north-star tracks. The full per-era completion history and evidence chains live in docs/roadmap.md and the matching docs/*.md contracts.

Current frontier. Era J (Milestones 57–60) is complete on merged evidence as a bounded human creative/release-judgment track over the existing deckbuilder substrate: candidate generation and curation, human playtest/fun-feel capture, narrative/theme proposal assistance, human-approved balance recommendations, and release-readiness go/no-go evidence. The closing assessment is recorded in docs/era-j-creative-leverage-assessment.md: Ouroforge increases proposal/evidence output per human decision, but the permanent human core remains fun, taste, tone/soul, curation, balance approval, release go/no-go, and market judgment. This is not automated creativity, an automated fun/quality verdict, release authority, production readiness, or a Godot replacement/parity claim.

Next. Later work requires separate issue-scoped design gates. Shipping/native-store release actions, hosted/cloud, real-player telemetry, live balancing, update/patch pipelines, market demand, and distributed Layer-3 behavior remain DEFER absent a separate #1508 Layer-3 GO; Rust-first / local-first is preserved absent that GO.

Contributor guide

Contribution workflow and review expectations: CONTRIBUTING.md
Security policy and vulnerability reporting: SECURITY.md
License: LICENSE

Before opening a PR, run:

cargo fmt --check
cargo test
cargo clippy --all-targets --all-features -- -D warnings
node --check examples/evidence-dashboard/dashboard.js && node examples/evidence-dashboard/dashboard.test.cjs
node --check examples/authoring-cockpit/cockpit.js && node examples/authoring-cockpit/cockpit.test.cjs

Per-milestone evidence steps live in the matching docs/*.md files. Keep generated/local runtime state untracked.

Documentation map

Use docs/README.md as the expanded documentation index. The README keeps only the most common starting points so public-alpha readers do not have to scan every milestone contract first.

Reader question	Start here
How does the loop work in detail?	`docs/architecture.md`
What is complete and what is next?	`docs/roadmap.md`
What is the trust boundary?	`docs/README.md#safetytrust-boundaries`
Where are milestone references grouped?	`docs/README.md`
What wording is forbidden or risky?	`docs/public-wording-guardrail-v1.md`, `docs/public-wording-audit-process-v1.md`
Where is the final docs IA audit?	`docs/docs-link-wording-audit-pa1.5.3.md`

Repository map

crates/ouroforge-core — trusted core models and evidence APIs for seeds, runs, ledgers, browser smoke, scenarios, evaluator, journal, mutation proposals, project/scene contracts, source-preview boundaries, and dashboard read models.
crates/ouroforge-cli — CLI entrypoints for seed, run, evidence, journal, mutation, dashboard, scene, project, source-preview, and related commands.
seeds/ — MVP seed examples.
examples/ — runtime demos, read-only UIs, fixtures, scenario packs, and regression examples.
docs/ — architecture, roadmap, trust-boundary/evidence contracts, milestone notes, public-readiness audits, and governance handoff docs.

Generated local state

Do not commit generated or local runtime/tool state: runs/, target/, examples/evidence-dashboard/dashboard-data.json, dashboard-data/, sandbox/, .claude/, .openchrome/, .omc/, .omx/. See docs/artifact-write-policy-v1.md for the trusted-write categories and generated-output/source-like collision policy.
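
As a rough illustration of the list above, a pre-commit style check could flag paths that fall under generated or local state. This is a simplified sketch: the function name is hypothetical, it does plain prefix matching on repo-relative paths rather than real .gitignore semantics, and it does not implement the policy in docs/artifact-write-policy-v1.md.

```rust
/// Directory prefixes the README lists as generated or local state.
const GENERATED_PREFIXES: &[&str] = &[
    "runs/",
    "target/",
    "dashboard-data/",
    "sandbox/",
    ".claude/",
    ".openchrome/",
    ".omc/",
    ".omx/",
];

/// Individual generated files the README lists.
const GENERATED_FILES: &[&str] = &["examples/evidence-dashboard/dashboard-data.json"];

/// Return true when a repo-relative path is generated/local state that
/// should stay untracked. A leading "./" is ignored.
pub fn is_generated_state(repo_relative_path: &str) -> bool {
    let path = repo_relative_path.trim_start_matches("./");
    GENERATED_FILES.iter().any(|file| *file == path)
        || GENERATED_PREFIXES
            .iter()
            .any(|prefix| path.starts_with(*prefix))
}
```

For example, `is_generated_state("runs/run-123/ledger.json")` is true, while `is_generated_state("docs/roadmap.md")` is false; the run file name here is made up for the example.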
