fix: high/pre-mainnet issues - query scoping (#184/#675), curator gate (#757), skill ACL (#462), async honesty (#1013) #1132
Merged
…L, async honesty)

All verified on a live 6-node devnet (real chain, real data) and/or unit tests.

- #184 + #675 (packages/query): view-based routing now scopes to / includes sub-graphs. resolveViewGraphs threads a `subGraphName` segment into the per-layer prefixes; queryWithView fans out over registered sub-graphs when none is named. Removed the "deferred to V10.x" throw. Devnet: WM view returns root+subgraph; view+subGraphName scopes to the subgraph.
- #757 (packages/agent, packages/cli): GET /join-requests is now curator-gated server-side (listPendingJoinRequests calls assertContextGraphOwner; the route maps the owner failure to 403, mirroring approve/reject). Devnet: non-curator token -> 403, curator -> 200.
- #462 (packages/agent, packages/cli): skill_request authorization. Added a SkillAclCheck hook to MessageHandler + setSkillAcl on DKGAgent; the daemon installs a default-deny-for-remote-peers policy (opt back in via messaging.openSkills / messaging.skillAllowedPeers). Closes "any connected peer could invoke any registered skill". Devnet: remote skill_request default-denied.
- #1013 (packages/publisher): async publish honesty. A private publish that couldn't reach its chain-registered CG (no collectable storage ACKs) no longer reports `finalized` with a provisional UAL; it fails honestly (data still staged locally under the provisional UAL, surfaced in the error). Threaded `localChainSkipReason` through PublishResult so a genuine no-chain publish still finalizes(local). (Reaching chain for private CGs is #1121.) Unit: async-lift-local-finalization-honesty.test.ts (4/4).

Regression: query 262/262, publisher 1169/1169, agent e2e-network 11/11.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
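The #462 default-deny policy described above can be pictured roughly as follows. This is a minimal sketch, not the project's code: only the names `SkillAclCheck`, `setSkillAcl`, `openSkills` and `skillAllowedPeers` come from the commit text, while the hook signature, the decision shape, `buildDefaultSkillAcl` and `MessageHandlerSketch` are assumptions made for illustration. It also assumes the local peer is always allowed, since the policy is described as default-deny for remote peers.

```typescript
// Hypothetical shapes: the real SkillAclCheck signature is not shown in the PR.
type SkillAclDecision = { allow: true } | { allow: false; reason: string };
type SkillAclCheck = (peerId: string, skillName: string) => SkillAclDecision;

// Mirrors the messaging.openSkills / messaging.skillAllowedPeers options.
interface MessagingConfig {
  openSkills?: boolean;
  skillAllowedPeers?: string[];
}

// Default policy: local peer allowed (assumption), remote peers denied unless opted in.
function buildDefaultSkillAcl(localPeerId: string, cfg: MessagingConfig = {}): SkillAclCheck {
  const allowed = new Set(cfg.skillAllowedPeers ?? []);
  return (peerId) => {
    if (peerId === localPeerId || cfg.openSkills === true || allowed.has(peerId)) {
      return { allow: true };
    }
    return { allow: false, reason: `peer ${peerId} is not authorized to invoke skills` };
  };
}

type SkillReply = { ok: true; result: string } | { ok: false; error: string };

class MessageHandlerSketch {
  private acl: SkillAclCheck | null = null;
  private skills = new Map<string, (input: string) => string>();

  // Passing null removes the hook and restores the old open behaviour.
  setSkillAcl(acl: SkillAclCheck | null): void {
    this.acl = acl;
  }

  registerSkill(name: string, fn: (input: string) => string): void {
    this.skills.set(name, fn);
  }

  private check(peerId: string, skillName: string): SkillAclDecision {
    if (!this.acl) return { allow: true };
    try {
      return this.acl(peerId, skillName);
    } catch {
      // A throwing ACL fails closed.
      return { allow: false, reason: "skill ACL check failed" };
    }
  }

  handleSkillRequest(peerId: string, skillName: string, input: string): SkillReply {
    // The ACL runs before skill lookup, so a denial does not reveal
    // whether the skill exists.
    const decision = this.check(peerId, skillName);
    if (!decision.allow) return { ok: false, error: decision.reason };
    const skill = this.skills.get(skillName);
    if (!skill) return { ok: false, error: `unknown skill: ${skillName}` };
    return { ok: true, result: skill(input) };
  }
}
```

Running the ACL ahead of skill resolution is the design choice worth noting: a denied peer gets the same answer for a registered and an unregistered skill name.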
The async lift rewrote caller-provided root IRIs to a generated `dkg:<cg>:<ns>:<scope>/<name>-<hash>` form, while the synchronous publish path (`canonicalPublishPayload` → `skolemizeByEntity`) keeps the caller's `rootEntity` IRIs verbatim. The divergence broke stable IRI linking: the same domain payload produced different RDF subjects depending on sync vs async, so VM graphs rendered disconnected and cross-entity references couldn't be followed.

Fix: `canonicalRootIri` is now identity, so the async lift preserves the caller root IRI exactly like sync. Every downstream consumer reads `validation.canonicalRootMap` symmetrically, so an identity map propagates cleanly: quad rewriting is a no-op, private data is stored at the caller root, and the canonical-vs-source `privateDataAnchor` bridge (an async-only artifact of the old rewrite that sync never created) is correctly skipped (`stampCanonicalAnchorsInWorkspace` becomes a no-op via its `canonical === sourceRoot` guard).

Tests:
- New `async-lift-canonicalization-parity.test.ts` asserts identity canonicalization and that cross-entity IRI links survive validation.
- The two CREATE-remainder subtraction tests now seed authoritative VM state through a SEPARATE publisher instance. Rule-4 entity exclusivity is tracked per-process in memory (never hydrated from the store), so this models the real cross-node idempotency case the subtraction guards (node A finalized R; node B shares R and lifts a CREATE, and subtraction drops the already-finalized quads) instead of relying on the old root rewrite to dodge the same-instance collision.

Verified on a live devnet node: async publish (publishFromFinalizedAssertion, vm-confirmed, real on-chain tx) writes the verbatim caller IRI `urn:dmaast:device:async-1122` into `_verifiable_memory`, with `_meta` `entity` and `canonicalRootMap` both the caller IRI and zero `dkg:` rewrite.

Publisher suite green (1171 passed).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
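To make the "identity map propagates cleanly" argument concrete, here is a small sketch under stated assumptions. Only `canonicalRootIri`, `canonicalRootMap` and the `canonical === sourceRoot` guard are named in the commit; the `Quad` shape and the helpers `buildCanonicalRootMap`, `rewriteQuads` and `rootsNeedingAnchorBridge` are hypothetical stand-ins, and `urn:example:site:1` is an invented IRI.

```typescript
// Hypothetical quad shape; the real publisher types are not shown in the PR.
type Quad = { subject: string; predicate: string; object: string };

// After the fix: the async lift keeps the caller's root IRI, like the sync path.
function canonicalRootIri(callerRootIri: string): string {
  return callerRootIri;
}

// source root -> canonical root, as consumers of canonicalRootMap would read it.
function buildCanonicalRootMap(roots: string[]): Map<string, string> {
  const map = new Map<string, string>();
  for (const root of roots) map.set(root, canonicalRootIri(root));
  return map;
}

// Rewrites subjects and object references through the map.
// With an identity map this returns quads equal to the input.
function rewriteQuads(quads: Quad[], rootMap: Map<string, string>): Quad[] {
  return quads.map((q) => ({
    subject: rootMap.get(q.subject) ?? q.subject,
    predicate: q.predicate,
    object: rootMap.get(q.object) ?? q.object,
  }));
}

// Analogue of the `canonical === sourceRoot` guard: only roots whose canonical
// IRI differs from the source would need a canonical-vs-source anchor bridge.
function rootsNeedingAnchorBridge(rootMap: Map<string, string>): string[] {
  return Array.from(rootMap.entries())
    .filter(([source, canonical]) => canonical !== source)
    .map(([source]) => source);
}

const roots = ["urn:dmaast:device:async-1122", "urn:example:site:1"];
const rootMap = buildCanonicalRootMap(roots);
const input: Quad[] = [
  // A cross-entity link: one root referencing another by its caller IRI.
  { subject: roots[0], predicate: "urn:example:locatedAt", object: roots[1] },
];
const output = rewriteQuads(input, rootMap);
const bridged = rootsNeedingAnchorBridge(rootMap);
```

With the identity map, `output` carries the same subject and object IRIs as `input`, and `bridged` is empty, which is the no-op behaviour the commit describes.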
Codex red 1: listPendingJoinRequests callers (GH #757 follow-up):
- notifications route now passes the token-verified caller into the curator-gated listPendingJoinRequests; without it the gate resolved against the node default agent and silently emptied the pending-join set for any non-default curator.
- isCallerOrNodeOwner compares EVM-address DIDs case-insensitively (EIP-55 checksums are display-only; owner DIDs are stored as written while HTTP callers pass lowercased addresses). Peer IDs stay exact.
- Tests: notifications route asserts the caller is threaded (non-default curator keeps its pending joins); agent test proves lowercased curator is accepted and non-curators/default-agent stay rejected.

Codex red 2: by-name WM read is now sub-graph aware (GH #184 follow-up):
- resolveViewGraphs threads opts.subGraphName into the single-graph contextGraphLayerUri/contextGraphAssertionUri (mirrors the writer, DKGPublisher.wmGraphUri).
- resolveWorkingMemoryKaNumber keys the dkg:kaId lookup by the sub-graph-aware lifecycle URN (root _meta, like assertionFinalize).
- Tests: per-KA sub-graph read via kaId stamp, legacy name-keyed fallback, and no root-assertion leak into sub-graph reads.

Also:
- messaging-chat-acl.test.ts: 6 new GH #462 skill_request ACL tests (real Ed25519/X25519 round-trips): deny blocks handler, throw fails closed, denial precedes unknown-skill resolution (no existence oracle), accept/null restore paths.
- publisher-route-snapshot.test.ts: align with GH #1122 caller-IRI parity (payload carries verbatim caller subject, no dkg:<cg>: rewrite).

All verified on a live 6-node devnet: #184/#675/by-name (3-way PASS), #757 four-way PASS incl. non-default-curator notifications, #1122 sync/async VM parity (adjacent KAs, verbatim caller IRIs), #1013 EPCIS private capture fails honestly with the #1013 error instead of fake finalization.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
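The curator gate and the case-insensitive comparison from the #757 follow-up can be sketched together. This is illustrative only: `assertContextGraphOwner` and `listPendingJoinRequests` are names from the commit, but their real signatures are not shown, so the parameters here are assumptions. The sketch also assumes an EVM-address DID ends in `0x` plus 40 hex characters, and uses a hypothetical `NOT_OWNER` error code and `did:dkg:agent:` prefix.

```typescript
// Assumption: an EVM-address DID ends in "0x" followed by 40 hex characters.
const EVM_DID_TAIL = /0x[0-9a-fA-F]{40}$/;

// EIP-55 casing is display-only, so EVM DIDs compare case-insensitively.
// Anything else (e.g. peer IDs) must match exactly.
function sameIdentity(a: string, b: string): boolean {
  if (EVM_DID_TAIL.test(a) && EVM_DID_TAIL.test(b)) {
    return a.toLowerCase() === b.toLowerCase();
  }
  return a === b;
}

const NOT_OWNER = "NOT_CONTEXT_GRAPH_OWNER"; // hypothetical error code

// Throws a tagged error when the caller is not the context-graph owner.
function assertContextGraphOwner(ownerDid: string, callerDid: string): void {
  if (!sameIdentity(ownerDid, callerDid)) {
    throw Object.assign(new Error("caller is not the context graph owner"), {
      code: NOT_OWNER,
    });
  }
}

// Route-level sketch: the owner failure maps to 403, anything else rethrows.
function listPendingJoinRequestsRoute(
  ownerDid: string,
  callerDid: string,
  pending: string[],
): { status: number; body: unknown } {
  try {
    assertContextGraphOwner(ownerDid, callerDid);
    return { status: 200, body: pending };
  } catch (err) {
    if ((err as { code?: string }).code === NOT_OWNER) {
      return { status: 403, body: { error: (err as Error).message } };
    }
    throw err;
  }
}

// Owner DID stored with checksum casing; HTTP caller passes it lowercased.
const owner = "did:dkg:agent:0xAbCdEf0123456789AbCdEf0123456789AbCdEf01";
const curatorReply = listPendingJoinRequestsRoute(owner, owner.toLowerCase(), ["req-1"]);
const strangerReply = listPendingJoinRequestsRoute(
  owner,
  "did:dkg:agent:0x0000000000000000000000000000000000000001",
  ["req-1"],
);
```

Here `curatorReply.status` is 200 and `strangerReply.status` is 403, matching the devnet result reported for curator and non-curator tokens.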
CI on the #1122 commit surfaced three more test sites that encoded the OLD async-lift behavior:
- agent/swm-snapshot-sync: prepared lift payload now carries the verbatim caller subjects (root + skolem child), not the dkg:<cg>:… rewrite.
- agent/publish-jsonld (async subtraction observe): seed the confirmed authoritative state at the CALLER root; under identity canonicalization that IS the canonical root, and the seal short-circuit math is unchanged.
- kafka-plugin e2e: create the test CGs local-only (register: false). The plugin registers streams as fully-private KAs and the harness is one isolated daemon, so a chain-registered CG can never reach the private-ACK quorum there and the old run only passed through the fake local finalization #1013 removed. A chainless CG finalizes locally as an honest terminal state, preserving the register→list→get coverage.

Local runs: agent swm-snapshot-sync + publish-jsonld 36/36, kafka-plugin e2e 11/11.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
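The distinction the kafka-plugin note relies on (a chainless CG finalizing locally is honest, a chain-registered CG that never reached chain is not) can be sketched as a result mapper. Only `localChainSkipReason` and `PublishResult` are names from the PR; the field layout, the `onChain` flag, the `LiftOutcome` type, `mapPublishResult` and the example reason string `"no-chain-configured"` are assumptions for illustration.

```typescript
// Hypothetical slice of a publish result; only localChainSkipReason is a
// name taken from the PR, and its value format is assumed here.
interface PublishResultSketch {
  ual: string;                    // provisional until confirmed on chain
  onChain: boolean;               // true once the publish reached the chain
  localChainSkipReason?: string;  // set only for a genuine no-chain publish
}

type LiftOutcome =
  | { status: "finalized"; scope: "chain" | "local"; ual: string }
  | { status: "failed"; message: string };

function mapPublishResult(r: PublishResultSketch): LiftOutcome {
  if (r.onChain) {
    return { status: "finalized", scope: "chain", ual: r.ual };
  }
  if (r.localChainSkipReason !== undefined) {
    // Chainless CG: local finalization is an honest terminal state.
    return { status: "finalized", scope: "local", ual: r.ual };
  }
  // Chain-registered CG that never reached chain: fail, but surface the
  // provisional UAL so the locally staged data can still be found.
  return {
    status: "failed",
    message: `publish did not reach chain; data staged locally under ${r.ual}`,
  };
}

const chainless = mapPublishResult({
  ual: "provisional-ual-1",
  onChain: false,
  localChainSkipReason: "no-chain-configured",
});
const unanchored = mapPublishResult({ ual: "provisional-ual-2", onChain: false });
```

In this sketch `chainless` finalizes with scope `local`, while `unanchored` fails and names its provisional UAL in the message.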
Bojan131 added a commit that referenced this pull request on Jun 12, 2026
… red-while-live convention

Adds single-process / single-Hardhat-node reproducing tests (run in the normal CI lanes) for five high/pre-mainnet issues that were previously only documented `it.skip` stubs, and switches the whole liveness suite to the standard "red while the bug is live, green when fixed" convention (plain failing `it()` instead of the inverted `it.fails`, which was green-while-broken).

New CI tests (each authored against a known-fixed build and confirmed to flip green there, so a red is a genuine live bug, not a broken test):
- #462 agent/issue-462-skill-acl.test.ts: an unauthorized (but signed) peer's skill_request is rejected and the handler does not run. Today there is no ACL → handler runs → RED.
- #936 agent/issue-936-tokenid-determinism.test.ts: two replicas reconciling the same multi-root KC from chain (divergent oxigraph insertion orders) agree on the rootEntity→tokenId mapping. Today positional assignment over a store-dependent order makes them disagree → RED.
- #1013 publisher/issue-1013-async-finalization-honesty.test.ts: a private publish that never reached chain (no storage ACKs) must NOT map to a finalized lift job. Today the mapper returns finalized/local → RED.
- #1078 storage/issue-1078-private-layer-scope.test.ts: a root hydrates only the authoritative private slice, not a superseded commitment for the same root. Today the CG-level _private graph commingles both → RED.
- #1091 random-sampling/e2e-hardhat-chain.test.ts: a node cannot predict its own RS challenge from public block data. Today the seed is reconstructed from block.difficulty/blockhash/sender and previewChallengeForSeed predicts the exact on-chain draw → RED.

Convention flip (it.fails → it()) for the existing high-issue repros (#11, #184, #675, #757, #1121, #1122 + the devnet multi-node tier) so the suite is uniformly RED while bugs are live and GREEN once fixed, matching how the fix PRs (#1107, #1132) turn individual tests green as they merge.

Doc rewritten (docs/testing/ISSUE_LIVENESS_TESTS.md): all 25 high issues mapped to a test across three tiers: 11 CI unit/integration, 8 devnet multi-node, and 6 honest pending-fixture/emergent stubs (#614 #1099 #1124 fixture-needed; #723 #999 #1008 emergent/load, where a deterministic CI assertion would be a false positive).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
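The #936 failure mode above (positional assignment over a store-dependent order) is easy to reproduce in isolation. The sketch below is not the project's reconciliation code: `assignPositional`, `assignSorted`, `sameMapping` and the example IRIs are hypothetical, and the sorted variant is shown only as one order-independent alternative, not as the fix the project adopted or will adopt.

```typescript
// Positional assignment: the i-th root in iteration order gets start + i.
function assignPositional(roots: string[], startTokenId: number): Map<string, number> {
  const mapping = new Map<string, number>();
  roots.forEach((root, i) => mapping.set(root, startTokenId + i));
  return mapping;
}

// One order-independent alternative (an assumption, not the project's fix):
// sort a copy of the roots by code-unit order before assigning positions.
function assignSorted(roots: string[], startTokenId: number): Map<string, number> {
  return assignPositional([...roots].sort(), startTokenId);
}

function sameMapping(a: Map<string, number>, b: Map<string, number>): boolean {
  if (a.size !== b.size) return false;
  return Array.from(a.entries()).every(([root, id]) => b.get(root) === id);
}

// Two replicas read the same roots back from their stores in different orders.
const replicaA = ["urn:example:root:b", "urn:example:root:a"];
const replicaB = ["urn:example:root:a", "urn:example:root:b"];

const positionalAgrees = sameMapping(
  assignPositional(replicaA, 1),
  assignPositional(replicaB, 1),
);
const sortedAgrees = sameMapping(assignSorted(replicaA, 1), assignSorted(replicaB, 1));
```

`positionalAgrees` comes out false and `sortedAgrees` true. Whether sorting is appropriate in the real system depends on how tokenIds were assigned on chain, which this sketch does not model.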
9 tasks
…issues

# Conflicts:
#	packages/cli/test/notifications-route.test.ts
#	packages/query/src/dkg-query-engine.ts
…test

The merge of main into this branch produced 2 conflicts (resolved in the merge commit) plus one auto-merge artifact this commit fixes.

Conflict resolutions (in the merge commit):
- packages/query/src/dkg-query-engine.ts (working-memory view): combined main's same-identity alias span (PR #1107 review 🟡: one prefix per agentAddressAlias) with #1132's sub-graph scoping (#184/#675: the `${sg}` suffix), so the WM prefixes are `…${sg}/_working_memory/${addr}/` per alias.
- packages/cli/test/notifications-route.test.ts: took main's version. main rewrote it from a mock-based unit test into a real-daemon integration test (sign-join→request-join→curator reject-join) that fully exercises the #757 curator gate; the PR's #757 route change (thread callerAddress into listPendingJoinRequests) auto-merged into notifications.ts and is covered.

This commit:
- packages/agent/test/messaging-chat-acl.test.ts: add `vi` to the vitest import. main's copy of this file (chat-ACL tests only) and the PR's net-new `skill_request ACL (GH #462)` describe block (which uses `vi.fn` via echoSkill) auto-merged, but the surviving import line was main's `{ describe, it, expect }`, so the #462 skill-ACL tests threw `ReferenceError: vi is not defined`. The #462 feature itself (MessageHandler.setSkillAcl + default-deny enforcement) is intact in the merged messaging.ts; only the test import needed reconciling. 15/15 pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…on tests)
Addresses the unresolved 🔴 Codex review findings, re-assessed against the
current code (post 225-commit main merge). Four were real and present; three
were already-fixed or not-applicable (resolved on the PR with explanation).
REAL BUGS FIXED:
- query VM view drops the sub-graph ROOT graph (dkg-query-engine.ts): the
verifiable-memory case returned `graphs: []` when subGraphName was set, so
it only searched `…/{sub}/_verifiable_memory/*` and missed confirmed /
intentional-local sub-graph data written to `did:dkg:context-graph:{cg}/{sub}`.
Now includes the sub-graph root graph, mirroring the root-CG branch.
- query sub-graph fan-out ignored `verifiedGraph` (dkg-query-engine.ts): a
single-graph `view:'verifiable-memory' + verifiedGraph` read still fanned out
across every registered sub-graph's VM partition and returned unrelated rows.
Skip the fan-out when verifiedGraph is set (it is already pinned to one graph).
- async-lift `private-no-acks` retried forever (lift-job-failures.ts,
async-lift-publish-result.ts): a deterministic "private payload had no
collectable storage ACKs" broadcast failure was classified as the default
retryable `rpc_unavailable`, so the queue reset/retried a job that can never
finalize until #1121. Added a terminal failure code `private_unanchorable`
(broadcast / terminal / fail_job) and classify the message to it.
- mixed public+private async root lost its `dkg:privateDataAnchor`
(dkg-agent-helpers.ts): `partitionPublishAsyncQuads` only anchored
private-ONLY roots, and after the #1122 canonicalRootIri→identity flip
`stampCanonicalAnchorsInWorkspace` self-disabled — so mixed roots' private
data disappeared from EPCIS/Kafka partition readers (which bridge
public→private via the anchor). Now anchors every privately-staged root
(idempotent).
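The terminal classification from the third fix above can be sketched as follows. The codes `private_unanchorable` and `rpc_unavailable` and the `broadcast / terminal / fail_job` triple come from the commit text; the field names `phase`, `mode` and `resolution`, the `reset_and_retry` value, the message-matching rule and the helper names are assumptions, not the contents of `lift-job-failures.ts`.

```typescript
// Field names are hypothetical; the commit only lists the triple
// "broadcast / terminal / fail_job" for the new code.
interface FailureClassification {
  code: "private_unanchorable" | "rpc_unavailable";
  phase: "broadcast";
  mode: "terminal" | "retryable";
  resolution: "fail_job" | "reset_and_retry"; // "reset_and_retry" is assumed
}

// Assumption: the deterministic failure is recognised by its message text.
const NO_ACKS_PATTERN = /no collectable storage ACKs/i;

function classifyBroadcastFailure(message: string): FailureClassification {
  if (NO_ACKS_PATTERN.test(message)) {
    // Deterministic: retrying cannot succeed until private CGs can reach chain.
    return {
      code: "private_unanchorable",
      phase: "broadcast",
      mode: "terminal",
      resolution: "fail_job",
    };
  }
  // Default bucket for unrecognised broadcast failures.
  return {
    code: "rpc_unavailable",
    phase: "broadcast",
    mode: "retryable",
    resolution: "reset_and_retry",
  };
}

// What a queue would do with the classification (hypothetical states).
function nextJobState(c: FailureClassification): "failed" | "queued" {
  return c.mode === "terminal" ? "failed" : "queued";
}

const noAcks = classifyBroadcastFailure("private payload had no collectable storage ACKs");
const timeout = classifyBroadcastFailure("RPC endpoint timed out");
```

With this split, `nextJobState(noAcks)` is `"failed"` while `nextJobState(timeout)` is `"queued"`, so only transient failures go back on the queue.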
ALREADY-FIXED / NOT-APPLICABLE (resolved on the PR, no code change):
- WM by-name path sub-graph awareness — already fixed in d65e09f
(resolveViewGraphs + resolveWorkingMemoryKaNumber thread subGraphName).
- listPendingJoinRequests caller threading — every production call site
(notifications + context-graph routes) passes callerAgentAddress; omitting it
throws loudly (intended secure gate), never silently empties the set.
- canonicalRootIri identity legacy-migration — N/A pre-mainnet: the async-lift
publisher + the old generated-root rewrite are both unreleased, so no store
persisted the legacy `dkg:<cg>:…` root format to mismatch on upgrade.
Regression tests added: query sub-graph VM scoping (A,C), async-lift terminal
classification (D), partitionPublishAsyncQuads anchor coverage (F).
query 277, publisher 1177, agent 1612 — all green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
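The two query fixes in this commit (include the sub-graph root graph in the VM view, and skip the sub-graph fan-out when `verifiedGraph` is set) can be sketched like this. The graph URI `did:dkg:context-graph:{cg}/{sub}` is taken from the commit text; the full `_verifiable_memory` prefix layout is elided in the source and is assumed here, as are the helper names `resolveVmGraphs` and `planVmQuery`, the `ViewOpts` shape, and the treatment of `verifiedGraph` as an opaque graph identifier.

```typescript
interface ViewOpts {
  subGraphName?: string;
  verifiedGraph?: string; // assumed to identify exactly one graph
}

interface ResolvedGraphs {
  graphs: string[];        // exact graph URIs to search
  graphPrefixes: string[]; // prefixes matched against graph URIs
}

function contextGraphRoot(cg: string, sub?: string): string {
  const root = `did:dkg:context-graph:${cg}`;
  return sub === undefined ? root : `${root}/${sub}`;
}

// VM view for one (sub-)graph. The fix: `graphs` now contains the sub-graph
// root graph instead of being empty when a sub-graph is named.
function resolveVmGraphs(cg: string, sub?: string): ResolvedGraphs {
  const root = contextGraphRoot(cg, sub);
  return {
    graphs: [root],
    // Assumed layout: VM partitions live under the (sub-)graph root.
    graphPrefixes: [`${root}/_verifiable_memory/`],
  };
}

// Decides which graph sets a verifiable-memory query should touch.
function planVmQuery(cg: string, registeredSubGraphs: string[], opts: ViewOpts): ResolvedGraphs[] {
  if (opts.verifiedGraph !== undefined) {
    // Already pinned to one graph: no fan-out across sub-graphs.
    return [{ graphs: [opts.verifiedGraph], graphPrefixes: [] }];
  }
  if (opts.subGraphName !== undefined) {
    return [resolveVmGraphs(cg, opts.subGraphName)];
  }
  // No sub-graph named: root plus every registered sub-graph.
  return [resolveVmGraphs(cg), ...registeredSubGraphs.map((sg) => resolveVmGraphs(cg, sg))];
}

const scoped = planVmQuery("cg1", ["lab", "ops"], { subGraphName: "lab" });
const pinned = planVmQuery("cg1", ["lab", "ops"], { verifiedGraph: "urn:example:pinned-graph" });
const fannedOut = planVmQuery("cg1", ["lab", "ops"], {});
```

In this sketch `scoped` covers only the `lab` sub-graph (root graph included), `pinned` touches a single graph, and `fannedOut` covers the root plus both registered sub-graphs.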
This was referenced Jun 18, 2026
Closed
Summary
Fixes for high / pre-mainnet issues, branched from a fresh main. Every fix in this PR was verified on a live 6-node devnet (real chain, real data) and/or a unit test, with no mocks for the behaviour under test.

This is the first batch of the 25-high effort. The other 9 highs are fixed on PR #1107 (fix-in-flight); this PR adds 5 more. The remaining deep contract/consensus/storage/P2P issues are tracked separately (see "Not in this PR" below); they need focused, reviewed PRs rather than being rushed in here.

Fixes (5), all verified

#184 + #675: sub-graph scoping under view-based routing (packages/query)
`view: working-memory` (and SWM/VM) now includes data in registered sub-graphs (was silently excluded), and `view` + `subGraphName` now scopes to that sub-graph instead of throwing `deferred to V10.x`. `resolveViewGraphs` threads a `/{sub}` segment into every per-layer prefix; `queryWithView` fans out across registered sub-graphs (from the `_meta` registry) when none is named. `view` + `subGraphName` returns only the sub-graph's data. Unit: `sub-graph-query.test.ts`. Query suite 262/262.

#757: curator-gate the join-request endpoints (packages/agent, packages/cli)
`listPendingJoinRequests` now calls `assertContextGraphOwner` (the same check approve/reject already use), and the GET route maps the owner failure to 403. Closes "any valid token can read another curator's pending-moderation data".

#462: `skill_request` authorization (packages/agent, packages/cli)
`PROTOCOL_MESSAGE` authenticated the caller but did no authorization: any connected peer could invoke any registered skill. Added a `SkillAclCheck` hook to `MessageHandler` + `agent.setSkillAcl(...)`; the daemon installs a default-deny-for-remote-peers policy. Operators opt back in with `messaging.openSkills: true` or `messaging.skillAllowedPeers: [...]`. A remote `skill_request` is default-denied with a clear reason. Agent e2e-network 11/11 (chat unaffected).

#1013: async publish on-chain honesty (packages/publisher)
A private `publishAsync` that couldn't reach its chain-registered CG (no collectable storage ACKs) no longer reports `finalized` with a provisional `t…` UAL. It now fails honestly (the data is still staged locally under the provisional UAL, surfaced in the error). Threaded `localChainSkipReason` through `PublishResult` so a genuine no-chain publish still `finalizes(local)`. Unit: `async-lift-local-finalization-honesty.test.ts` (4/4). Publisher suite 1169/1169. (Actually reaching chain for private CGs is #1121, "publishAsync should support encrypted VM publishing for curated/private context graphs".)

Not in this PR (honest status)

These highs need a focused, reviewed PR. Rushing them risks exactly the pre-mainnet breakage we're avoiding, and I won't claim verification I can't stand behind:
- … `oxigraph-server` default + a request-timeout posture.

🤖 Generated with Claude Code