FROST/ROAST readiness branch#3866
Conversation
## Summary - cut over the `frost_tbtc_signer` bootstrap path to return coarse tbtc-signer signature output on successful `RunDKG -> StartSignRound -> FinalizeSignRound` - keep legacy signing fallback only for verified coarse-path failures (bridge errors, decode failures, or structural divergence) - wire `BuildTaprootTx` through the transitional native tbtc-signer orchestration path - gate `BuildTaprootTx` signing substitution on strict native-vs-legacy transaction input/output equivalence checks - add coarse success/fallback telemetry and observer-registration guards - expand unit and integration coverage for coarse cutover, retry/attempt-variation behavior, and `BuildTaprootTx` substitution safety ## Stack Context - base branch: `feat/frost-schnorr-migration-scaffold` (`#3866`) - recommended review order: 1. review `#3866` for scaffold/runtime seams 2. review this PR as the cutover + hardening delta ## Review Guide (hot paths) - coarse cutover + fallback semantics: - `pkg/frost/signing/native_ffi_primitive_transitional_frost_native.go` - `pkg/frost/signing/native_frost_engine_tbtc_signer_registration_frost_native.go` - `BuildTaprootTx` wiring and substitution gating: - `pkg/tbtc/wallet.go` - `pkg/tbtc/native_tbtc_signer_build_taproot_tx_frost_native_tbtc_signer.go` - `pkg/bitcoin/transaction_builder.go` - coverage for tx assembly/substitution and bridge safety: - `pkg/tbtc/wallet_sign_transaction_build_taproot_tx_test.go` - `pkg/bitcoin/transaction_builder_test.go` - `pkg/frost/signing/native_frost_engine_tbtc_signer_registration_frost_native_test.go` ## Scope Boundaries - in scope: bootstrap/coarse-path cutover hardening and safe `BuildTaprootTx` integration - out of scope: full production signer-runtime replacement and later migration phase gates
|
Readiness evidence update for the tBTC Schnorr FROST/ROAST migration stack, 2026-05-20. From a clean worktree at PR head go test -timeout 20m -tags 'integration frost_native frost_tbtc_signer' ./pkg/frost/... ./pkg/tbtcObserved package coverage:
This narrows the keep-core evidence gap for FROST/tBTC focused package behavior, but it is not a production-readiness substitute for full keep-core integration/testnet coverage. The following remain open blockers for the tBTC FROST/ROAST readiness gate:
The corresponding tBTC evidence docs were pushed in tlabs-xyz/tbtc#402. |
… at init (#3958) ## Summary Addresses three FFI-safety findings from an independent review of #3866: - **H3 (init-time panic)**: `RegisterNativeExecutionFFISigningPrimitiveForBuild` and `registerNativeExecutionAdapterForBuild` (frost_native) panic on registration failure. Both are invoked from `pkg/frost/signing/native_adapter_registration.go`'s package `init()`, so a transient registration failure crashes the binary at startup. Downstream code (`pkg/frost/signing/backend.go`) already returns `ErrNativeCryptographyUnavailable` when no native adapter is registered, so the legacy execution backend remains the safe-by-default path — panicking at init turned a recoverable degradation into an outage. Replace panics with structured `logger.Warnf` plus a package-level `lastRegistrationError` and `LastNativeRegistrationError()` accessor. Callers that want to fail startup on a registration error can opt in by checking that accessor after `RegisterNativeExecutionAdapterForBuild`; default callers continue booting with the legacy backend, exactly as if `frost_native` was never enabled. The existing `TestRegisterNativeExecutionFFISigningPrimitiveForBuild_ProviderErrorPanics` becomes `..._ProviderErrorIsRecordedNotPanicked` and asserts the new behavior. - **M1 (nil ptr free)**: `parseBuildTaggedTBTCSignerResult` unconditionally deferred `C.tbtc_signer_free_buffer(result.buffer.ptr, result.buffer.len)` even when the C wrapper's status-code -1 path returned `result.buffer.ptr == NULL`. The C wrapper checks the `frost_tbtc_free_buffer` symbol for NULL but does not check the buffer pointer, so a future Rust-side change that dereferenced its ptr argument without a NULL guard would crash. Skip the defer when `result.buffer.ptr == nil`. - **M6 (unbounded length)**: `unmarshalSignerMaterialFromPersistence` accepted any uvarint length within the data buffer. A corrupted state file or hostile peer carrying a multi-hundred-MiB envelope would allocate that many bytes before the existing bounds check ran. Cap the format length at 256 bytes and the payload length at 256 KiB — comfortably above any real signer material envelope — and reject earlier with a clear error. New regression tests `TestUnmarshalSignerMaterialFromPersistence_RejectsOversizedFormatLength` and `..._RejectsOversizedPayloadLength`. ## Out of scope (deferred) The remaining placeholder-fencing findings from the same review (H1: \`KeyGroupSource == \"legacy-wallet-pubkey\"\` fallback; H2: DKG placeholder participant pubkeys; H4: silent key-group substitution when source is legacy) require maintainer policy alignment on whether to gate the \`frost_tbtc_signer\` build behind an opt-in flag or refuse-by-default. Not included here. Several MED findings around Bitcoin witness preservation, FROST message channel back-pressure, and replay-error string matching also require behavior decisions and are not included in this safety-hygiene slice. ## Verification Local (GOCACHE under \`/private/tmp\`): - \`go test ./pkg/frost/...\` — PASS - \`go test -tags 'frost_native frost_tbtc_signer' ./pkg/frost/...\` — PASS - \`go test ./pkg/tbtc -run 'TestUnmarshalSignerMaterial|TestMarshalSigner|TestSignerMarshalling|TestFuzzDecodeNativeSignerMaterial'\` — PASS - \`go test -tags 'frost_native frost_tbtc_signer' ./pkg/tbtc -run 'TestConfigureFrostSigningBackend|TestNewNode_ConfiguresFrostSigningBackend|TestSigningExecutor_Sign|TestRegisterSignerMaterialResolverForBuild'\` — PASS - \`go vet ./pkg/frost/... ./pkg/tbtc\` — clean
… message hygiene (#3959) ## Summary Bundles four findings from the independent PR #3866 review that all sit in the same code seam (frost_native scaffold path + receive loops). Stacked on #3958. ### H1+H4 — scaffold key-group must be opt-in (was silently accepted) \`signer_material_resolver_build_frost_native_tbtc_signer.go\` built signer material with \`KeyGroupSource: \"legacy-wallet-pubkey\"\` (a sha256 placeholder, not a DKG output) and the FFI primitive in \`native_ffi_primitive_transitional_frost_native.go\` silently substituted the Rust signer's RunDKG key group when the source was that placeholder. Production deployments with placeholder material would have signed through whatever key group the Rust side returned without operator-facing signal. Add a refuse-by-default opt-in: \`KEEP_CORE_FROST_TBTC_SIGNER_ACCEPT_SCAFFOLD_KEY_GROUP=1\`. The new \`signing.AcceptScaffoldKeyGroupEnabled\` helper is per-call (not cached), so flipping the env unset recovers fail-closed behavior without restart. Both the resolver and the FFI primitive check the flag; both refuse with a clear error that names the env var and the placeholder source. New regression test pins the refuse-by-default path; existing scaffold-using tests opt in via \`t.Setenv\`. ### M2+M3 — Bitcoin witness restoration refuses unsupported shapes \`ReplaceUnsignedTransaction\`'s restoration path handled only single-element previous witnesses (P2WSH redeem script). Multi-element witnesses (P2TR script-path) were silently dropped. Replace with an explicit switch: 0 elements → leave empty, 1 → restore as before, ≥2 → fail loudly. Removes the tautological inner \`len(replacedInput.X) == 0\` checks that the outer refusals already guarantee. New regression test \`TestTransactionBuilder_ReplaceUnsignedTransaction_RejectsMultiElementPreviousWitness\`. ### M5 — first-write-wins on peer messages Three round-message receive loops (tbtc-signer contribution, FROST round one, FROST round two) did last-write-wins, letting a peer mutate its own contribution after first send. Switch to first-write-wins with byte-equal retransmissions idempotent and conflicting retransmissions logged via a new \`protocolLogger\` channel. Three message-equality helpers cover the three message types. ## Out of scope (deferred to separate PRs) - **H2** — DKG placeholder participant pubkeys (\`buildTaggedTBTCSignerDKGPlaceholderPublicKeyHex\`) needs either wiring real \`MembershipValidator\` pubkeys through or fencing under the same env flag. - **M4** — ROAST-compliant bounded transition evidence for the non-blocking message channel. Multi-PR effort. - **M7** — Real ROAST-aware retry replacing the byte-identical tECDSA shuffle in \`pkg/frost/retry/retry.go\`. Multi-PR effort. - **L5** — FFI status-code semantics for replay detection. Paired with a tbtc-signer follow-up. ## Verification Local (GOCACHE under \`/private/tmp\`): - \`go test ./pkg/frost/... ./pkg/bitcoin\` — PASS - \`go test -tags 'frost_native frost_tbtc_signer' ./pkg/frost/... ./pkg/bitcoin\` — PASS - \`go test -tags 'frost_native frost_tbtc_signer' ./pkg/tbtc -run 'TestConfigureFrostSigningBackend|TestNewNode_ConfiguresFrostSigningBackend|TestSigningExecutor_Sign|TestRegisterSignerMaterialResolverForBuild|TestBuildTaggedTBTCSignerRoundKeyGroup|TestBuildTaggedLegacyCompatibleNativeExecutionFFISigningPrimitive|TestTransactionBuilder_ReplaceUnsignedTransaction'\` — PASS
…3962) ## Summary Adds **RFC-21** as the design doc that scopes the M4 (transition evidence) and M7 (ROAST-aware retry) findings from the independent review of #3866 into a single layered design and a phased, PR-sized implementation plan. This PR is **doc-only**. It introduces no behaviour change. Subsequent implementation PRs reference RFC-21 in their descriptions. Stacked on #3961. ## Why one design, not two M4 and M7 share the same notion of *attempt context* and *transition evidence*: - Fixing M4 alone produces evidence that no consumer reads. - Fixing M7 alone gives the consumer nothing to drive retry decisions on. The RFC treats them as one design split into linear phases. ## Phasing - **Phase 0** -- this RFC. - **Phase 1** -- `AttemptContext` type + canonical hash; protocol messages carry attempt-context binding (optional during migration). - **Phase 2** -- receiver overflow tracking (M4 layer A) plumbed through the three `select { default }` drop sites, default no-op. - **Phase 3** -- coordinator state machine: `BeginAttempt`, `RecordEvidence`, `NextAttempt`. Deterministic `(AttemptContext, TransitionEvidence) -> AttemptContext` map. - **Phase 4** -- wire receiver to coordinator behind `frost_roast_retry` build tag. - **Phase 5** -- retry adapter + `EvaluateRoastRetryForSigning`; migrate first call site behind the build tag with readiness-gate guard. - **Phase 6** -- migrate remaining call sites; delete the byte-identical-to-tECDSA shuffle once unused. - **Phase 7** -- flip the readiness manifest to `present` once Phase 6 ships and integration tests run against a real testnet (only then; no early flip). ## Open questions called out explicitly The RFC lists four open design questions that need cross-team review before Phase 3 lands: 1. Cross-process coordinator agreement -- gossip topic choice. 2. Persistence across signer restart. 3. FFI surface (Rust signer error-code style; follows the L5 pattern from #425 / #3961). 4. Backward-compat horizon for the `AttemptContextHash` field. ## Out of scope - DKG retry (separate RFC). - Bitcoin transaction-builder changes. - Operator UX changes (CLI, dashboards) -- land alongside Phase 5/6. - Cross-domain ROAST between keep-core and tbtc-signer. ## Test plan - [ ] Reviewer reads RFC end-to-end. - [ ] Reviewer flags any phase that should be split further or reordered before Phase 1 begins. - [ ] Reviewer answers the four open questions or marks them defer-to-Phase-3. No code change in this PR, so no CI test run is meaningful beyond asciidoc rendering.
…ild tag (#3965) ## Summary Forward-fix for #3866 CI: the Phase 1B binding file and test referenced message types defined in \`//go:build frost_native\` files but were themselves untagged. Untagged staticcheck on the integration branch (#3866) then reported \`undefined: nativeFROSTRoundOneCommitmentMessage\` and the client-lint job failed. Adds \`//go:build frost_native\` to: - \`pkg/frost/signing/attempt_context_binding.go\` - \`pkg/frost/signing/attempt_context_binding_test.go\` The helpers and tests are only exercised by gated code paths (the three message-type methods all live behind \`frost_native\`), so the build tag is the right locus. ## Why now PRs #3963 (Phase 1A) and #3964 (Phase 1B) were merged into the \`feat/frost-schnorr-migration-scaffold\` branch before #3866's integration CI ran. Once the merges landed, #3866's \`client-lint\` job rebuilt under the untagged staticcheck pass and exposed the missing tag. This PR is the smallest possible fix. ## Verification Locally with module-pinned staticcheck 2025.1.1: \`\`\` go build ./... go build -tags 'frost_native frost_tbtc_signer' ./pkg/frost/... go test -tags 'frost_native frost_tbtc_signer' ./pkg/frost/signing/ staticcheck -checks \"-SA1019\" ./... # whole repo, silent staticcheck -checks \"-SA1019\" ./pkg/frost/signing # silent \`\`\` ## Test plan - [ ] CI green: client-lint, client-vet, client-scan, client-build-test-publish all pass. - [ ] #3866 lint job recovers once this merges into \`feat/frost-schnorr-migration-scaffold\`.
…3988) ## Summary Closes the **M4 gap** from the original PR #3866 review by adding the two evidence categories the RFC-21 Phase-2 work left as future work: **validation-rejection evidence** and **first-write-wins-conflict evidence**. With this PR, the \`NextAttempt\` policy can permanently exclude misbehaving peers on all four ROAST blame channels -- transport-overflow, validation-reject, equivocation-conflict, and silence -- instead of just overflow + silence. ## Why this matters A peer that only sends **malformed messages** (validation rejects, never overflows the channel) was previously indistinguishable from a silent peer. The transient silence-parking policy would bench-and-reinstate them indefinitely, never permanently excluding the malicious behaviour. Same for a peer **equivocating mid-attempt**: the existing first-write-wins assembly correctly dropped the conflicting retransmission but only logged the event -- the bundle carried no structured evidence the coordinator's policy could act on. ## What lands ### Recorder API | Surface | Notes | |---|---| | \`RecordReject(sender, reason)\` | reason captured verbatim; per-reason quota counter | | \`RecordConflict(sender)\` | saturates at conflict quota | | \`RejectQuotaDefault = 8\`, \`ConflictQuotaDefault = 4\` | matches RFC-21 Layer A categoryQuota | | Per-reason quotas independent | peer cannot saturate one reason to mask another | ### Wire types | Type | Sort order | Cap | |---|---|---| | \`RejectEntry{Sender, Reason, Count}\` | asc by Sender, then asc by Reason | per-attempt evidence size bounded by Σ quotas | | \`ConflictEntry{Sender, Count}\` | asc by Sender | per-attempt evidence size bounded by Σ quotas | Both fields use \`omitempty\` so pre-PR snapshots round-trip without the new fields. \`Validate()\` enforces sorted-ascending invariants. ### NextAttempt policy | Threshold | Value | Source | |---|---|---| | \`RejectExclusionThreshold\` | 1 | RFC-21 Layer B ("any non-transport reject is sufficient cause") | | \`ConflictExclusionThreshold\` | 1 | A single conflict is byzantine evidence | \`computeNextAttempt\` merges \`overflowBlamed\`, \`rejectBlamed\`, \`conflictBlamed\` into the permanent ExcludedSet. The \`blamedSenders\` helper is factored out so all three categories share the deterministic sort + threshold-comparison logic. ### Receive-loop wiring Three reject sites and three conflict sites updated across the two files that house the three FROST/tbtc-signer receive loops: | Site | Was | Now | |---|---|---| | \`shouldAcceptNativeFROSTMessage\` returns false | silent drop | \`evidence.RecordReject(senderID, "validation_gate_rejected")\` + drop | | First-write-wins conflict in assembly loop | warn log only | \`evidence.RecordConflict(senderID)\` + warn log | ## Test coverage (15 new cases) - 7 recorder tests: accumulation, per-reason quota saturation, per-reason independence, conflict saturation, all-categories-present, NoOp-inert, RFC-constant assertions - 5 policy tests: single reject excludes, single conflict excludes, reject+conflict on different senders, empty evidence (sanity), threshold-constant assertions - Receive-loop wiring is covered indirectly by the recorder unit tests; the NoOp default keeps pre-RFC-21 receive semantics observably unchanged so no integration-level test is required. ## Verification | Command | Result | |---|---| | \`go build ./...\` + \`go build -tags 'frost_native frost_tbtc_signer frost_roast_retry' ./...\` | both clean | | \`go test ./pkg/frost/...\` + race | pass | | \`go test -tags 'frost_native frost_tbtc_signer frost_roast_retry' ./pkg/frost/...\` | pass (5 packages) | | \`staticcheck -checks '-SA1019' ./pkg/frost/...\` | silent | | \`go vet ./pkg/frost/...\` + \`gofmt -l ./pkg/frost/\` | clean | ## RFC-21 status With this PR, all four ROAST evidence categories are operational. M4 from the original PR #3866 review is **fully closed**. The keep-core code arc for RFC-21 is now feature-complete; remaining work is operations-side (integration testnet, manifest flip). ## Test plan - [ ] CI green. - [ ] Reviewer confirms the per-reason quota independence is the right semantics (alternative: single per-sender reject counter). - [ ] Reviewer confirms threshold = 1 for both reject and conflict (alternative: higher to absorb noise; trade-off is faster vs slower exclusion of misbehaving peers).
#3993) ## Why The RFC-21 Phase 6 review decided which orchestration errors are fallback-eligible (static config errors → safe to fall back to legacy retry path) and which must hard-fail (runtime per-attempt errors → no fallback, since per-participant divergence creates split-brain group fracture). The rationale lived in commit messages, the RFC text, and inline comments on individual sentinels — distributed enough that a future maintainer reading just \`roast_retry_orchestration.go\` could miss the load-bearing constraint. This PR adds a top-of-file design-rationale block that centralises the decision in the place that enforces it. ## What changed - One file changed: \`pkg/frost/signing/roast_retry_orchestration.go\` - Pure documentation: no behavior change, no test changes, no API change - 49 lines added (one comment block) ## What it captures 1. **STATIC vs RUNTIME classification** — explicit definitions, with the sentinel (\`ErrNoRoastRetryCoordinatorRegistered\`) and detection mechanism (\`errors.Is\` in \`signing_loop_roast_dispatcher.go\`) named. 2. **Why static-error fallback is safe** — every honest signer observes the same node-local config at startup, so the fallback decision is deterministic across the group. 3. **Why runtime-error fallback is unsafe** — per-attempt protocol state errors can be observed by some participants and not others within the same attempt; fallback would put some operators on new code and others on legacy for the same attempt. 4. **Enforcement rule** — any error surfaced from this package that is intended to permit fallback MUST be the sentinel; wrapping ANY runtime error in the sentinel is a safety regression that PR reviewers should reject. 5. **Historical redirect** — the earlier design had \`BeginAttempt\` failures fall back, on the assumption that BeginAttempt was cheap idempotent setup. Review identified that BeginAttempt mutates per-attempt state and can fail from races with concurrent receives; the taxonomy was tightened so only true configuration errors are fallback-eligible. ## Lineage Surfaced in the cross-PR review re-evaluation following PR #3866 follow-up landings. Originally tracked as "Document static-vs-runtime classification canonically" — initially flagged as "available if you want," now elevated because the rationale was the most important architectural decision in the RFC-21 stack and is currently the easiest piece of design context to lose. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Summary - add FROST WalletRegistry and FrostDkgValidator bindings plus config and chain attachment - implement v4 FROST DKG result digest assembly with full vs active member types and fixture-backed parity tests - add the native FROST DKG engine boundary, P2P round protocol, result signing, coordinator lifecycle, challenge monitoring, and wallet ID handling for x-only output keys ## Notes - Stacked on #3866 / `feat/frost-schnorr-migration-scaffold`. - Runtime DKG still requires the concrete native DKG engine registration from the frost-uniffi-sdk UDL/Rust export work. - The digest fixture now records the tBTC TypeScript generator source and regeneration command. A paired tBTC PR should still commit the mirror fixture at `docs/test-vectors/frost-dkg-result-digest-v1.json` and add the TS-side emitter/test; until then, the keep-core test verifies the pinned bytes and metadata but does not compare against a checked-in tBTC mirror file. ## Validation - `go test ./pkg/frost/registry ./pkg/chain/ethereum ./pkg/chain/ethereum/frost/gen/...` - `go test ./pkg/tbtc -run "TestFrostDKGSignatureThreshold|TestBoundedFrostDKGRecoveryStartBlock|TestFrostDKGRecoveryLookBackBlocks" -count=1` - `go test -tags "frost_native frost_tbtc_signer" ./pkg/tbtc -run "TestLowestLocalActiveMemberIndex|TestFrostMisbehavedMemberIndices|TestFrostDKGSignatureThreshold|TestBoundedFrostDKGRecoveryStartBlock|TestFrostDKGRecoveryLookBackBlocks" -count=1` - CI `client-build-test-publish` passes on the prior pushed commit; rerunning for the latest follow-up commit after push. ## Local Note - Full local `go test ./pkg/tbtc` currently fails in standalone `TestWatchCoordinationWindows`; this reproduces when run by itself and appears unrelated to the FROST DKG coordinator changes.
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
## Summary Stacked on #3866. This PR implements Taproot-native key-path wallet signing for the FROST migration path. It adds P2TR script handling, BIP341 SIGHASH_DEFAULT computation, BIP340 Schnorr signature verification, and single-element Taproot witness application in the Bitcoin transaction builder. The wallet transaction executor now routes all-P2TR transactions through the Schnorr/Taproot witness path. Mixed Taproot plus legacy inputs are rejected before signing, so this does not introduce a dual-signing model. ## Details - Add P2TR script helpers and x-only output key extraction. - Add Taproot key-path sighash generation without a repo-wide btcd upgrade. - Add `AddTaprootKeyPathSignatures` for 64-byte BIP340 signatures. - Preserve canonical 32-byte FROST signing messages when `big.Int` strips leading zero bytes. - Add builder and wallet tests covering all-P2TR signing and mixed-input rejection. ## Validation - `go test ./pkg/bitcoin ./pkg/tbtc` - `go test -tags=frost_native ./pkg/frost/signing`
## Summary - route tBTC operator recognition, authorization, and sortition-pool checks through the FROST wallet registry when FROST authorization is configured - use the FROST sortition pool for pool/reward/chaosnet status in FROST mode - update generated FROST bindings and DKG result conversion types for the current contract ABI ## Why The local FROST testnet run showed that allowlisted operators could be provisioned on-chain, but keep-core still needed to use the FROST authorization source and sortition pool instead of the legacy ECDSA/staking path for operator lifecycle checks. ## Validation - `go test ./pkg/chain/ethereum ./pkg/tbtc`
## Summary - register the build-tagged tbtc-signer bridge as the native FROST DKG engine and the full native FROST signing engine - add Go request/response adapters for the new interactive Rust ABI from #4011 - keep old/missing signer builds fail-closed through `ErrNativeCryptographyUnavailable` - disable legacy ECDSA DKG execution when the configured pre-parameters pool size is zero, so FROST-only local/testnet runs do not require an unused ECDSA preparams pipeline - reject Schnorr/FROST signing executors for non-Taproot transaction inputs before threshold signing begins - add registration coverage plus an optional linked-dylib smoke test that runs DKG, signing, aggregation, and BIP340 verification when the Rust symbols are present ## Stack - base: #4010 - requires Rust signer ABI: #4011 ## Testnet rehearsal note The local FROST testnet reached threshold signing for a real revealed Bitcoin testnet deposit and produced a native aggregate signature. The action then failed while applying the signature because the deposit sweep input was a legacy P2WSH deposit output requiring ECDSA semantics, not a Taproot key-path wallet input. This PR now fails closed earlier for that class of mismatch: a Schnorr/FROST signing executor will not sign non-P2TR transaction inputs. This does not make legacy deposits sweepable by FROST; it prevents wasting a threshold signing round and producing a misleading invalid-signature failure. ## Validation - `go test -tags 'frost_native frost_tbtc_signer' ./pkg/frost/signing` - `CGO_LDFLAGS='-L/private/tmp/keep-core-pr4005-worktree/pkg/tbtc/signer/target/debug -lfrost_tbtc' DYLD_LIBRARY_PATH=/private/tmp/keep-core-pr4005-worktree/pkg/tbtc/signer/target/debug go test -tags 'frost_native frost_tbtc_signer' ./pkg/frost/signing -run TestBuildTaggedTBTCSignerInteractiveFROSTBridge_WithLinkedSigner -count=1 -v` - `KEEP_CORE_FROST_TBTC_SIGNER_ACCEPT_SCAFFOLD_KEY_GROUP=true go test -tags 'frost_native frost_tbtc_signer' ./pkg/tbtc` - `go test ./pkg/tbtc -run 'TestWalletTransactionExecutor_SignTransaction_RejectsSchnorrForLegacyInputsBeforeSigning|TestWalletTransactionExecutor_SignTransaction_RejectsMixedTaprootAndLegacyInputsBeforeSigning|TestWalletTransactionExecutor_SignTransaction_AddsTaprootKeyPathWitness|TestDkgExecutor_DisablesECDSAPreParamsWhenPoolSizeZero'` - `go test -tags frost_native ./pkg/tbtc -run 'TestWalletTransactionExecutor_SignTransaction_RejectsSchnorrForLegacyInputsBeforeSigning|TestWalletTransactionExecutor_SignTransaction_RejectsMixedTaprootAndLegacyInputsBeforeSigning|TestWalletTransactionExecutor_SignTransaction_AddsTaprootKeyPathWitness|TestDkgExecutor_DisablesECDSAPreParamsWhenPoolSizeZero|TestCalculateWalletIDForSigner_FrostUniFFIV2UsesXOnlyOutputKey'` Note: without `KEEP_CORE_FROST_TBTC_SIGNER_ACCEPT_SCAFFOLD_KEY_GROUP=true`, the broad `pkg/tbtc` run fails on existing scaffold-era legacy private-key-share fixtures, not on this bridge registration change.
Three defensive fixes at the Go<->Rust signer FFI boundary in
pkg/frost/signing, so a malformed/oversized response or a panic from the native
signer fails one attempt instead of crashing the node or leaking secrets:
- parseBuildTaggedTBTCSignerResult: bounds-check the response buffer length
before the size_t -> C.int narrowing in C.GoBytes. A length >= 2^31 would
overflow to a negative value and panic ("length out of range") at the cgo
boundary, or silently truncate to a wrong length; reject it (the buffer is
still released by the deferred free).
- the request-call helper: scrub the secret request bytes from the C heap (the
C.CBytes copy) before C.free. The request can carry signing-share / nonce
material, and plain C.free does not overwrite; this mirrors the existing
Go-side zeroBytes hygiene applied to the caller's own copy.
- nativeExecutionFFIExecutorAdapter.Execute: recover panics raised along the
cgo signing path and surface them as a failed attempt, so a single malformed
native-signer response cannot take down the signing process. Adds a unit test
with a panicking primitive (verified to crash the process without the
recover).
The cgo paths are compile-verified under -tags 'frost_native frost_tbtc_signer
cgo'; the recover is exercised by an untagged unit test. The two cgo-tagged
changes still need a runtime pass in the cgo + linked-libfrost_tbtc
environment.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…usions only
NextAttempt's infeasibility check used the post-parking IncludedSet (which
subtracts transiently-parked and silenced members) to decide PERMANENT session
failure (ErrAttemptInfeasible). Silence-parking has no accuser-quorum gate, so a
single transient mass-silence event -- or one byzantine member that is the
elected coordinator for one attempt and omits snapshots -- could drop the
IncludedSet below threshold and permanently kill a signing session, even though
the original signer set could still complete it. This contradicts the file's own
step-4 contract ("silence parking is strictly transient; a falsely-silenced
honest peer recovers without intervention") and ErrAttemptInfeasible's own doc
("the session can no longer make progress with the original signer set"), and it
routes a permanent failure around the very accuser-quorum defense that exists to
stop a byzantine minority from grinding the group to ErrAttemptInfeasible.
Evaluate feasibility against the permanently-available set (original \\
ExcludedSet): only permanent exclusions (established reject/conflict/
equivocation) can render a session infeasible. The next attempt's IncludedSet is
still feasible \\ parkSet (it may fall below threshold for one attempt; the
parked members are reinstated next attempt). When parking would empty the
IncludedSet, reinstate the parked members rather than producing an
unconstructable empty attempt. Pathological grinding under sustained malicious
silence stays bounded by the outer tBTC signingAttemptsLimit.
The two existing tests asserted the buggy behavior (silence below threshold ->
permanent fail); retarget them to genuine permanent-exclusion infeasibility and
add transient-silence-recovers coverage (single-step, full harness, and a
two-attempt park-then-reinstate cycle).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…proot signTransaction unconditionally ran the native (Rust cgo) BuildTaprootTx parity/substitution path at the top of the function -- for EVERY transaction, including legacy ECDSA redemptions/sweeps/moving-funds. The native builder is a Taproot builder, so running it for a legacy (non-Taproot) transaction is meaningless, and a hard error from it (other than ErrNativeCryptographyUnavailable) would fail the signing of that legacy transaction. In the default build the native builder is a no-op, so this only bit the frost_native+cgo build, but it is a latent regression on the legacy signing path. Gate the native-build/substitution path on unsignedTx.HasOnlyTaprootKeyPathInputs() -- the same predicate that already governs FROST signing later in the function -- so the native Taproot builder runs only for all-Taproot transactions. Extract the path into maybeSubstituteNativeBuildTaprootTx. The substitution LOGIC (observational logging, divergence rejection, matching-IO acceptance) remains covered directly by the TestEvaluateNativeUnsignedTransactionForSigning_* tests. Replace the four legacy-P2PKH substitution-through-signTransaction integration tests (which asserted the now-removed behaviour) with two gate tests -- SkipsNativeBuildForLegacyTransaction and InvokesNativeBuildForTaprootTransaction -- and retarget the two native-build error-propagation tests to a Taproot transaction (the only path on which the native build now runs). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…package pkg/frost/retry/retry.go was a byte-for-byte copy of pkg/tecdsa/retry/retry.go -- two hand-synchronized copies of the security-critical retry participant-selection algorithm (EvaluateRetryParticipantsForSigning / ...ForKeyGeneration). A fix applied to one and not the other would silently make ECDSA and FROST select different qualified-operator sets for the same seed. This selection is structurally shared across schemes: it is the scheme-agnostic "pick the base signing group from the ready members" used for the initial attempt of every signing (and for DKG). FROST's per-attempt robustness diverges only on retries, via ROAST (NextAttempt); the initial selection does not -- ROAST is a transition-from-previous function and cannot produce attempt 0, so the initial selection is permanently shared. Move the algorithm (byte-for-byte, verified identical) into a neutral package, pkg/protocol/retry, and point both callers (pkg/tbtc/dkg_loop.go and signing_loop_legacy_selector.go -- the only two) at it. Delete pkg/tecdsa/retry and pkg/frost/retry. Keep the superset test (the tecdsa copy had an extra triplet-seat-count case the frost copy lacked). No behavior change; a single source eliminates the drift hazard. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…d-set acceptance When a seat enters lost-sync from a transition bundle for an attempt it never observed, the triggering bundle is operator-authenticated -- log the sending seat and the claimed attempt-context hash once per lost-sync episode, so the operational runbook can attribute and remove/slash a member spamming bogus-attempt bundles. (This is an authenticated-insider liveness halt the blame bridge does NOT close, because a never-observed attempt yields no evidence to attribute it.) markLostSync now uses CompareAndSwap and returns whether it transitioned, so the attribution is logged exactly once even while the listener keeps receiving such bundles. No change to the fail-closed semantics. Also document that this residual is accepted under the PERMISSIONED operator set (attributable + liveness-only + governance-removable) and MUST be revisited before any move to a permissionless set, where the f+1 snapshot-corroboration / resync fix would be warranted instead of accepting it. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
During the ECDSA->FROST migration an operator is a member of BOTH sortition pools at once: existing ECDSA wallets keep draining via redemptions while FROST is live. The seven sortition.Chain methods on TbtcChain switch on hasFrostAuthorization() (true whenever FROST is configured), and a single MonitorPool loop consumed them -- so the loop labeled "legacy ECDSA sortition pool monitoring" actually maintained the FROST pool during overlap, and post-cutover (DisableLegacyECDSA=true) the only loop stopped, leaving the FROST pool -- the one new FROST wallet DKG selects from -- unmonitored. Bind monitoring explicitly per pool. Two sortition.Chain views (ecdsaSortitionChain, frostSortitionChain) route directly to their own registry/pool with no hasFrostAuthorization() switch, and the node runs one MonitorPool loop per pool: - ECDSA loop: existing flags + policy, now correctly ECDSA-bound. - FROST loop: new DisableFrostSortitionPoolMonitoring flag (default-on), beta policy only (the ECDSA pre-params gate does not apply to FROST DKG), gated on FROST being configured and INDEPENDENT of DisableLegacyECDSA so the FROST pool stays monitored during the drain and after the legacy pool is retired. The operator is not necessarily registered in both pools, so the FROST loop treats sortition.ErrOperatorUnknown (now exported) as non-fatal: it warns and leaves FROST monitoring inactive rather than aborting node startup. The legacy loop keeps its existing fail-fast (the operator is ECDSA-registered during the drain). TbtcChain's own sortition.Chain methods are left unchanged, so heartbeat and other callers are unaffected; GetOperatorID stays ECDSA-bound. Both loops' join/update txs already share tc.transactionMutex + tc.nonceManager, so they cannot race on the operator account nonce. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…uted DKG (#4110) Stacks on the FROST/ROAST readiness branch (#3866). Adds two separate-process ("shape B") real-crypto e2es over **real libp2p**, complementing the in-process shape-A multinode test. Each re-execs the test binary as N worker processes that each link `libfrost_tbtc`. ## What's added - **shape-B** (`roast_shapeb_libp2p_multiproc_e2e_…`): dealer DKG run once in a bootstrap subprocess, the encrypted key group copied into each worker's own state dir; every worker drives the **ROAST interactive-signing runner** over real libp2p and independently aggregates the same BIP-340 signature (n winners, vs the shared-engine shape-A's one). Covers per-node engine/state isolation + the libp2p outer framing for the runner + ROAST + transport seam. - **distributed-DKG** (`roast_distributed_dkg_libp2p_multiproc_e2e_…`): every worker runs the **real distributed FROST DKG** (`part1/2/3`) over libp2p, with round-2 per-recipient secret shares **sealed via secp256k1 ECDH + AES-256-GCM** (cleartext round-2 over a broadcast bus would let any node sum `f_i(j)` and reconstruct a peer's share), so each node holds **only its own key package**; then threshold-signs via the stateless low-level path. Closes the dealer-DKG key-custody gap end to end. ## Verification Both green standalone (stable across repeated runs) and under the full cgo gate, linking `libfrost_tbtc` built from the pinned signer ref (`ci/frost-signer-pin.env` = `6e3718ba0`, unchanged): ``` CGO_ENABLED=1 KEEP_CORE_FROST_REQUIRE_CGO=true \ go test -tags "frost_native frost_tbtc_signer" -run TestRealCgoInteractiveSigning ./pkg/frost/signing/ ``` All 6 real-crypto tests pass together on this branch (4 shape-A + shape-B + distributed-DKG). ## CI The `frost-cgo-integration` gate already exercises both new tests — its `-run 'TestRealCgoInteractiveSigning'` matches them by prefix, and the pinned crate already exports the `dkg_part1/2/3` + low-level sign FFI they use. No workflow change needed. (Heads-up: these are multi-process + gossipsub tests; robust locally via retransmission + warmup, but mesh-convergence time varies — if shape-B ever flakes on a slow runner, move it to a non-blocking job.) ## Note Building the distributed-DKG test confirmed `dkg_part1/2/3` interoperate over the transport — but the node's DKG path (`executeTBTCSignerFROSTDKG`) still uses the dealer `RunDKGWithSeed`. Wiring distributed DKG into the node remains the readiness-gate-blocking work. 🤖 Generated with [Claude Code](https://claude.com/claude-code)
…sient parking After the transient-silence feasibility fix, NextAttempt's infeasibility test is on the non-excluded (feasible) set, not the post-parking IncludedSet: a next IncludedSet can fall below threshold due to transient parking without returning ErrAttemptInfeasible (parked members reinstate next attempt). Update the ErrAttemptInfeasible doc + the NextAttempt step-6 contract (and the now-misleading 'included set below threshold' error string) to match. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…lags #4120 added Config.Tbtc.DisableFrostSortitionPoolMonitoring (default-on FROST monitoring) but initTbtcFlags did not bind a CLI flag for it, unlike the legacy opt-out -- so operators starting the node via CLI flags could not exercise the advertised opt-out. Bind --tbtc.disableFrostSortitionPoolMonitoring and cover it in the flags test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ror type
GetWallet derived a legacy wallet ID on ANY error from the canonical walletID
accessor. For a FROST wallet on a canonical Bridge, a transient call failure
would silently yield the left-padded legacy ID, and callers use
WalletChainData.WalletID to choose P2TR (FROST) vs P2WPKH (legacy) scripts, so it
would build or search the wrong wallet script.
The error type cannot reliably tell a legacy on-chain Bridge -- where the walletID
eth_call returns a normal RPC/ABI error even with the current binding -- from a
transient failure, so distinguishing by error is fragile and breaks legacy
deployments (Codex P1 on the first revision of this PR). Route the fallback by
SCHEME instead, using the wallet's ECDSA wallet ID (zero => FROST, non-zero =>
legacy ECDSA), which GetWallet already has:
- Legacy ECDSA wallet: its canonical wallet ID equals the legacy derivation, so
fall back on any accessor error (and it is the only option on a legacy Bridge
lacking the accessor).
- FROST wallet: requires the canonical ID; surface the error rather than return
a wrong legacy ID. A FROST wallet only exists on a canonical-ID Bridge, so the
error is genuinely transient.
Extracted into resolveWalletID with TestResolveWalletID covering all four cases.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…fi-boundary-hardening
…buildtx-gate-taproot
…etry-selection-shared-package
…ostsync-attribution-logging
## Summary Follow-up to #3866 (Go FROST/ROAST coordinator). Three defensive fixes at the cgo Go↔Rust signer FFI boundary in `pkg/frost/signing`, so a malformed/oversized response — or a panic — from the native signer fails a single attempt instead of crashing the node or leaking secrets. ## Fixes 1. **Bounds-check the response length before `C.GoBytes`** (`parseBuildTaggedTBTCSignerResult`). The Rust-supplied `buffer.len` (`C.size_t`) was narrowed to `C.int` with no check: a length `≥ 2³¹` overflows to a negative value → `C.GoBytes` panics (`length out of range`) at the cgo boundary, or silently truncates to a wrong length. Now rejected with a clear error (the buffer is still freed by the deferred free). 2. **Zeroize the secret request bytes on the C heap before `C.free`** (the request-call helper). `C.CBytes(requestPayload)` copies the request (which can carry signing-share / nonce material) to the C heap; plain `C.free` does not overwrite, so the secret lingered in freed memory. Now scrubbed via the existing `zeroBytes`, mirroring the Go-side hygiene already applied to the caller's own copy. 3. **`recover()` at the FFI boundary** (`nativeExecutionFFIExecutorAdapter.Execute`). A panic anywhere along the cgo signing path (e.g. fix #1's overflow panic, or a nil-deref decoding a malformed engine response) previously took down the whole signing process. It's now converted to a failed attempt the outer tBTC `signingRetryLoop` handles cleanly. ## Tests / verification - `TestNativeExecutionFFIExecutorAdapter_Execute_RecoversCgoBoundaryPanic` — a panicking primitive; verified to **crash the process without the recover** and pass with it. Full untagged `pkg/frost/signing` suite stays green. - The cgo-tagged changes (#1, #2) are **compile-verified** under `-tags 'frost_native frost_tbtc_signer cgo'` (build + `go vet`). They still need a **runtime pass in the cgo + linked-`libfrost_tbtc` environment** — I can't exercise the real FFI here. Addresses the cgo-boundary cluster from the review of #3866 (length-narrowing crash, secret-in-C-heap, missing recover). _Found during review of #3866._ 🤖 Generated with [Claude Code](https://claude.com/claude-code)
…usions only (#4116) ## Summary Follow-up to #3866 (Go FROST/ROAST coordinator). Fixes a robustness bug in the ROAST `NextAttempt` policy where a **transient** event could **permanently** kill a signing session. `computeNextAttempt`'s infeasibility check used the post-parking `IncludedSet` — which subtracts transiently-parked *and silenced* members — to decide permanent session failure (`ErrAttemptInfeasible`). Silence-parking deliberately has **no** accuser-quorum gate (it's meant to be strictly transient), so a single transient mass-silence event — or one byzantine member that is the elected coordinator for one attempt and omits snapshots — could drop the `IncludedSet` below threshold and permanently fail the session, even though the original signer set could still complete it. This contradicts the file's own design contract and hands a *single* byzantine member (elected coordinator for one attempt) the power to permanently kill a session — exactly the grind-to-`ErrAttemptInfeasible` outcome the accuser-quorum machinery exists to prevent, routed around via the ungated silence path: - **step 4:** "Silence parking (strictly transient)… the attempt after that automatically reinstates them, so a falsely-silenced honest peer recovers without intervention." - **`ErrAttemptInfeasible` doc:** returned when "the session can no longer make progress **with the original signer set**." ## Fix Evaluate feasibility against the **permanently-available set** (`original \ ExcludedSet`): only permanent exclusions (established reject / conflict / coordinator-equivocation) can render a session infeasible. The next attempt's `IncludedSet` is still `feasible \ parkSet` — it may fall below threshold for one attempt, but the parked members are reinstated next attempt (burning one attempt instead of failing the session). When parking would *empty* the `IncludedSet`, the parked members are reinstated now rather than producing an unconstructable empty attempt. Pathological grinding under *sustained* malicious silence stays bounded by the outer tBTC `signingAttemptsLimit` — the inner chain no longer needs a permanent-fail to stop grinding. ## Tests The two existing tests asserted the buggy behavior (silence below threshold → permanent fail, using a degenerate n-of-n config). Retargeted to genuine infeasibility and added recovery coverage: - `TestNextAttempt_InfeasibilityWhenPermanentExclusionsBelowThreshold` — permanent exclusion below threshold still fails (correctly). - `TestSoak_TransientSilenceBelowThresholdRecovers` — full multi-coordinator harness: silence does **not** fail; silenced members are parked, not excluded. - `TestNextAttempt_TransientSilenceBelowThresholdDoesNotPermanentlyFail` — unit: silence parks (transient), never excludes (permanent). - `TestNextAttempt_TransientSilenceRecoversAcrossTwoAttempts` — end-to-end: silence → park → reinstate, included set returns to full. The "original signer set preserved" invariant (`|Inc|+|Exc|+|Park|` = original size) holds in both branches. `gofmt`, `go vet`, and the full `pkg/frost/roast` suite pass; builds clean untagged and under `-tags 'frost_native frost_tbtc_signer cgo frost_roast_retry'`. _Found during review of #3866._ 🤖 Generated with [Claude Code](https://claude.com/claude-code)
…proot (#4117) ## Summary Follow-up to #3866 (Go FROST/ROAST coordinator). Gates the native (Rust cgo) `BuildTaprootTx` parity/substitution path so it runs only for Taproot transactions. `walletTransactionExecutor.signTransaction` ran the native `BuildTaprootTx` path **unconditionally**, at the very top, for **every** transaction a wallet signs — including legacy ECDSA redemptions / sweeps / moving-funds — before any scheme/Taproot check. The native builder is a Taproot builder, so running it for a legacy (non-Taproot) transaction is meaningless, and a hard error from it (other than `ErrNativeCryptographyUnavailable`) would fail the signing of that legacy transaction. In the default build the native builder is a no-op (`("", nil)`), so this only bit the `frost_native+cgo` build — but it's a latent regression on the legacy signing path. ## Fix Gate the native-build/substitution path on `unsignedTx.HasOnlyTaprootKeyPathInputs()` — the **same predicate that already governs FROST signing** later in the same function (`"cannot apply FROST signatures to non-taproot transaction inputs"`). The native Taproot builder now runs only for all-Taproot transactions; legacy transactions skip it. The path is extracted into `maybeSubstituteNativeBuildTaprootTx`. > Gate signal chosen per a Codex second opinion: the tx shape (`HasOnlyTaprootKeyPathInputs`), not the wallet scheme — "the native builder's applicability is about the unsigned tx it is asked to build." ## Tests The four legacy-P2PKH substitution-through-`signTransaction` integration tests asserted the now-removed behaviour (substitution for a legacy tx). They are replaced with: - `SkipsNativeBuildForLegacyTransaction` — a legacy (P2PKH) tx: the native build is **not** invoked even with substitution enabled, and the tx signs via the Go path. (Verified to fail if the gate is removed.) - `SubstitutesNativeBuildForTaprootTransaction` — an all-Taproot tx: the native build **is** invoked, and a matching native tx is substituted (`ReplaceUnsignedTransaction` + the substitution info log) and signed with a Taproot witness. The two native-build error-propagation tests are retargeted to a Taproot tx (the only path on which the native build now runs), via a shared `buildTaprootKeyPathUnsignedTxForTest` helper. The substitution **logic** remains covered directly by the `TestEvaluateNativeUnsignedTransactionForSigning_*` tests. gofmt + `go vet` clean; full untagged `pkg/tbtc` suite passes; builds clean under `-tags 'frost_native frost_tbtc_signer cgo'`. _Found during review of #3866._ 🤖 Generated with [Claude Code](https://claude.com/claude-code)
…package (#4118) ## Summary Follow-up to #3866. Eliminates a byte-for-byte duplicate of the retry participant-selection algorithm by moving it into a shared, scheme-neutral package. `pkg/frost/retry/retry.go` was a **byte-for-byte copy** of `pkg/tecdsa/retry/retry.go` — two hand-synchronized copies of the security-critical retry participant-selection algorithm (`EvaluateRetryParticipantsForSigning` / `…ForKeyGeneration`). A fix applied to one and not the other would silently make ECDSA and FROST select **different** qualified-operator sets for the same seed. ## Why a shared package (vs. one importing the other) This selection is **structurally shared** across schemes: it's the scheme-agnostic "pick the base signing group from the ready members" used for the **initial attempt of every signing** (and for DKG). FROST's per-attempt robustness diverges only on **retries**, via ROAST (`NextAttempt`) — the initial selection does not. ROAST is a transition-from-previous-attempt function and *cannot* produce attempt 0, so `roastSigningParticipantSelector` falls back to this selection for the initial attempt (`ConsumeRoastTransitionForSelection` returns `ErrRoastSelectionFallBackToLegacy` at `roastAttemptNumber == 0`). So the initial selection is **permanently shared** — neither `tecdsa` nor `frost` should own it. ## Change - Move the algorithm **byte-for-byte** (verified `diff`-identical, no logic change) into a new neutral package `pkg/protocol/retry` (package `retry`, alongside `pkg/protocol/group`). - Repoint the **only two** callers — `pkg/tbtc/dkg_loop.go` (DKG) and `pkg/tbtc/signing_loop_legacy_selector.go` (FROST initial selection) — at it (import-path change only; usage `retry.X` unchanged). - Delete `pkg/tecdsa/retry` and `pkg/frost/retry`. - Keep the **superset** test (the `tecdsa` copy had a `TestExcludeOperatorTripletsCountsRightOperatorSeats` case the `frost` copy lacked). - Update three doc comments that referenced the old path. ## Verification - `diff` confirms the moved algorithm is byte-identical to the original (pure move). - `go build ./...` clean (no dangling references to the deleted packages); builds clean under `-tags 'frost_native frost_tbtc_signer cgo frost_roast_retry'`. - gofmt + `go vet` clean; the moved `pkg/protocol/retry` test passes (incl. the superset case); the full `pkg/tbtc` suite passes. No behavior change — a single source eliminates the silent-drift hazard. _Found during review of #3866._ 🤖 Generated with [Claude Code](https://claude.com/claude-code)
…d-set acceptance (#4119) ## Summary Follow-up to #3866. A small, behavior-preserving hardening + decision record for the ROAST transition-exchange lost-sync path (review finding #2). **Background.** In `onBundle`, a seat that receives a transition bundle for an attempt it **never observed** trips `markLostSync()` — failing the wallet's signing retry loop closed — *before* any verification (a behind-seat lacks the observe handle full `VerifyBundle` needs). So an authenticated group member can broadcast a structurally-valid bundle with a bogus, never-committed attempt hash and halt every honest seat's signing. The blame bridge (PR2b-2) does **not** close this: a never-committed attempt produces no evidence to attribute the sender. **Decision (after Codex review + threat analysis).** Under the **permissioned** operator set this residual is **accepted**: it's liveness-only (fail-closed — never an unsafe/divergent signature), the triggering seat is operator-authenticated (so attribution is immediate), and a misbehaving operator is governance-removable + economically deterred. The proper fix (f+1 snapshot-corroboration, or a resync state) is a real ROAST protocol change whose simple form can fracture the group on legitimate *sparse-failure* bundles — disproportionate while the set stays permissioned. ## Change - **Attribution logging.** When a seat enters lost-sync from an unobserved-attempt bundle, log the **sending seat** and the **claimed attempt-context hash** — once per lost-sync episode — so the operational runbook can identify and remove/slash a member spamming bogus-attempt bundles. - `markLostSync` now uses `CompareAndSwap` and returns whether it transitioned, so the attribution is logged exactly once even while the listener keeps receiving such bundles. **No change to the fail-closed semantics.** - **Decision record in-code:** documents the permissioned-set acceptance and marks it as a **hard item to revisit before any move to a permissionless operator set**, where an anonymous/costless/non-attributable DoS would warrant the corroboration/resync fix. ## Verification `gofmt` + `go vet` clean; builds under `-tags 'frost_native frost_roast_retry cgo frost_tbtc_signer'`; the transition-exchange / lost-sync / bundle tests pass. The only caller of `markLostSync` is the updated site. _Found during review of #3866._ 🤖 Generated with [Claude Code](https://claude.com/claude-code)
…4120) ## Summary Follow-up to #3866 (review finding #3). Fixes sortition-pool monitoring for the dual-pool window of the ECDSA→FROST migration. During the migration an operator is a member of **both** sortition pools at once: existing ECDSA wallets keep draining via redemptions while FROST is live. The seven `sortition.Chain` methods on `TbtcChain` switch on `hasFrostAuthorization()` (true whenever FROST is configured), and a **single** `MonitorPool` loop consumed them — so: - **During overlap** (`DisableLegacyECDSA=false`): the loop labeled *"legacy ECDSA sortition pool monitoring"* actually maintained the **FROST** pool; the ECDSA pool got no maintenance. - **Post-cutover** (`DisableLegacyECDSA=true`): the only loop stopped entirely, leaving the **FROST** pool — the one new FROST wallet DKG selects from — **unmonitored**. > The ECDSA drain itself is unaffected either way: signing an existing wallet uses the locally-held key share + the wallet's fixed roster, never sortition-pool state. This is a pool-membership/selection-eligibility fix, not a fund-availability one. ## Change Bind monitoring **explicitly per pool**. Two `sortition.Chain` views (`ecdsaSortitionChain`, `frostSortitionChain`) route directly to their own registry/pool with **no** `hasFrostAuthorization()` switch, and the node runs **one `MonitorPool` loop per pool**: - **ECDSA loop** — existing flags + policy, now correctly ECDSA-bound (data path *and* beta policy read the ECDSA pool). - **FROST loop** — new `DisableFrostSortitionPoolMonitoring` flag (**default-on**), beta policy only (the ECDSA pre-params gate doesn't apply to FROST DKG), gated on FROST being configured and **independent of `DisableLegacyECDSA`** so the FROST pool stays monitored during the drain *and* after the legacy pool is retired. The operator isn't necessarily registered in both pools, so the FROST loop treats `sortition.ErrOperatorUnknown` (now exported) as **non-fatal** — it warns and leaves FROST monitoring inactive rather than aborting startup. The legacy loop keeps its existing fail-fast (the operator is ECDSA-registered during the drain). `TbtcChain`'s own `sortition.Chain` methods are **left unchanged**, so heartbeat and other callers are unaffected; `GetOperatorID` stays ECDSA-bound. Both loops' join/update txs already share `tc.transactionMutex` + `tc.nonceManager`, so they cannot race on the operator account nonce. ### Design input baked in (per maintainer + Codex) - FROST loop join policy = beta-operator only (no FROST equivalent of the ECDSA pre-params gate; FROST DKG readiness is announced separately). - Operator compensation is out-of-band, so ECDSA-pool staleness during the drain is benign; the FROST loop is the load-bearing one. - `GetOperatorID` asymmetry preserved (separate `GetFrostOperatorID` exists). ## Known limitation `MonitorPool` (shared with the beacon) hard-returns at startup without starting its ticker, so an operator that registers for FROST *after* node start needs a restart to begin FROST monitoring. Logged clearly; deliberately not changing the shared `MonitorPool` here. ## Tests / verification - New `tbtc_sortition_chain_views_test.go` pins each view to its intended pool (legacy→ECDSA, FROST→FROST, unconfigured→nil) — **negative-checked** (mis-binding the legacy view to the FROST pool makes it fail). - gofmt + `go vet` clean; untagged `go build ./...` clean; builds under `-tags 'frost_native frost_tbtc_signer cgo frost_roast_retry'`; `pkg/sortition` + `pkg/chain/ethereum` tests pass; full `pkg/tbtc` suite passes. - Adversarial review (3 angles → verify): 0 confirmed findings; adapter fidelity confirmed line-for-line across all 13 methods; cross-loop nonce safety confirmed. _Found during review of #3866._ 🤖 Generated with [Claude Code](https://claude.com/claude-code)
…ror type (#4122) ## Summary Addresses a Codex finding (relayed via the #4119 review): `TbtcChain.GetWallet` derived a **legacy** wallet ID on **any** error from the canonical `walletID` accessor. For a FROST wallet on a canonical Bridge, a transient call failure would silently yield the left-padded legacy ID — and callers use `WalletChainData.WalletID` to choose **P2TR (FROST)** vs **P2WPKH (legacy)** scripts, so the node would build or search the **wrong wallet script**. ## Why route by scheme (revised after Codex P1) The first revision distinguished by error type (a sentinel for the missing accessor, surface everything else). Codex correctly flagged a **P1 regression**: a *legacy on-chain Bridge* built with the *current* generated binding still satisfies the accessor interface, so its missing `walletID` function returns a normal RPC/ABI error — not the sentinel — and that revision would surface it and **break `GetWallet` on exactly the legacy deployments the fallback exists for**. Error type cannot reliably separate "function absent on-chain" from "transient." So this routes by **scheme**, using the wallet's `EcdsaWalletID` (which `GetWallet` already reads, and which the codebase already uses to infer scheme — zero ⇒ FROST): - **Legacy ECDSA wallet** (`EcdsaWalletID != 0`): its canonical wallet ID *equals* its legacy derivation, so fall back on **any** accessor error — and it's the only option on a legacy Bridge lacking the accessor. - **FROST wallet** (`EcdsaWalletID == 0`): requires the canonical ID; **surface** the error rather than return a wrong legacy ID. A FROST wallet only exists on a canonical-ID Bridge, so such an error is genuinely transient. Logic is extracted into `resolveWalletID(bridge, walletPublicKeyHash, ecdsaWalletID)`. ## Tests `TestResolveWalletID` covers all four cases: accessor success → canonical; FROST + accessor error → surfaced; **legacy + accessor error → legacy fallback** (the P1 regression guard — verified to fail if the routing surfaces errors for legacy wallets); legacy + missing-accessor binding → legacy fallback. gofmt + `go vet` clean; full `pkg/chain/ethereum` suite passes. _Found during the Codex review batch on #4115–#4120; revised per the Codex P1 re-review._ Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> 🤖 Generated with [Claude Code](https://claude.com/claude-code)
Current State (as of 2026-05-17)
This draft PR is the umbrella readiness branch for
feat/frost-schnorr-migration-scaffold.It is being kept current with
mainso it can become a direct merge target if the FROST/ROAST stack is approved for activation.It remains in draft until the remaining phase-gate, governance, and cross-repository readiness items are closed.
Canonical Status Sources
docs/frost-migration/external-repository-tracking.md(intlabs-xyz/tbtc)docs/reviews/frost-roast-production-readiness-2026-05-16.md(intlabs-xyz/tbtc)Latest Refresh
maininto this branch.frost_native.Remaining Cross-Repo Closure Items
Notes