feat(tbtc/signer): FROST/ROAST Rust signer by mswilkison · Pull Request #4005 · threshold-network/keep-core

mswilkison · 2026-05-26T15:46:18Z

Summary

Lands the Rust FROST/ROAST signer at pkg/tbtc/signer/ alongside the existing Go DKG coordinator. The initial import was extracted from tlabs-xyz/tbtc:feat/frost-schnorr-migration at frozen signed tag frost-extraction-source-v1, then this PR applied targeted canonical-path, CI, and reviewer hardening follow-ups in the canonical keep-core repo.

This is no longer a pure byte-for-byte mirror PR. The post-freeze changes are intentional canonical keep-core changes (canonical-path normalization, CI wiring, and reviewer hardening). After this PR lands, threshold-network/keep-core is the source of truth for the signer; the extracted monorepo tag is provenance for the initial import only.

Source-PR Provenance

Source repository: tlabs-xyz/tbtc
Source branch: feat/frost-schnorr-migration
Frozen source tag: frost-extraction-source-v1 (signed annotated tag by maclane)
Frozen source commit: 52389bd5cccb5daeef195671feb7ca46be6e2f37
Initial frozen manifest: https://github.com/tlabs-xyz/tbtc/blob/frost-extraction-source-v1/extraction/frost-extraction-source-manifest.json

The monorepo provenance above is used only to audit the initial import. Ongoing signer development, fixes, CI, and reviews happen in threshold-network/keep-core.

Why This Lives In keep-core

Per extraction plan v38 section 3.1, the signer co-locates with keep-core because:

HSM enforcement is external to signer code via standard PKCS#11/KMIP and does not force a separate audit lifecycle.
Coordinator coupling dominates; the B-2 Go DKG coordinator already lives in this repo via PR FROST/ROAST readiness branch #3866.
TEE adoption later is reversible without forcing a repo split.

Layout Introduced

pkg/tbtc/signer/
├── Cargo.toml + Cargo.lock + build.sh
├── src/
├── tests/
├── benches/
├── include/
├── docs/
│   └── formal/models/
├── scripts/formal/
└── test/vectors/

CI Coverage

This PR now runs:

cargo fmt --manifest-path pkg/tbtc/signer/Cargo.toml -- --check
cargo clippy --manifest-path pkg/tbtc/signer/Cargo.toml --all-targets -- -D warnings
cargo test --manifest-path pkg/tbtc/signer/Cargo.toml
cargo test --manifest-path pkg/tbtc/signer/Cargo.toml formal_verification_
pkg/tbtc/signer/scripts/formal/run_tla_models.sh

Verification

Latest local verification after review fixes:

cargo fmt --manifest-path pkg/tbtc/signer/Cargo.toml -- --check
cargo clippy --manifest-path pkg/tbtc/signer/Cargo.toml --all-targets -- -D warnings
TBTC_SIGNER_STATE_PATH=/tmp/tbtc-signer-ci-state-local-4005.json cargo test --manifest-path pkg/tbtc/signer/Cargo.toml
cargo test --manifest-path pkg/tbtc/signer/Cargo.toml formal_verification_

Latest GitHub CI on this branch is green, including Signer Rust checks, Signer formal invariants, and TLA model checks.

Relationship To PR #3866

PR #3866 lands the Go-side DKG coordinator. This PR lands the Rust-side signer. They are complementary protocol implementations:

DKG: keep-core Go side coordinates; signer participates; DKG result digest is computed on both sides and checked for parity.
Signing: signer produces FROST signature shares; coordinator aggregates.

PR #3866 and this PR can land independently in either order.

Plan Context

This is one of the mergeable mirror PRs comprising the FROST extraction, alongside tbtc-v2, tbtc-subgraph, and tbtc-v3-indexer mirror PRs.

Lands the Rust signer at pkg/tbtc/signer/ alongside the existing Go DKG coordinator. Mirrors the signer slice of tlabs-xyz/tbtc:feat/frost-schnorr-migration (PR #10) at frozen tag frost-extraction-source-v1. Per extraction plan v38 §3.1, the signer co-locates with keep-core because: (a) HSM enforcement is external to signer code (standard PKCS#11/KMIP client), doesn't force a separate audit lifecycle; (b) coordinator coupling dominates (B-2 DKG coordinator already lives in this repo via PR #3866); (c) TEE adoption later is reversible without forcing a repo split. Layout - pkg/tbtc/signer/ — Rust crate with own Cargo.toml - pkg/tbtc/signer/docs/ — signer + ROAST + TEE specs - pkg/tbtc/signer/docs/formal/models/ — ROAST + TEE TLA+ models - pkg/tbtc/signer/scripts/formal/ — ROAST vector + TLA runner - pkg/tbtc/signer/test/vectors/ — roast-attempt-context-v1.json Files (49 total) - 47 mirror status (Rust source, signer docs, ROAST docs, TLA models, test vector, etc.) - 2 allowlisted-divergence: - pkg/tbtc/signer/scripts/formal/check_roast_attempt_context_vectors.mjs (path normalization to signer-repo paths) - pkg/tbtc/signer/scripts/formal/run_tla_models.sh (MODELS_PATH env var refactor; default pkg/tbtc/signer/docs/formal/models/) Provenance - Source repository: tlabs-xyz/tbtc - Source branch: feat/frost-schnorr-migration - Source tag (frozen): frost-extraction-source-v1 - Source commit (H): 52389bd5cccb5daeef195671feb7ca46be6e2f37 - Source manifest: extraction/frost-extraction-source-manifest.json (manifestSha256: f7295fb738104501eb6c0c2447a42122ceb5f684c7a7c5dfb50ecb0bde3a0ea0) - Source PR(s): #425 (tbtc-signer error codes) + the full FROST migration series; complete list per source manifest's commit range over tools/tbtc-signer/ and signer-adjacent docs/scripts. Build wiring (follow-up) This PR lands the signer source code, docs, vectors, and scripts. Rust toolchain CI integration (cargo build/test/clippy/fmt as a separate CI job alongside the existing Go jobs) is a follow-up — Go workspace ignores non-Go subdirectories automatically; the cargo crate at pkg/tbtc/signer/Cargo.toml is independently buildable. Known TBD (resolved pre-merge) - 2 allowlisted-divergence files have expectedTargetSha256 = <TBD> in the source manifest. Path normalization transformations need to be applied to make the scripts run in pkg/tbtc/signer/ context; this PR ships the source verbatim, follow-up commits on this branch apply the transformations before merge. - Dual signoff required on each allowlisted-divergence entry per plan v38 §4.2 (extraction lead + canonical repo maintainer). Verification (pre-merge per plan v38 §7.2) - 47 mirror files: sha256 equality between git show frost-extraction-source-v1:<sourcePath> and this PR's content at pkg/tbtc/signer/<targetPath>. - 2 allowlisted-divergence files: sha256 == expectedTargetSha256 (recorded post-transformation in source manifest). - PR-scoped rogue-file check: git diff --name-only main...HEAD only contains files in manifest's fileMap with targetKey "signer". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-05-26T15:47:12Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

This PR introduces pkg/tbtc/signer, a Rust tbtc-signer bootstrap crate with a C ABI JSON-over-buffer interface, FFI translation helpers, API types, error semantics, deterministic coordinator selection, a comprehensive lib.rs FFI surface with tests, an admission-checker CLI, Criterion benchmarks, TLA+ formal models, conformance vectors/checkers, and extensive design/runbook documentation.

Changes

Core Signer Implementation

Layer / File(s)	Summary
FFI & API contract definition `include/frost_tbtc.h`, `src/api.rs`	C header declares `TbtcBuffer` and `TbtcSignerResult` structs plus ~18 `frost_tbtc_*` function signatures; Rust API types define serde-serialized request/response payloads for DKG, sign-round lifecycle, transcript audit, quarantine, refresh, emergency rekey, differential fuzzing, and canary rollout.
Error classification and recovery semantics `src/errors.rs`	`EngineError` enum with thiserror integration; `code()` maps variants to stable string identifiers; `recovery_class()` classifies errors as `"recoverable"` or `"terminal"` with explicit handling for consumed attempt/round replays.
FFI translation layer `src/ffi.rs`	C-compatible buffer structs, JSON serialization/deserialization helpers, `ffi_entry` for panic containment via `catch_unwind`, and safe `free_buffer` memory management for the FFI boundary.
Deterministic coordinator selection `src/go_math_rand.rs`	Rust port of Go's `math/rand` with fixed cooked constants table, `seedrand`, bounded sampling, and shuffle; `select_coordinator_identifier` returns deterministic coordinators matching keep-core outputs per test vectors.
Lib.rs FFI endpoints & bootstrap gating `src/lib.rs`	Module declarations, signer version, bootstrap-mode environment/profile parsing with strict production gating, test-only override, ~18 `#[no_mangle] extern "C"` endpoints (DKG, sign-round, transcript audit, emergency rekey, canary rollout, refresh-shares), and comprehensive test suite covering idempotency, lifecycle transitions, Taproot validation, and emergency/rollout semantics.

Admission Checking & Policy Enforcement

Layer / File(s)	Summary
Admission checker binary `src/bin/admission_checker.rs`	JSON-configured operator admission evaluator with policy constraints (per-provider/region caps, custody/attestation/SLA requirements), optional DAO override verification (Schnorr signature validation over SHA-256 payload digest), and replay registry persistence to prevent re-use of override decisions.

Benchmarking & Formal Test Vectors

Layer / File(s)	Summary
Phase 5 ROAST benchmark suite `benches/phase5_roast.rs`	Criterion benchmarks exercising DKG, start/finalize sign-round, state reload/recovery, and attempt transitions (timeout/invalid-share/stale-replay); includes FFI helpers, hashing/fingerprinting utilities, deterministic coordinator probing, and error validation.
Taproot signature fraud formal verification `tests/p2tr_signature_fraud_vectors.rs`	Integration test loading p2tr-signature-fraud-v0.json vectors, computing BIP-341 key-path sighashes, verifying Schnorr signatures, deriving dual challenge identities, and asserting mutation-based identity uniqueness.
ROAST attempt-context test vectors `test/vectors/roast-attempt-context-v1.json`	JSON schema and fixtures for deterministic fingerprint/attempt-id computation across multiple participant sets and edge cases, used by formal verification checkers.

TLA+ Formal Verification Models

Layer / File(s)	Summary
ROAST attempt state machine model `docs/formal/models/RoastAttemptStateMachine.tla`, `.cfg`	TLA+ module defining attempt numbering, active participants, deterministic coordinator, consumed-attempt registries (transient and durable), transitions (advance/restart/reload), and safety invariants for monotonicity and replay safety.
ROAST rollout policy state machine `docs/formal/models/RoastRolloutPolicy.tla`, `.cfg`	TLA+ module specifying five-stage rollout progression (bootstrap→canary→broad→rollback↔halted) with control signals and temporal properties governing promotion/hold/rollback and emergency stop.
State-key provider policy & production gating model `docs/formal/models/StateKeyProviderPolicy.tla`, `.cfg`, `.production.cfg`	TLA+ module modeling provider/key selection lifecycle and load outcomes; production-gate enforcement; properties asserting fail-closed behavior and exact binding enforcement.
TEE enforcement modes state machine `docs/formal/models/TeeEnforcementModes.tla`, `.cfg`	TLA+ module for TE admission control with three enforcement modes, attestation states, break-glass override, and properties enforcing valid attestation and admission stability.
Formal models documentation & runner `docs/formal/models/README.md`, `scripts/formal/run_tla_models.sh`	README documenting model coverage and bounds; Bash runner downloading/verifying TLA+ toolchain and executing TLC model checks.

Design Documentation & Specifications

Layer / File(s)	Summary
ROAST protocol phase specifications `docs/roast-phase-{0-spec-freeze,1.5-consumed-registry,2-coordinator,3-transcript,4-liveness,5-baseline,5-runbook,5-security-gates}.md`	Phase 0 spec defining attempt-context, coordinator validation, strict-mode gating, and error taxonomy; Phases 1.5–5 covering consumed-registry integration, coordinator enforcement, transcript replay hardening, liveness policy with recovery classification, baseline calibration, rollout runbook, and security gates.
ROAST implementation roadmap `docs/roast-implementation-plan.md`	Complete ROAST migration plan covering baseline, goals/non-goals, design principles, threat model, target semantics (deterministic attempt/coordinator, authenticated retry, replay-safety), phased deliverables (Phase 0–5), cross-repo dependencies, and Definition-of-Done.
Hardening, security & operational plans `docs/permissioned-signer-hardening-rfc.md`, `docs/tbtc-signer-secret-material-hardening-plan.md`, `docs/tee-whitelisted-signer-enforcement-plan.md`	RFC with P0–P2 hardening phases; secret-material plan (encrypted-at-rest XChaCha20-Poly1305 envelope); TEE operator enforcement covering admission tokens, runtime behavior, break-glass, and failure modes.
API contract and design decisions `docs/signer-api-contract-decision-brief.md`, `docs/rust-rewrite-bootstrap.md`	Decision brief comparing round-level vs coarse session API (recommending session); bootstrap status documenting implemented features and production gates.
Future considerations documentation `docs/true-late-t_of_n_finalize_considerations.md`	Discussion draft on true late t-of-n finalize tradeoffs and current recommendation to defer as future consideration.

Build Configuration & Scripts

Layer / File(s)	Summary
Crate metadata and build setup `pkg/tbtc/signer/Cargo.toml`, `pkg/tbtc/signer/build.sh`, `pkg/tbtc/signer/.gitignore`	Cargo.toml defines tbtc-signer crate (edition 2021, MIT), library config (cdylib+rlib), bench-restart-hook feature, frost-secp256k1-tr dependency; build.sh with strict bash and release cargo build; `.gitignore` ignores `target/`.
Comprehensive README documentation `pkg/tbtc/signer/README.md`	Full README covering exposed C ABI endpoints, not-yet-implemented items, build/test commands, admission checker CLI, encrypted state provider contracts, benchmark/chaos suite invocation, FFI JSON contract, error codes, and buffer management.
Admission checker sample configurations `pkg/tbtc/signer/scripts/admission-*.sample.json`	JSON samples for policy (constraints, DAO overrides), candidates, existing operators, override signatures, and replay registries.
Formal verification & testing scripts `pkg/tbtc/signer/scripts/formal/check_roast_attempt_context_vectors.mjs`, `pkg/tbtc/signer/scripts/run_phase5_chaos_suite.sh`, `pkg/tbtc/signer/scripts/formal/run_tla_models.sh`	Node.js CLI for ROAST vector conformance; Bash runners for chaos suite scenarios and TLA+ model checking.

🎯 4 (Complex) | ⏱️ ~75 minutes

🐰 A bootstrap springs forth, with FFI calls so bright,
Formal models carved in TLA+ light,
ROAST coordinators dancing in Go-lang's stride,
Secrets encrypted with cryptographic pride,
From signer to harness, one system unified!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 47.28% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Title check	✅ Passed	The title clearly summarizes the main change: adding the FROST/ROAST Rust signer under tbtc/signer.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch extraction/frost-signer-mirror-2026-05-26

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

…ripts Per extraction plan v38 §4.4, allowlisted-divergence files require content normalization for the canonical context. This commit applies the transformations declared in the source manifest for the 2 signer- side allowlisted-divergence entries. Transformations - pkg/tbtc/signer/scripts/formal/check_roast_attempt_context_vectors.mjs: Rewrite vector path from `docs/frost-migration/test-vectors/roast-attempt-context-v1.json` to canonical signer layout `test/vectors/roast-attempt-context-v1.json` (both relative to rootDir which is two levels up from the script location; in canonical context that's pkg/tbtc/signer/) - pkg/tbtc/signer/scripts/formal/run_tla_models.sh: Rewrite MODEL_DIR default from `$ROOT_DIR/docs/frost-migration/formal-verification/models` to canonical signer layout `$ROOT_DIR/docs/formal/models` Plus MODELS_PATH env-var override for alternate environments (CI matrices, local dev trees). ROOT_DIR is unchanged (`$(dirname $BASH_SOURCE)/../..` resolves to pkg/tbtc/signer/ here). Verification - Both files retain identical behavior to their monorepo counterparts when invoked from canonical signer layout - Comments added documenting the path normalization with reference back to the source manifest's allowlisted-divergence status Recompute expectedTargetSha256 for both entries in the source manifest and collect dual signoff before merge per plan v38 §4.2. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 13

🧹 Nitpick comments (3)

pkg/tbtc/signer/src/api.rs (1)

196-202: 💤 Low value

Missing #[serde(default, skip_serializing_if)] on script_tree_hex.

All other Option<T> fields in this file use #[serde(default, skip_serializing_if = "Option::is_none")], but script_tree_hex does not. This inconsistency means the field will serialize as null when absent, rather than being omitted.

Suggested fix

 pub struct BuildTaprootTxRequest {
     pub session_id: String,
     pub inputs: Vec<TxInput>,
     pub outputs: Vec<TxOutput>,
+    #[serde(default, skip_serializing_if = "Option::is_none")]
     pub script_tree_hex: Option<String>,
 }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/tbtc/signer/src/api.rs` around lines 196 - 202, The BuildTaprootTxRequest
struct's script_tree_hex Option field lacks the serde attributes used elsewhere;
update the declaration of BuildTaprootTxRequest so the script_tree_hex field is
annotated with #[serde(default, skip_serializing_if = "Option::is_none")] to
match other Option<T> fields (preserving Clone/Debug/Deserialize/Serialize
behavior) so it is omitted from serialized output when None instead of
serializing as null.

pkg/tbtc/signer/src/lib.rs (1)

391-421: 💤 Low value

std::env::set_var and std::env::remove_var are not thread-safe.

These functions are unsound in multi-threaded contexts and deprecated since Rust 1.66. While the tests appear to serialize access via lock_test_state(), this guard must be held across all env mutations and checks within a test to prevent races with parallel test threads.

Current usage appears safe given the locking pattern, but this is fragile. Consider using a dedicated test configuration mechanism that doesn't rely on process-wide environment mutation.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/tbtc/signer/src/lib.rs` around lines 391 - 421, EnvVarGuard's methods
(EnvVarGuard::set, EnvVarGuard::unset) and its Drop rely on
std::env::set_var/remove_var which are process-wide and not thread-safe; replace
this pattern with a test-scoped, non-global solution such as using a crate that
provides scoped environment variables (e.g., temp_env or similar) or refactor
tests to accept an injected configuration object instead of mutating process
env; update usages to acquire and hold the new scoped guard for the entire
duration of tests that need env changes (or pass a Config struct into functions
under test) and remove direct calls to std::env::set_var/remove_var and the
EnvVarGuard Drop behavior to avoid races.

pkg/tbtc/signer/src/bin/admission_checker.rs (1)

257-276: 💤 Low value

Consider cleaning up the temp file if rename fails.

If fs::rename fails (e.g., cross-filesystem move or permissions issue), the temp file remains on disk. Adding a cleanup attempt in the error path would improve robustness.

♻️ Proposed cleanup on error

     fs::write(&tmp_path, serialized).map_err(|error| {
         format!(
             "failed to write override replay registry temp file [{}]: {error}",
             tmp_path.display()
         )
     })?;
-    fs::rename(&tmp_path, path).map_err(|error| {
-        format!(
-            "failed to persist override replay registry [{}]: {error}",
-            path.display()
-        )
-    })
+    fs::rename(&tmp_path, path).map_err(|error| {
+        let _ = fs::remove_file(&tmp_path); // Best-effort cleanup
+        format!(
+            "failed to persist override replay registry [{}]: {error}",
+            path.display()
+        )
+    })
 }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/tbtc/signer/src/bin/admission_checker.rs` around lines 257 - 276,
persist_override_replay_registry currently leaves the temporary file (tmp_path)
if fs::rename fails; modify the rename error path to attempt cleanup of tmp_path
before returning the error. Specifically, call fs::remove_file(&tmp_path)
(ignoring or logging its result) inside the Err branch that handles the rename
failure so the function still returns the original formatted error for
fs::rename but also tries to remove the leftover tmp file; reference
persist_override_replay_registry, path, tmp_path, and fs::rename when making the
change.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/tbtc/signer/docs/formal/models/README.md`:
- Around line 32-51: Update the incorrect repository paths in the traceability
matrix of pkg/tbtc/signer/docs/formal/models/README.md so links point to the
actual implementation and docs in this repo: change references to
tools/tbtc-signer/src/engine.rs to pkg/tbtc/signer/src/engine.rs for the entries
mentioning RoastAttemptStateMachine.tla (validate_attempt_context, replay
guards) and StateKeyProviderPolicy.tla (decode_encrypted_state_envelope,
encode_encrypted_state_envelope); change
docs/frost-migration/tee-whitelisted-signer-enforcement-plan.md to
pkg/tbtc/signer/docs/tee-whitelisted-signer-enforcement-plan.md for
TeeEnforcementModes.tla; and change
docs/frost-migration/roast-phase-5-security-rollout-gates.md to
pkg/tbtc/signer/docs/roast-phase-5-security-rollout-gates.md for
RoastRolloutPolicy.tla so readers can locate the referenced code and policy
docs.

In `@pkg/tbtc/signer/docs/roast-implementation-plan.md`:
- Line 93: Update the broken doc links in
pkg/tbtc/signer/docs/roast-implementation-plan.md by replacing repo-external
paths like `docs/frost-migration/roast-phase-0-spec-freeze.md` and any
`tools/tbtc-signer/...` references with the correct repo-local paths under
pkg/tbtc/signer/docs (or use correct relative paths from this markdown file);
search the file for all occurrences of `docs/frost-migration/...` and
`tools/tbtc-signer/...` (including the instances similar to the shown
`docs/frost-migration/roast-phase-0-spec-freeze.md`) and normalize each link so
it points to the new location in this package, preserving anchor fragments and
updating any link text as needed.

In `@pkg/tbtc/signer/docs/roast-phase-5-rollout-runbook.md`:
- Around line 26-29: Update the incorrect crate path used in the runbook
commands: replace occurrences of "cd tools/tbtc-signer" with "cd
pkg/tbtc/signer" for the benchmark command (`cargo bench --features
bench-restart-hook --bench phase5_roast`) and the chaos suite script invocation
(`./scripts/run_phase5_chaos_suite.sh`) so the commands run from the correct
crate directory.

In `@pkg/tbtc/signer/docs/roast-phase-5-security-rollout-gates.md`:
- Around line 92-99: Update stale repository paths in
roast-phase-5-security-rollout-gates.md: replace any occurrences of the old
tooling path "tools/tbtc-signer" with the new location "pkg/tbtc/signer" (e.g.,
update the run command `cd tools/tbtc-signer && cargo bench --features
bench-restart-hook --bench phase5_roast` to `cd pkg/tbtc/signer ...`), and
update any links referencing
`docs/frost-migration/roast-phase-5-baseline-calibration.md` to the document’s
new location in the repo (search for the exact link text and swap to the correct
path). Ensure all instances at the reported locations (around the run command
and the listed links) are changed consistently so operators following the
runbook hit the correct files and commands.

In `@pkg/tbtc/signer/docs/rust-rewrite-bootstrap.md`:
- Around line 10-13: The documentation still references the old crate path
"tools/tbtc-signer" and the validation command using that path; update every
occurrence to "pkg/tbtc/signer" in rust-rewrite-bootstrap.md (including the
header lines that list the crate, the C ABI include path `include/frost_tbtc.h`,
and any validation/build commands) so links and commands point to the colocated
pkg/tbtc/signer location; search for "tools/tbtc-signer" and replace with
"pkg/tbtc/signer" and verify the validation command and any examples reference
the new path.

In `@pkg/tbtc/signer/docs/signer-api-contract-decision-brief.md`:
- Around line 43-47: Update the two referenced paths in
signer-api-contract-decision-brief.md so they point to the mirrored crate
locations: replace `docs/frost-migration/rust-rewrite-bootstrap.md` with
`pkg/tbtc/signer/docs/frost-migration/rust-rewrite-bootstrap.md` and replace
`tools/tbtc-signer/src/lib.rs` with `pkg/tbtc/signer/src/lib.rs` (look for the
occurrences shown around the paragraph mentioning the bootstrap Rust crate and
file: `tools/tbtc-signer/src/lib.rs` and update those strings accordingly).

In `@pkg/tbtc/signer/docs/tbtc-signer-secret-material-hardening-plan.md`:
- Around line 6-7: The doc still uses the old crate path string
`tools/tbtc-signer`; update that scope reference to `pkg/tbtc/signer` throughout
the file (tbtc-signer-secret-material-hardening-plan.md) so the plan points to
the mirrored crate location, and scan for any other occurrences of
`tools/tbtc-signer` in this document to replace with `pkg/tbtc/signer`.

In `@pkg/tbtc/signer/docs/tee-whitelisted-signer-enforcement-plan.md`:
- Around line 296-298: Update the two broken cross-doc links in
pkg/tbtc/signer/docs/tee-whitelisted-signer-enforcement-plan.md (currently
referencing docs/frost-migration/roast-phase-5-security-rollout-gates.md and
docs/frost-migration/roast-phase-5-rollout-runbook.md on lines ~296–297) to
point to their correct locations inside this PR:
pkg/tbtc/signer/docs/roast-phase-5-security-rollout-gates.md and
pkg/tbtc/signer/docs/roast-phase-5-rollout-runbook.md so cross-document
navigation resolves correctly.

In `@pkg/tbtc/signer/README.md`:
- Around line 53-54: Update the README.md occurrence(s) that reference the old
path string "tools/tbtc-signer" to the new crate location "pkg/tbtc/signer":
search for and replace that path in all command snippets, code blocks, and file
references (e.g., cargo build/cd commands and any path bullets) so every
instance uses "pkg/tbtc/signer" consistently; ensure both shell commands and
prose file paths are updated, and run a quick grep for "tools/tbtc-signer" to
confirm no remaining references.

In `@pkg/tbtc/signer/scripts/admission-policy-v1.sample.json`:
- Line 8: Replace the non-hex placeholder value for the JSON key
dao_override_trust_root_pubkey_hex with a syntactically valid 32-byte hex string
(64 lowercase hex characters) so sample files and copy/paste validation don't
break; update the sample value to something like a 64-character hex placeholder
and leave replacement guidance in the README or nearby comment explaining it
must be replaced with the real x-only pubkey hex.

In `@pkg/tbtc/signer/scripts/formal/check_roast_attempt_context_vectors.mjs`:
- Around line 15-18: vectorsPath currently resolves to
"docs/frost-migration/test-vectors/roast-attempt-context-v1.json" and will fail
because the vectors live under "test/vectors"; update the path.join call that
constructs vectorsPath (using rootDir and the filename) to point to
"test/vectors/roast-attempt-context-v1.json" so the script loads the correct
file; ensure the change is made where vectorsPath is declared and used in this
module.

In `@pkg/tbtc/signer/scripts/formal/run_tla_models.sh`:
- Around line 4-5: The MODEL_DIR assignment in run_tla_models.sh is pointing to
the wrong path; update the MODEL_DIR variable (currently computed relative to
ROOT_DIR) to "$ROOT_DIR/docs/formal/models" so the directory existence check in
the script (around the directory existence test near line 20) succeeds; modify
the MODEL_DIR definition in the script (look for the MODEL_DIR variable
assignment and references) to use the corrected path and ensure any subsequent
uses of MODEL_DIR still reference this updated variable.

In `@pkg/tbtc/signer/tests/p2tr_signature_fraud_vectors.rs`:
- Around line 375-376: The test constructs vectors_path using the vectors_path
variable in pkg/tbtc/signer/tests/p2tr_signature_fraud_vectors.rs which
currently joins
"../../docs/frost-migration/test-vectors/p2tr-signature-fraud-v0.json" relative
to CARGO_MANIFEST_DIR and points to a non-existent file; fix by either adding
the missing p2tr-signature-fraud-v0.json to the repo at that path or update the
vectors_path join to the correct relative path where the JSON actually lives
(adjust the "../" segments or point to the canonical test-vectors location),
ensuring the variable name vectors_path and its usage remain unchanged.

---

Nitpick comments:
In `@pkg/tbtc/signer/src/api.rs`:
- Around line 196-202: The BuildTaprootTxRequest struct's script_tree_hex Option
field lacks the serde attributes used elsewhere; update the declaration of
BuildTaprootTxRequest so the script_tree_hex field is annotated with
#[serde(default, skip_serializing_if = "Option::is_none")] to match other
Option<T> fields (preserving Clone/Debug/Deserialize/Serialize behavior) so it
is omitted from serialized output when None instead of serializing as null.

In `@pkg/tbtc/signer/src/bin/admission_checker.rs`:
- Around line 257-276: persist_override_replay_registry currently leaves the
temporary file (tmp_path) if fs::rename fails; modify the rename error path to
attempt cleanup of tmp_path before returning the error. Specifically, call
fs::remove_file(&tmp_path) (ignoring or logging its result) inside the Err
branch that handles the rename failure so the function still returns the
original formatted error for fs::rename but also tries to remove the leftover
tmp file; reference persist_override_replay_registry, path, tmp_path, and
fs::rename when making the change.

In `@pkg/tbtc/signer/src/lib.rs`:
- Around line 391-421: EnvVarGuard's methods (EnvVarGuard::set,
EnvVarGuard::unset) and its Drop rely on std::env::set_var/remove_var which are
process-wide and not thread-safe; replace this pattern with a test-scoped,
non-global solution such as using a crate that provides scoped environment
variables (e.g., temp_env or similar) or refactor tests to accept an injected
configuration object instead of mutating process env; update usages to acquire
and hold the new scoped guard for the entire duration of tests that need env
changes (or pass a Config struct into functions under test) and remove direct
calls to std::env::set_var/remove_var and the EnvVarGuard Drop behavior to avoid
races.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: c0224eef-499a-42a2-a346-f06ef353278b

📥 Commits

Reviewing files that changed from the base of the PR and between 980587b and 335ce60.

⛔ Files ignored due to path filters (1)

pkg/tbtc/signer/Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (48)

pkg/tbtc/signer/.gitignore
pkg/tbtc/signer/Cargo.toml
pkg/tbtc/signer/README.md
pkg/tbtc/signer/benches/phase5_roast.rs
pkg/tbtc/signer/build.sh
pkg/tbtc/signer/docs/formal/models/README.md
pkg/tbtc/signer/docs/formal/models/RoastAttemptStateMachine.cfg
pkg/tbtc/signer/docs/formal/models/RoastAttemptStateMachine.tla
pkg/tbtc/signer/docs/formal/models/RoastRolloutPolicy.cfg
pkg/tbtc/signer/docs/formal/models/RoastRolloutPolicy.tla
pkg/tbtc/signer/docs/formal/models/StateKeyProviderPolicy.cfg
pkg/tbtc/signer/docs/formal/models/StateKeyProviderPolicy.production.cfg
pkg/tbtc/signer/docs/formal/models/StateKeyProviderPolicy.tla
pkg/tbtc/signer/docs/formal/models/TeeEnforcementModes.cfg
pkg/tbtc/signer/docs/formal/models/TeeEnforcementModes.tla
pkg/tbtc/signer/docs/permissioned-signer-hardening-rfc.md
pkg/tbtc/signer/docs/roast-implementation-plan.md
pkg/tbtc/signer/docs/roast-phase-0-spec-freeze.md
pkg/tbtc/signer/docs/roast-phase-1.5-consumed-registry-integration.md
pkg/tbtc/signer/docs/roast-phase-2-coordinator-policy-enforcement.md
pkg/tbtc/signer/docs/roast-phase-3-attempt-transcript-replay-hardening.md
pkg/tbtc/signer/docs/roast-phase-4-liveness-policy-recovery.md
pkg/tbtc/signer/docs/roast-phase-5-baseline-calibration.md
pkg/tbtc/signer/docs/roast-phase-5-rollout-runbook.md
pkg/tbtc/signer/docs/roast-phase-5-security-rollout-gates.md
pkg/tbtc/signer/docs/rust-rewrite-bootstrap.md
pkg/tbtc/signer/docs/signer-api-contract-decision-brief.md
pkg/tbtc/signer/docs/tbtc-signer-secret-material-hardening-plan.md
pkg/tbtc/signer/docs/tee-whitelisted-signer-enforcement-plan.md
pkg/tbtc/signer/docs/true-late-t-of-n-finalize-considerations.md
pkg/tbtc/signer/include/frost_tbtc.h
pkg/tbtc/signer/scripts/admission-candidate.sample.json
pkg/tbtc/signer/scripts/admission-existing.sample.json
pkg/tbtc/signer/scripts/admission-override-registry.sample.json
pkg/tbtc/signer/scripts/admission-override.sample.json
pkg/tbtc/signer/scripts/admission-policy-v1.sample.json
pkg/tbtc/signer/scripts/formal/check_roast_attempt_context_vectors.mjs
pkg/tbtc/signer/scripts/formal/run_tla_models.sh
pkg/tbtc/signer/scripts/run_phase5_chaos_suite.sh
pkg/tbtc/signer/src/api.rs
pkg/tbtc/signer/src/bin/admission_checker.rs
pkg/tbtc/signer/src/engine.rs
pkg/tbtc/signer/src/errors.rs
pkg/tbtc/signer/src/ffi.rs
pkg/tbtc/signer/src/go_math_rand.rs
pkg/tbtc/signer/src/lib.rs
pkg/tbtc/signer/test/vectors/roast-attempt-context-v1.json
pkg/tbtc/signer/tests/p2tr_signature_fraud_vectors.rs

Adds a focused workflow that runs the Rust signer's formal-invariant test suite + TLA model checks. Moved from threshold-network/tbtc-v2/.github/workflows/ci-formal-verification.yml (jobs `signer-formal-invariants` + `tla-model-checks`) per extraction plan v38 §3.1 — the signer code lives here at pkg/tbtc/signer/, not in tbtc-v2, so the CI jobs that exercise it belong here too. Jobs - signer-formal-invariants: cargo test --manifest-path pkg/tbtc/signer/ Cargo.toml formal_verification_ (filter to formal-only cases) - tla-model-checks: pkg/tbtc/signer/scripts/formal/run_tla_models.sh (iterates over .cfg files in pkg/tbtc/signer/docs/formal/models/ and runs TLC against each; MODELS_PATH env var allows override per the path-normalization commit b84b574c on this branch) Triggers - pull_request on pkg/tbtc/signer/** changes + this workflow file - schedule nightly at 05:23 UTC (mirrors monorepo's pattern of running formal invariants both on PRs and nightly) - workflow_dispatch for manual runs Related changes in companion PR threshold-network/tbtc-v2#971: - Removed these jobs from canonical tbtc-v2's ci-formal-verification.yml - Added a comment in that file pointing here Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/tbtc-signer-formal.yml:
- Around line 25-26: The checkout steps using actions/checkout@v4 persist git
credentials by default; update each Checkout step (the uses: actions/checkout@v4
entries) to add with: persist-credentials: false so credentials are not stored
in the runner after checkout. Ensure both occurrences of actions/checkout@v4 in
the workflow are modified accordingly.
- Line 26: Update the GitHub Actions workflow to pin action versions and harden
checkout credentials: replace occurrences of actions/checkout@v4 with
actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 and add with:
persist-credentials: false to both checkout steps that use checkout, replace
dtolnay/rust-toolchain@stable with
dtolnay/rust-toolchain@29eef336d9b2848a0b548edc03f92a220660cdb8, and replace
actions/setup-java@v4 with
actions/setup-java@c1e323688fd81a25caa38c78aa6df2d33d3e20d9 so the workflow pins
SHA-based commits and disables persisting credentials on checkout.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 88553e79-6db4-4e95-98fe-00c56e1bc640

📥 Commits

Reviewing files that changed from the base of the PR and between bfd7658 and 551bd42.

📒 Files selected for processing (1)

.github/workflows/tbtc-signer-formal.yml

coderabbitai · 2026-05-26T16:17:49Z

+      - name: Checkout
+        uses: actions/checkout@v4


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash set -euo pipefail FILE=".github/workflows/tbtc-signer-formal.yml" echo "== File exists? ==" ls -l "$FILE" echo "== Lines 1-120 with numbers ==" nl -ba "$FILE" | sed -n '1,120p'

Repository: threshold-network/keep-core

Length of output: 272

🏁 Script executed:

#!/bin/bash set -euo pipefail FILE=".github/workflows/tbtc-signer-formal.yml" echo "== File exists? ==" ls -l "$FILE" echo "== Lines 1-120 with numbers ==" nl -ba "$FILE" | sed -n '1,120p'

Repository: threshold-network/keep-core

Length of output: 272

🏁 Script executed:

#!/bin/bash set -euo pipefail FILE=".github/workflows/tbtc-signer-formal.yml" echo "== Showing lines 1-80 ==" awk 'NR>=1 && NR<=80 {printf "%4d: %s\n", NR, $0}' "$FILE" echo "== Showing checkout-related lines (search) ==" rg -n "name:\s*Checkout|uses:\s*actions/checkout|persist-credentials" "$FILE" || true

Repository: threshold-network/keep-core

Length of output: 2302

Set persist-credentials: false on both actions/checkout@v4 steps

The current workflow checks out without disabling persisted git credentials; add with: persist-credentials: false to reduce credential exposure.

Suggested patch

- name: Checkout uses: actions/checkout@v4 + with: + persist-credentials: false @@ - name: Checkout uses: actions/checkout@v4 + with: + persist-credentials: false

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

- name: Checkout

uses: actions/checkout@v4

- name: Checkout

uses: actions/checkout@v4

with:

persist-credentials: false

🧰 Tools

🪛 zizmor (1.25.2)

[warning] 25-26: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false

(artipacked)

[error] 26-26: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/tbtc-signer-formal.yml around lines 25 - 26, The checkout steps using actions/checkout@v4 persist git credentials by default; update each Checkout step (the uses: actions/checkout@v4 entries) to add with: persist-credentials: false so credentials are not stored in the runner after checkout. Ensure both occurrences of actions/checkout@v4 in the workflow are modified accordingly.

coderabbitai · 2026-05-26T16:17:50Z

+    timeout-minutes: 30
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash set -euo pipefail for repo in actions/checkout actions/setup-java dtolnay/rust-toolchain; do echo "=== $repo ===" git ls-remote "https://github.com/${repo}.git" | head -n 5 echo done

Repository: threshold-network/keep-core

Length of output: 535

🏁 Script executed:

#!/bin/bash set -euo pipefail # 1) Inspect workflow around the referenced lines echo "=== Workflow file context ===" nl -ba .github/workflows/tbtc-signer-formal.yml | sed -n '1,120p' # 2) Resolve refs to commit SHAs (avoid SIGPIPE by capturing then slicing) resolve_ref() { local repo="$1" ref="$2" local url="https://github.com/${repo}.git" local tmp="/tmp/lsremote_${repo//\//_}_${ref//\//_}" git ls-remote "$url" "$ref" > "$tmp" 2>/dev/null || true echo "--- ${repo} ${ref} ---" cat "$tmp" || true echo rm -f "$tmp" || true } echo "=== Resolve action refs to SHAs ===" resolve_ref "actions/checkout" "refs/tags/v4" resolve_ref "actions/checkout" "refs/heads/v4" resolve_ref "actions/setup-java" "refs/tags/v4" resolve_ref "actions/setup-java" "refs/heads/v4" resolve_ref "dtolnay/rust-toolchain" "refs/heads/stable" resolve_ref "dtolnay/rust-toolchain" "refs/tags/stable"

Repository: threshold-network/keep-core

Length of output: 162

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "=== Workflow file context (numbered) ===" python3 - <<'PY' from pathlib import Path p = Path(".github/workflows/tbtc-signer-formal.yml") lines = p.read_text().splitlines() start, end = 1, min(len(lines), 140) for i in range(start, end+1): print(f"{i}\t{lines[i-1]}") PY echo echo "=== Resolve action refs to commit SHAs ===" resolve_ref() { local repo="$1" ref="$2" local url="https://github.com/${repo}.git" echo "--- ${repo} ${ref} ---" git ls-remote "$url" "$ref" || true echo } resolve_ref "actions/checkout" "refs/tags/v4" resolve_ref "actions/checkout" "refs/heads/v4" resolve_ref "actions/setup-java" "refs/tags/v4" resolve_ref "actions/setup-java" "refs/heads/v4" resolve_ref "dtolnay/rust-toolchain" "refs/heads/stable" resolve_ref "dtolnay/rust-toolchain" "refs/tags/stable"

Repository: threshold-network/keep-core

Length of output: 2438

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "Total lines: $(python3 - <<'PY'\nimport pathlib\np=pathlib.Path('.github/workflows/tbtc-signer-formal.yml')\nprint(len(p.read_text().splitlines()))\nPY)" echo echo "=== uses: occurrences ===" rg -n "uses:" .github/workflows/tbtc-signer-formal.yml || true echo echo "=== persist-credentials occurrences ===" rg -n "persist-credentials" .github/workflows/tbtc-signer-formal.yml || true

Repository: threshold-network/keep-core

Length of output: 133

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "Total lines: $(python3 - <<'PY' import pathlib p=pathlib.Path('.github/workflows/tbtc-signer-formal.yml') print(len(p.read_text().splitlines())) PY)" echo echo "=== uses: occurrences ===" rg -n "uses:" .github/workflows/tbtc-signer-formal.yml || true echo echo "=== persist-credentials occurrences ===" rg -n "persist-credentials" .github/workflows/tbtc-signer-formal.yml || true

Repository: threshold-network/keep-core

Length of output: 424

Pin GitHub Actions to commit SHAs and harden checkout credentials.

actions/checkout@v4 (lines 26, 44) -> actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5; also add with: persist-credentials: false to both checkout steps.

dtolnay/rust-toolchain@stable (line 29) -> dtolnay/rust-toolchain@29eef336d9b2848a0b548edc03f92a220660cdb8.

actions/setup-java@v4 (line 47) -> actions/setup-java@c1e323688fd81a25caa38c78aa6df2d33d3e20d9.

🧰 Tools

🪛 zizmor (1.25.2)

[warning] 25-26: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false

(artipacked)

[error] 26-26: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/tbtc-signer-formal.yml at line 26, Update the GitHub Actions workflow to pin action versions and harden checkout credentials: replace occurrences of actions/checkout@v4 with actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 and add with: persist-credentials: false to both checkout steps that use checkout, replace dtolnay/rust-toolchain@stable with dtolnay/rust-toolchain@29eef336d9b2848a0b548edc03f92a220660cdb8, and replace actions/setup-java@v4 with actions/setup-java@c1e323688fd81a25caa38c78aa6df2d33d3e20d9 so the workflow pins SHA-based commits and disables persisting credentials on checkout.

Resolves CI failures on PR #4005 (signer mirror): 1. TLA model checks: run_tla_models.sh lacked executable bit at canonical HEAD. CI ran the script directly (no `bash` prefix), which fails with `Permission denied`. Fixed via `git update-index --chmod=+x`. 2. Signer formal invariants: engine.rs's formal_verification_roast_attempt_context_shared_vectors_match_ expected_values test referenced vectors at a path stale from the umbrella's docs/frost-migration/test-vectors/ layout. The manifest places the vector at the canonical-signer test/vectors/ subdir (pkg/tbtc/signer/test/vectors/roast-attempt-context-v1.json per the source-to-target map). Updated the `PathBuf::from(env!("CARGO_MANIFEST_DIR")).join(...)` argument from `../../docs/frost-migration/test-vectors/roast-attempt-context-v1.json` (umbrella-relative) to `test/vectors/roast-attempt-context-v1.json` (signer-CARGO_MANIFEST_DIR-relative, where the vector actually lives at canonical HEAD). Verified locally: - ls -l shows executable bit set on run_tla_models.sh - engine.rs path now resolves to the correct mirror location - Vector exists at pkg/tbtc/signer/test/vectors/roast-attempt-context-v1.json Same fix needs to be applied to PR #4007 (stacked on #4005) in a follow-up commit on its branch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

PR #4005 signer formal invariants test formal_verification_p2tr_signature_fraud_vectors_match_bitcoin_crate was failing because: 1. The umbrella source manifest only mapped p2tr-signature-fraud-v0.json to the tbtc-v2 target (docs/test-vectors/). It was not mirrored to the keep-core (signer) target. 2. tests/p2tr_signature_fraud_vectors.rs referenced the vector at the umbrella-relative path `../../docs/frost-migration/test-vectors/p2tr-signature-fraud-v0.json` (CARGO_MANIFEST_DIR-relative -> repo-root + docs/frost-migration/...). That directory does not exist on canonical keep-core. This is a structural omission in the manifest, not a divergence: the cross-language vector test exists in both Solidity (tbtc-v2 side) and Rust (signer side), and the vector is genuinely needed in both places to verify cross-implementation consistency. Fix - Mirror p2tr-signature-fraud-v0.json (598 lines, byte-identical content from tbtc-v2 mirror at docs/test-vectors/) to pkg/tbtc/signer/test/vectors/p2tr-signature-fraud-v0.json. - Update the test path in tests/p2tr_signature_fraud_vectors.rs from `../../docs/frost-migration/test-vectors/p2tr-signature-fraud-v0.json` to `test/vectors/p2tr-signature-fraud-v0.json` (CARGO_MANIFEST_DIR = pkg/tbtc/signer/, so test/vectors/... resolves correctly). Two stale comment references remain in - pkg/tbtc/signer/docs/roast-implementation-plan.md:265 - pkg/tbtc/signer/scripts/formal/check_roast_attempt_context_vectors.mjs:19 Both are comment-only doc pointers to the source layout; they do not affect runtime. Left as-is to preserve the umbrella -> canonical provenance trail. Manifest update follows in a stacked PR on tlabs-xyz/tbtc#10: - Add p2tr-signature-fraud-v0.json -> signer target mapping (test/vectors/p2tr-signature-fraud-v0.json). - Reclassify tests/p2tr_signature_fraud_vectors.rs from mirror to allowlisted-divergence (path differs from umbrella). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Stacked on extraction/frost-signer-mirror-2026-05-26 / PR #4005. Adds optional taproot_merkle_root_hex to start/finalize signing rounds, binds it into request fingerprints and round IDs, signs and aggregates with frost-secp256k1-tr Taproot tweaks, and verifies tweaked aggregates in tests. Verification: cargo test in pkg/tbtc/signer.

…ge (re-review) Codex re-review on the Round2 completion gate: [P1] The gate trusted aggregated_interactive_attempt_markers keyed by attempt_id alone, but interactive_aggregate never binds attempt_id to the aggregated package/ message - it only checks the shares aggregate under the session key. So a caller with any valid aggregate could replay it under a DIFFERENT live attempt's id, set that marker, and the new gate would then refuse the real Round2 before its nonce is consumed (an availability/DoS regression the single-member engine never had). Fix: make the completion marker MESSAGE-BOUND - interactive_aggregated_marker = "{attempt_id}@{message_digest}". interactive_aggregate writes it from the package it actually aggregated (all five marker sites: both completion pre-checks, the capacity guard, the insert, and the persist-rollback). The Round2 gate recomputes it from the message THIS member opened, so it preempts Round2 only for a genuine same-message completion; a replayed aggregate carrying a different message sets an unrelated marker that cannot match. Same-message re-aggregation is still rejected (existing restart/repeat tests pass unchanged - they re-aggregate the same message). [P2] On that genuine completion rejection, the unsigned sibling's entry is now dead, so free it (zeroizing its nonces) instead of letting it hold a live-member cap slot until the TTL sweep. Only the matching member's entry is removed. Tests: the multi-seat refusal test now also asserts the marker is bound (attempt_id@digest, not the bare id) and that the refused sibling's entry is freed. All 294 lib tests pass; cargo fmt clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…review) Codex re-review (P2, valid): the previous commit made the aggregate completion marker message-bound (attempt_id@digest), but a completion persisted by the pre-binding engine is stored as the BARE attempt_id. The new checks only looked for the bound form, so after an upgrade a previously completed attempt looked unfinished - repeat InteractiveAggregate and the Round2 completion gate would no longer fail closed for it. Add interactive_attempt_aggregated(markers, attempt_id, digest) = bound form OR a legacy bare attempt_id marker (fail-closed on read), mirroring the consumed-marker helper. Use it at the three read sites (the Round2 completion gate and both interactive_aggregate pre-checks). New writes stay bound-only, so the durable record migrates forward on the next completion while legacy completions stay final. Test: interactive_honors_legacy_bare_aggregate_completion_marker injects a bare attempt_id marker (a pre-upgrade completion) and asserts Round2 then fails closed with InteractiveAttemptAlreadyAggregated. All 295 lib tests pass; cargo fmt clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…e-review) Codex re-review (P1, valid): attempt_id is derived from session/message/attempt#/ coordinator/included - NOT the taproot root. So the message-bound completion marker (attempt_id@message_digest) still collided across taproot tweaks: a completion aggregated for one root (or key-path None) recorded the same marker the Round2 gate checks for a live seat opened with a different root, wrongly preempting that seat's Round2 (and zeroizing it) even though the signatures differ per tweak. Bind the canonical taproot root into the marker too - interactive_aggregated_marker(attempt_id, message_digest, taproot_root) = "{attempt_id}@{message_digest}@{root}" (root = hex, or "keypath" for None, which cannot collide with a 64-hex root). interactive_aggregate writes it from the root it aggregated under; the Round2 gate recomputes it from the root THIS member opened with. The completion marker now binds the full signing-task identity (attempt + msg + root); the legacy bare-id fallback is unchanged. Test interactive_round2_completion_marker_binds_taproot_root: a completion recorded for a different root does not preempt a key-path member's Round2, while the same-root completion does. All 296 lib tests pass; cargo fmt clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…e-review) Codex re-review (P2, valid): when a multi-seat attempt aggregates with a threshold subset that EXCLUDES a local member which opened/Round1'd the same attempt + root, that member never calls Round2 (it is not a signer), so the Round2 completion gate never runs for it - its interactive_signing entry (nonces, key, message) and its live-member capacity slot stayed resident until the 1h TTL or an explicit abort. interactive_aggregate's success path now frees the LOCAL siblings finalized by the completion: after persisting the marker, remove + zeroize every interactive_signing entry on (attempt_id, taproot root) - the signers' entries were already removed at their Round2, and a sibling on a DIFFERENT root is a distinct signing task and is left untouched. The Round2 completion gate stays as the defense for a sibling that RE-OPENS the finalized attempt and tries to sign. Test interactive_round2_refused_after_aggregate_for_unsigned_sibling now asserts the non-signing sibling is freed at aggregation, then re-opens it and confirms the gate still refuses its Round2. All 296 lib tests pass; cargo fmt clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…o (re-review) Codex re-review (P2, valid): the finalized-sibling cleanup added in the previous commit matched entries on (attempt_id, taproot root) but NOT the message. Because attempt_id is provided by the caller separately from the package, a mismatched aggregate - a valid signing package for message B submitted under a live message-A attempt's id (same root) - would delete the message-A seats' live nonce/commitment state, forcing that unrelated attempt to restart, even though the stored marker (now message-bound) correctly would not match its Round2. Match the FULL finalized identity in the cleanup filter: attempt_id AND hash_hex(entry.message_bytes) == aggregated_message_digest AND taproot root - the same (attempt + message + root) identity the completion marker binds. Test interactive_aggregate_cleanup_is_message_bound: a valid stateless aggregate over message B under message A's attempt id leaves the live message-A seat intact. All 297 lib tests pass; cargo fmt clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…t operators (#4098) ## What Keys a session's interactive signing state by `member_identifier` (`interactive_signing: Option<InteractiveSigningState>` → `BTreeMap<u16, InteractiveSigningState>`) so a **multi-seat operator can run interactive signing for all its seats in one process**. ## Why A multi-seat operator runs concurrent interactive signing per seat, all sharing one member-independent `SessionID`, in one process (the engine state is a process-global singleton). Today that breaks three ways: the 2nd seat's `InteractiveSessionOpen(same session, different member)` → `SessionConflict`; `round2` frees the whole state (wiping siblings); and the per-attempt consumed-nonce marker makes a sibling's `round2` falsely trip `ConsumedNonceReplay`. Surfaced by the keep-core real-cgo e2e (#4097). ## Design (reviewed: Gemini sound, Codex approve-with-refinements) Both reviewers rejected a session-wide live-attempt model — it would let one seat erase another's nonces, the exact bug being removed. - **Per-member live attempt**: `open(M, strictly-newer attempt)` replaces only M's entry (zeroizing M's old nonces); a stale/equal open for M is rejected, but a sibling on a different attempt is untouched. Seats are independent — exactly as separate processes would be. - **round2** removes only the member's entry (siblings stay live); the durable marker carries replay protection. - **Consumed markers** keyed per-`(attempt_id, member_id)` (composite); a **legacy bare `attempt_id` marker is honored fail-closed** on read. `aggregated_interactive_attempt_markers` stay per-attempt (one signature per attempt over public data). - **abort** is session-level over the map (removes all matching entries); **TTL sweep** is per-entry by `opened_at_unix` (siblings survive); **capacity** counts live member entries (a new member is a slot; a same-member replacement isn't). - `interactive_state_for_attempt_mut` is a member-keyed lookup. Live state still never persists (empty map on reload). Zeroization preserved via `InteractiveRound1State::Drop`. ## Tests New `interactive_multi_seat_two_members_one_process_aggregate_bip340`: two seats open the same attempt in one process, Round1 independently, member 1's Round2 frees only its entry and writes only its marker (member 2 stays live and unblocked), and the two interactive shares **aggregate to a valid BIP-340 signature**. Existing single-member tests updated for the map. **All 292 lib tests pass; `cargo fmt` clean.** Follow-ups (additional edge tests Codex itemized — per-member attempt advance, abort-by-attempt over multiple members, capacity new-vs-replacement) can land in review. 🤖 Generated with [Claude Code](https://claude.com/claude-code)

…attempt edge cases Two follow-up edge tests Codex itemized for the multi-seat interactive engine fix (#4098), test-only: - interactive_capacity_counts_new_members_not_replacements: at a live-member cap of 1, a member advancing to a newer attempt replaces its own entry (no new slot, succeeds), while a DIFFERENT member is a new entry that trips the cap and fails closed - pinning that the cap counts member entries, not sessions. - interactive_abort_by_attempt_removes_all_members_on_that_attempt: abort with an attempt_id filter is session-level over the member map - it removes every local seat on that attempt while a sibling seat on a different attempt survives. All 299 lib tests pass; cargo fmt clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…attempt edge cases (#4100) ## What Two follow-up edge tests Codex itemized during the multi-seat interactive engine fix (#4098), test-only: - **`interactive_capacity_counts_new_members_not_replacements`** — at a live-member cap of 1: a member advancing to a newer attempt *replaces* its own entry (no new slot, succeeds), while a *different* member is a new entry that trips the cap and fails closed. Pins that the cap counts member entries, not sessions. - **`interactive_abort_by_attempt_removes_all_members_on_that_attempt`** — abort with an `attempt_id` filter is session-level over the member map: it removes every local seat on that attempt while a sibling seat on a *different* attempt survives. All 299 lib tests pass; `cargo fmt` clean. 🤖 Generated with [Claude Code](https://claude.com/claude-code)

…tc_abi_version) The Go cgo bridge and libfrost_tbtc are compiled separately and linked at runtime via dlsym; a symbol that resolves but changed MEANING (struct/JSON contract change) is silent corruption. The existing frost_tbtc_version returns a human string ("tbtc-signer/0.1.0-bootstrap") that cannot be machine-checked. Add frost_tbtc_abi_version() returning a structured FFI CONTRACT version {abi_major, abi_minor} (api::FrostTbtcAbiVersionResult), so a consumer can fail closed against an incompatible lib (the Go-side assertion lands separately). Starts at 1.0 - NOT 0.x, to avoid semver pre-1.0 ambiguity. abi_major = any incompatible contract change (C signatures, JSON field meaning, required fields, enum/status values, serialization, memory ownership, crypto transcript semantics); abi_minor = a purely additive, backward-compatible change old consumers safely ignore. A consumer requires lib.abi_major == its_major AND lib.abi_minor >= the minor it needs. The human frost_tbtc_version string stays informational and unchanged. Design validated via Codex consult (major.minor; exact major, minimum minor; the {abi_major, abi_minor} JSON is the frozen root compatibility surface). 300 lib tests pass (new: abi_version_reports_the_contract_version pins 1.0); cargo fmt clean; the symbol is exported in the built cdylib. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Codex #4104 P2: the new #[no_mangle] frost_tbtc_abi_version is part of the public C ABI, but header-based consumers that compile against include/frost_tbtc.h had no prototype for it - the exported symbol and the advertised header could drift. Add the declaration alongside frost_tbtc_version (the header is hand-maintained; no cbindgen). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…tc_abi_version) (#4104) ## What Adds `frost_tbtc_abi_version()` — a structured **FFI contract version** `{abi_major, abi_minor}` — so the Go cgo bridge can **fail closed** against an incompatible `libfrost_tbtc` instead of silently misinterpreting a changed contract. The bridge and the lib are compiled separately and linked at runtime via `dlsym`; a symbol that still resolves but changed *meaning* (struct/JSON contract change) is silent corruption. The existing `frost_tbtc_version` returns a human string (`tbtc-signer/0.1.0-bootstrap`) that can't be machine-checked — this is the checkable companion. ## Scheme (Codex-validated) - **`abi_major`** — any *incompatible* change to the Go↔Rust contract: C signatures, JSON field meaning, required fields, enum/status values, serialization, memory ownership, crypto transcript semantics. - **`abi_minor`** — a purely *additive*, backward-compatible change old consumers safely ignore. - A consumer requires `lib.abi_major == its_major` **and** `lib.abi_minor >= the minor it needs`. - Starts at **1.0** (not 0.x, to avoid semver pre-1.0 ambiguity). The `{abi_major, abi_minor}` JSON is the **frozen root compatibility surface**. - `frost_tbtc_version` (human string) stays informational and unchanged. ## Notes - The **Go-side assertion** (fail-closed preflight + the `requiredMajor`/`requiredMinMinor` constants + a CI-gate check) lands as a separate keep-core PR that also bumps the signer pin to this commit. - 300 lib tests pass (new `abi_version_reports_the_contract_version` pins 1.0); `cargo fmt` clean; the symbol is exported in the built cdylib. 🤖 Generated with [Claude Code](https://claude.com/claude-code)

…ctive round start_sign_round cleared the active sign round on an authorized advance *before* validating the incoming attempt context against the deterministic RFC-21 coordinator selection. A malformed advance whose transition evidence was internally consistent but whose coordinator_identifier failed deterministic validation destroyed the in-memory round, then returned an error without persisting. Because the original attempt id stayed in consumed_attempt_ids, that attempt could never be re-signed in-memory, bricking the signing session until the durable (un-cleared) state was reloaded on restart. Run validate_attempt_context on the incoming context before clear_active_sign_round_for_attempt_transition, so a rejected advance leaves the active round intact. Add a regression test that forges the coordinator on an otherwise-valid advance and asserts the original attempt remains signable. This path is gated off in production by enforce_transitional_signing_disabled_in_production, so the impact is limited to the dev/staging transitional-nonce path. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ng window Two fail-open / DoS fixes in the signing-policy config surface: - signer_profile_is_production() panicked on any TBTC_SIGNER_PROFILE value other than production/development/empty. On the env-fallback path that value is unvalidated, so a single typo (e.g. "staging") aborted every profile-gated FFI operation as a generic internal error -- a process-wide denial of service. Treat an unrecognized profile as production (fail closed) and warn once instead of panicking. - load_signing_policy_firewall_config() accepted an equal-bounds UTC window (start == end). utc_hour_in_window treats start == end as "always in window", so the time-of-day control was silently disabled (fail open). Reject equal bounds, and out-of-range (>= 24) hours, at load time. Add regression tests for both. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The signer-dependency-audit job runs cargo-deny against the committed Cargo.lock, but the clippy/test jobs and build.sh resolved dependencies without --locked. A PR that edits Cargo.toml so the committed lock no longer satisfies it would let those jobs silently re-resolve to newer, unaudited crate versions (running their build scripts and proc-macros on the runner) while the advisory gate still audits the stale lock -- so the security gate and the code actually exercised could diverge. Add --locked to the clippy, test, formal-test, and release-build invocations so a lock/manifest mismatch fails loudly instead of silently re-resolving. cargo fmt is intentionally left unchanged (it does not accept --locked). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ctive round (#4111) ## Summary Follow-up to #4005 (FROST/ROAST signer). Fixes a session-bricking bug in the attempt-advance path of `start_sign_round`. On an authorized attempt advance, `start_sign_round` cleared the active sign round (`clear_active_sign_round_for_attempt_transition`) **before** validating the incoming attempt context against the deterministic RFC-21 coordinator selection. A malformed advance whose transition evidence was internally consistent but whose `coordinator_identifier` failed deterministic validation destroyed the in-memory round, then returned an error **without persisting**. Because the original attempt id stayed in `consumed_attempt_ids`, that attempt could never be re-signed in-memory — the signing session was bricked until the durable (un-cleared) state was reloaded on restart. ## Fix Run `validate_attempt_context` on the incoming context **before** clearing the active round, so a rejected advance leaves the round intact. The same validation already ran later on the fresh-attempt path, so no legitimate advance is newly rejected — the check is simply moved ahead of the destructive clear. ## Test Adds `rejected_forged_advance_preserves_active_sign_round`: it forges the coordinator on an otherwise-valid advance and asserts the original attempt remains signable. Verified to **fail** against the unfixed code (`ConsumedAttemptReplay`) and **pass** with the fix. ## Scope Gated off in production by `enforce_transitional_signing_disabled_in_production`, so impact is limited to the dev/staging transitional-nonce path. _Found during review of #4005._ 🤖 Generated with [Claude Code](https://claude.com/claude-code)

…ng window (#4112) ## Summary Follow-up to #4005 (FROST/ROAST signer). Two fail-open / DoS fixes in the signing-policy config surface. ### 1. `signer_profile_is_production()` panicked on an unknown profile value On the env-fallback path `TBTC_SIGNER_PROFILE` is unvalidated, so any value other than `production`/`development`/empty hit `panic!`. The panic is caught at the FFI boundary and redacted to a generic internal error, so a single typo (e.g. `staging`) aborted **every** profile-gated operation — a process-wide denial of service until the env var was corrected. Now an unrecognized profile is treated as `production` (fail-closed: the strictest gate set) with a one-time warning, instead of panicking. Production is the safe/strict direction for every caller (provenance gate, bootstrap-dealer-DKG disable, transitional-signing disable, ROAST strict mode, env state-key-provider rejection). The install path (`init_config`) still validates the profile, so this only affects the env-fallback path. ### 2. `load_signing_policy_firewall_config()` accepted a zero-width UTC window `utc_hour_in_window` treats `start == end` as "always in window", so an equal-bounds window silently disabled the time-of-day control (fail-open) rather than restricting it. The loader now rejects equal bounds — and out-of-range (`>= 24`) hours — at load time. ## Tests - `unknown_profile_value_fails_closed_to_production` - `signing_policy_firewall_rejects_equal_utc_window_bounds` _Found during review of #4005._ 🤖 Generated with [Claude Code](https://claude.com/claude-code)

## Summary Follow-up to #4005 (FROST/ROAST signer). Pins cargo dependency resolution in CI and the release build. The `signer-dependency-audit` job runs `cargo-deny` against the committed `Cargo.lock`, but the clippy/test jobs and `build.sh` resolved dependencies **without** `--locked`. A PR that edits `Cargo.toml` so the committed lock no longer satisfies it would let those jobs silently re-resolve to newer, unaudited crate versions (running their build scripts and proc-macros on the runner) while the advisory gate still audits the stale lock — so the security gate and the code actually exercised could diverge. ## Fix Adds `--locked` to the `clippy`, `test`, formal-`test`, and release-`build` invocations so a lock/manifest mismatch fails loudly instead of silently re-resolving. `cargo fmt` is intentionally left unchanged (it does not accept `--locked`). _Found during review of #4005._ 🤖 Generated with [Claude Code](https://claude.com/claude-code)

…he outcome persist_engine_state_to_storage resolved the state-encryption key inside encode_encrypted_state_envelope, i.e. while the caller held the global ENGINE_STATE mutex. For the `command` (KMS/HSM) provider that forks and waits on the key subprocess on every state mutation while the lock is held, so a slow or hung key command stalled every concurrent signer operation for up to the configured timeout (default 30s, max 300s). Split persist into a thin wrapper (cold paths and tests) and persist_engine_state_to_storage_with_key, which takes pre-resolved key material. Every hot per-operation path (run_dkg, start/finalize_sign_round, build_taproot_tx, trigger_emergency_rekey, promote/rollback_canary, refresh_shares, interactive round2/aggregate) now resolves the key with state_encryption_key_material() BEFORE acquiring the lock, so the subprocess never runs under the lock. Only the in-memory serialize + encrypt + atomic write stay under the lock (they must, to preserve write ordering and avoid lost updates between concurrent ops). The resolution outcome is DEFERRED, not propagated eagerly: the key Result is consulted (via require_resolved_state_key) only at a site that actually persists. Idempotent replays and pre-persist rejections return without ever consulting it, so a transient key-provider outage cannot turn a cached read (e.g. re-serving an already-computed signature on a ROAST retry) into a failure. In the interactive round2/aggregate paths, the key is resolved into a local BEFORE the consumption/aggregation marker is inserted, so a key failure returns cleanly (no marker written) rather than escaping the marker-rollback block. Semantics preserved: the key is still resolved fresh on every call (no caching), so external KMS rotation and trigger_emergency_rekey keep picking up the current key. Tests: idempotent_build_tx_replay_survives_state_key_outage (a replay returns its cached tx with a failing key command) and interactive_round2_state_key_failure_does_not_burn_attempt (a Round2 that fails at the key step leaves no consumption marker and the attempt still succeeds on retry). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

… not before it #4114 resolved the state-encryption key OFF the ENGINE_STATE lock to keep the command-provider KMS subprocess from stalling the lock. But ENGINE_STATE already serializes the persisted-state WRITES, so resolving the key before the lock left key selection out-of-order with the writes: a caller that resolved an older key (before a provider key rotation) could win the LAST write after a concurrent caller already persisted the rotated key, tagging the single state file with the stale key id. decode_encrypted_state_envelope accepts only the currently-resolved key id, so a restart then rejects the file as a key-id mismatch and the persisted state is unreadable. Resolve the key UNDER the held ENGINE_STATE guard, at the persist site, so key selection is in the same serialized order as the writes -- the last writer always encrypts with the then-current key. Non-marker sites call persist_engine_state_to_storage(&guard) (resolves under the guard then persists); the two interactive marker sites resolve via state_encryption_key_material()? BEFORE inserting the durable marker (preserving fail-closed-before-marker and the marker-rollback-on-persist-failure), then persist_with_key. The deferred require_resolved_state_key helper is removed. Idempotent replays and pre-persist rejections still return before any key resolution, so a transient key-provider outage cannot fail a non-persisting call (the idempotent_build_tx_replay_survives_state_key_outage regression still holds). TRADEOFF: this reverses #4114's off-lock optimization -- under the `command` provider the KMS subprocess now runs under the guard during actual persists (a bounded head-of-line stall; a hung command is bounded by terminate_state_key_command). A future off-lock-preserving design would need a cheap rotation signal from the key provider; recovering that is left as a follow-up. Addresses the Codex P1 review on #4114. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…TATE guard, in write order (#4114) ## Summary Resolves the at-rest state-encryption key **under the held `ENGINE_STATE` guard, at the persist site**, so key selection is ordered with the writes the guard already serializes. This PR began as a perf change (resolve the key *off* the lock to keep the `command`-provider KMS subprocess from stalling `ENGINE_STATE`). A Codex review found that approach introduced a **P1 state-loss bug**, so the PR now lands the correct ordering instead. ## The bug the off-lock approach introduced `ENGINE_STATE` already serializes the persisted-state *writes*, but resolving the key *before* the lock left key selection out-of-order with the writes: a caller that resolved an older key (before a provider key-rotation) could win the **last** write after a concurrent caller already persisted the rotated key, tagging the single state file with the stale key-id. `decode_encrypted_state_envelope` accepts only the currently-resolved key-id, so a restart then rejects the file as a key-id mismatch and the persisted state is **unreadable**. ## Fix Resolve the key under the held guard, at the persist site — the last writer always encrypts with the then-current key. Non-marker sites call `persist_engine_state_to_storage(&guard)` (resolves under the guard, then persists). The two interactive marker sites resolve via `state_encryption_key_material()?` **before** inserting the durable consumed/aggregated marker (preserving fail-closed-before-marker + marker-rollback-on-persist-failure), then `persist_..._with_key`. The deferred `require_resolved_state_key` helper is removed. **Invariants preserved:** idempotent replays / pre-persist rejections still return *before* any key resolution, so a transient key-provider outage cannot fail a non-persisting call (`idempotent_build_tx_replay_survives_state_key_outage` still holds). ## Verification Clean `cargo build` (no warnings); **all 329 crate tests pass**. An independent adversarial review confirmed all four ordering invariants at every converted site (0 correctness findings). ## Tradeoff (tracked in #4121) This reverses the off-lock optimization: under the `command`/KMS provider the subprocess now runs under the guard during actual persists (a bounded head-of-line stall; `env` provider is unaffected). Because the `production` profile mandates the `command` provider, this is the production path. Recovering off-lock safely needs added structure (a cheap provider key-id signal, a coordinated rotation protocol, or a partial-recovery persist-lock) — analyzed and tracked in **#4121**, gated on measuring whether the stall actually bottlenecks throughput. 🤖 Generated with [Claude Code](https://claude.com/claude-code)

Fixes two Codex-flagged P2 state-consistency holes in start_sign_round. 1. Persist failure left in-memory state diverged from durable state. After the fresh-round path mutated the session (consumed-replay markers, round state), a failed persist returned an error but the canonical idempotent serve then returned signature shares WITHOUT persisting -- so a restart could replay the round with no durable consumed marker. The idempotent cached serve now persists before serving when the round is not yet durable, tracked by a process-local SIGN_ROUND_PERSIST_PENDING marker; when the original persist already succeeded it still serves cached without persisting, preserving the 'idempotent replay survives a state-key-provider outage' property build_taproot_tx relies on (rollback is impossible -- the transition clear zeroizes the prior round material). 2. Active attempt cleared before later validation could fail. On an authorized ROAST attempt advance, clear_active_sign_round_for_attempt_transition ran before the fresh-path checks (participant resolution, included-set equality, quarantine, consumed-replay, share construction). A malformed advance that passed authorization but failed a later check destroyed the in-memory active round with no validated/persisted replacement, so the next StartSignRound could start a fresh attempt without transition evidence until a restart. The clear is now deferred until every fallible check has passed, just before the replacement round is installed and persisted. Adds two regression tests, both verified to fail against the pre-fix code. Design validated with Codex. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Codex re-review: the SIGN_ROUND_PERSIST_PENDING marker was process-global but cleared only at start_sign_round's own persist sites. If a start_sign_round persist failed (marker set) and a later UNRELATED successful persist (e.g. a DKG for another session) then wrote the whole engine state -- making that round durable -- the marker stayed stale-true. A subsequent idempotent replay of the now-durable round during a state-key-provider outage would re-enter the persist branch, try to persist again, and fail instead of serving the cached round. Move the marker into the persistence module and clear it inside persist_engine_state_to_storage_with_key on any successful write, so any operation's successful persist clears it. start_sign_round sets it on a fresh-round mutation (mark_sign_round_persist_pending) and reads it in the idempotent serve (sign_round_persist_pending); the explicit clears at the start_sign_round persist sites are removed (the persist clears it now). Adds a regression test (verified to fail against the pre-fix code) covering the unrelated-persist-then-outage replay. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Codex re-review: the process-global SIGN_ROUND_PERSIST_PENDING bit conflated all sessions. After one session's StartSignRound established a round and failed to persist, the bit stayed set process-wide, so the idempotent cached-serve branch -- which consulted it for EVERY session -- would force an UNRELATED, already- durable session's replay to re-persist and fail during the same state-key outage instead of serving its durable cached shares. An availability regression. Replace the global AtomicBool with a per-session set (SIGN_ROUND_PERSIST_PENDING_SESSIONS: OnceLock<Mutex<BTreeSet<String>>>) keyed by session_id: mark the specific session on a fresh-round mutation, consult that session in the cached-serve branch, and clear the WHOLE set on any successful persist (a persist writes the entire engine state, so every in-memory round becomes durable at once -- preserving the cross-operation durability the prior commit established). reset_for_tests clears the set explicitly. Both invariants stay intact: a non-durable round's replay still re-persists before serving (durability); a durable round's replay still serves without persisting (availability); a different session's failed persist no longer drags down an unrelated durable session. Adds a regression test (verified to fail against the global bool) and corrects the marker doc comment (mutual exclusion is the inner mutex, not the ENGINE_STATE guard, since clear also runs off-guard at startup and in test reset). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…#4123) ## Summary Fixes two Codex-flagged P2 state-consistency holes in `start_sign_round` (`pkg/tbtc/signer/src/engine/signing.rs`), where a failed persist or a later validation error could leave in-memory state diverged from durable state and bypass the ROAST replay/transition protections. ### 1. Persist before serving the idempotent cached round (bug #1) The fresh-round path mutates the session (consumed-replay markers + round state) and then persists. If that persist failed (state-key-provider or disk error) it returned an error and served no shares — but the **canonical idempotent serve** then returned signature shares **without persisting**, so a restart could replay the round with no durable consumed marker. Fix: the idempotent cached serve now persists before serving **when the round is not yet durable**, tracked by a process-local `SIGN_ROUND_PERSIST_PENDING` marker (set on fresh-round mutation, cleared after a successful persist). When the original persist already succeeded — the common case — it still serves the cached round **without** persisting, preserving the "idempotent replay survives a state-key-provider outage" property that `build_taproot_tx` relies on. (Rollback is impossible here: the transition clear zeroizes the prior round material.) ### 2. Defer clearing the active attempt until validation finishes (bug #2) On an authorized ROAST attempt advance, `clear_active_sign_round_for_attempt_transition` ran **before** the fresh-path checks (participant resolution, included-set equality, quarantine, consumed-replay, share construction). A malformed advance that passed authorization + the RFC-21 coordinator check but failed a **later** check destroyed the in-memory active round with no validated/persisted replacement — so `active_attempt_context` was dropped and the next `StartSignRound` could start a fresh attempt **without transition evidence** until a restart reloaded durable state. Fix: the advance is authorized but the clear is **deferred** until every fallible fresh-path check has passed, immediately before the replacement round is installed and persisted (the idempotent/conflict branch is skipped for an authorized advance via a flag). ## Tests Two regression tests, both **verified to fail against the pre-fix code**: - `authorized_advance_failing_later_check_preserves_active_sign_round` — an authorized advance that fails the included-set check leaves attempt 1 idempotently signable (pre-fix: `ConsumedAttemptReplay`). - `idempotent_sign_round_replay_persists_before_serving_after_persist_outage` — a sign round whose persist fails does not serve shares on idempotent replay until the state-key provider recovers (pre-fix: served shares with the key down). Full lib suite: **308 passing** (`cargo test --lib`), `cargo fmt --check` clean. Design validated with Codex. _Found during the Codex review of #4005._ Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> 🤖 Generated with [Claude Code](https://claude.com/claude-code)

mswilkison mentioned this pull request May 26, 2026

feat(frost): mirror FROST/Schnorr extraction from tBTC monorepo threshold-network/tbtc-v2#971

Open

coderabbitai Bot reviewed May 26, 2026

View reviewed changes

mswilkison mentioned this pull request May 26, 2026

[DRAFT - decision-gated] feat(tbtc/signer): add TEE hardening checker stack #4007

Draft

mswilkison added a commit that referenced this pull request May 26, 2026

merge: pick up CI fixes from #4005 base

9049f66

mswilkison added a commit that referenced this pull request May 26, 2026

merge: pick up P2TR vector + test path fix from #4005

8403d61

mswilkison added 8 commits May 27, 2026 11:41

fix(tbtc-signer): harden signer validation and retries

220cff2

ci(tbtc-signer): update tla tools checksum

c12c593

ci(tbtc-signer): add full rust checks

2e61c26

fix(tbtc-signer): preserve cached build tx retries

2930091

fix(tbtc-signer): harden production defaults

9cfcde6

fix(tbtc-signer): close hardening follow-ups

c1e72f5

Expose interactive FROST DKG signer ABI

506959d

Support Taproot tweaked signer rounds

2e0a054

mswilkison mentioned this pull request Jun 5, 2026

Support Taproot tweaked signer rounds #4018

Merged

mswilkison added 9 commits June 5, 2026 11:44

Support seeded tbtc-signer DKG

3a259ec

Reuse signer rounds across member identifiers

4c9c654

Harden Taproot signer aggregation

8f5aec7

Preserve legacy signer round fingerprints

a62cb26

Clarify signer exported key boundary

797417f

Stabilize signer round reuse fingerprints

e5b4f16

Classify malformed DKG seeds as validation errors

815ea72

Assert signer round retry idempotency

64d9d64

mswilkison and others added 11 commits June 20, 2026 17:33

mswilkison changed the title ~~feat(tbtc/signer): mirror FROST/ROAST Rust signer from tBTC monorepo~~ feat(tbtc/signer): FROST/ROAST Rust signer Jun 24, 2026

mswilkison and others added 3 commits June 26, 2026 11:19

This was referenced Jun 26, 2026

fix(tbtc/signer): validate incoming attempt context before clearing active round #4111

Merged

fix(tbtc/signer): fail closed on unknown profile and degenerate signing window #4112

Merged

ci(tbtc/signer): pin cargo dependency resolution with --locked #4113

Merged

mswilkison and others added 4 commits June 26, 2026 12:11

mswilkison mentioned this pull request Jun 26, 2026

fix(tbtc/signer): resolve the state-encryption key under the ENGINE_STATE guard, in write order #4114

Merged

mswilkison and others added 3 commits June 27, 2026 17:19

mswilkison mentioned this pull request Jun 28, 2026

fix(signer): defer sign-round clear + persist before idempotent serve #4123

Merged

mswilkison and others added 3 commits June 28, 2026 10:05

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(tbtc/signer): FROST/ROAST Rust signer#4005

feat(tbtc/signer): FROST/ROAST Rust signer#4005
mswilkison wants to merge 166 commits into
mainfrom
extraction/frost-signer-mirror-2026-05-26

mswilkison commented May 26, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot commented May 26, 2026 •

edited

Loading

Reviews paused

Walkthrough

Changes

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 26, 2026

Uh oh!

coderabbitai Bot May 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

mswilkison commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Source-PR Provenance

Why This Lives In keep-core

Layout Introduced

CI Coverage

Verification

Relationship To PR #3866

Plan Context

Uh oh!

coderabbitai Bot commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 26, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 26, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

mswilkison commented May 26, 2026 •

edited

Loading

coderabbitai Bot commented May 26, 2026 •

edited

Loading