feat(tbtc/signer): FROST/ROAST Rust signer#4005
Conversation
Lands the Rust signer at pkg/tbtc/signer/ alongside the existing Go DKG coordinator. Mirrors the signer slice of tlabs-xyz/tbtc:feat/frost-schnorr-migration (PR #10) at frozen tag frost-extraction-source-v1. Per extraction plan v38 §3.1, the signer co-locates with keep-core because: (a) HSM enforcement is external to signer code (standard PKCS#11/KMIP client), doesn't force a separate audit lifecycle; (b) coordinator coupling dominates (B-2 DKG coordinator already lives in this repo via PR #3866); (c) TEE adoption later is reversible without forcing a repo split. Layout - pkg/tbtc/signer/ — Rust crate with own Cargo.toml - pkg/tbtc/signer/docs/ — signer + ROAST + TEE specs - pkg/tbtc/signer/docs/formal/models/ — ROAST + TEE TLA+ models - pkg/tbtc/signer/scripts/formal/ — ROAST vector + TLA runner - pkg/tbtc/signer/test/vectors/ — roast-attempt-context-v1.json Files (49 total) - 47 mirror status (Rust source, signer docs, ROAST docs, TLA models, test vector, etc.) - 2 allowlisted-divergence: - pkg/tbtc/signer/scripts/formal/check_roast_attempt_context_vectors.mjs (path normalization to signer-repo paths) - pkg/tbtc/signer/scripts/formal/run_tla_models.sh (MODELS_PATH env var refactor; default pkg/tbtc/signer/docs/formal/models/) Provenance - Source repository: tlabs-xyz/tbtc - Source branch: feat/frost-schnorr-migration - Source tag (frozen): frost-extraction-source-v1 - Source commit (H): 52389bd5cccb5daeef195671feb7ca46be6e2f37 - Source manifest: extraction/frost-extraction-source-manifest.json (manifestSha256: f7295fb738104501eb6c0c2447a42122ceb5f684c7a7c5dfb50ecb0bde3a0ea0) - Source PR(s): #425 (tbtc-signer error codes) + the full FROST migration series; complete list per source manifest's commit range over tools/tbtc-signer/ and signer-adjacent docs/scripts. Build wiring (follow-up) This PR lands the signer source code, docs, vectors, and scripts. Rust toolchain CI integration (cargo build/test/clippy/fmt as a separate CI job alongside the existing Go jobs) is a follow-up — Go workspace ignores non-Go subdirectories automatically; the cargo crate at pkg/tbtc/signer/Cargo.toml is independently buildable. Known TBD (resolved pre-merge) - 2 allowlisted-divergence files have expectedTargetSha256 = <TBD> in the source manifest. Path normalization transformations need to be applied to make the scripts run in pkg/tbtc/signer/ context; this PR ships the source verbatim, follow-up commits on this branch apply the transformations before merge. - Dual signoff required on each allowlisted-divergence entry per plan v38 §4.2 (extraction lead + canonical repo maintainer). Verification (pre-merge per plan v38 §7.2) - 47 mirror files: sha256 equality between git show frost-extraction-source-v1:<sourcePath> and this PR's content at pkg/tbtc/signer/<targetPath>. - 2 allowlisted-divergence files: sha256 == expectedTargetSha256 (recorded post-transformation in source manifest). - PR-scoped rogue-file check: git diff --name-only main...HEAD only contains files in manifest's fileMap with targetKey "signer". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThis PR introduces ChangesCore Signer Implementation
Admission Checking & Policy Enforcement
Benchmarking & Formal Test Vectors
TLA+ Formal Verification Models
Design Documentation & Specifications
Build Configuration & Scripts
🎯 4 (Complex) | ⏱️ ~75 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
…ripts
Per extraction plan v38 §4.4, allowlisted-divergence files require
content normalization for the canonical context. This commit applies
the transformations declared in the source manifest for the 2 signer-
side allowlisted-divergence entries.
Transformations
- pkg/tbtc/signer/scripts/formal/check_roast_attempt_context_vectors.mjs:
Rewrite vector path from
`docs/frost-migration/test-vectors/roast-attempt-context-v1.json`
to canonical signer layout
`test/vectors/roast-attempt-context-v1.json`
(both relative to rootDir which is two levels up from the script
location; in canonical context that's pkg/tbtc/signer/)
- pkg/tbtc/signer/scripts/formal/run_tla_models.sh:
Rewrite MODEL_DIR default from
`$ROOT_DIR/docs/frost-migration/formal-verification/models`
to canonical signer layout
`$ROOT_DIR/docs/formal/models`
Plus MODELS_PATH env-var override for alternate environments (CI
matrices, local dev trees). ROOT_DIR is unchanged
(`$(dirname $BASH_SOURCE)/../..` resolves to pkg/tbtc/signer/ here).
Verification
- Both files retain identical behavior to their monorepo counterparts
when invoked from canonical signer layout
- Comments added documenting the path normalization with reference back
to the source manifest's allowlisted-divergence status
Recompute expectedTargetSha256 for both entries in the source manifest
and collect dual signoff before merge per plan v38 §4.2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 13
🧹 Nitpick comments (3)
pkg/tbtc/signer/src/api.rs (1)
196-202: 💤 Low valueMissing
#[serde(default, skip_serializing_if)]onscript_tree_hex.All other
Option<T>fields in this file use#[serde(default, skip_serializing_if = "Option::is_none")], butscript_tree_hexdoes not. This inconsistency means the field will serialize asnullwhen absent, rather than being omitted.Suggested fix
pub struct BuildTaprootTxRequest { pub session_id: String, pub inputs: Vec<TxInput>, pub outputs: Vec<TxOutput>, + #[serde(default, skip_serializing_if = "Option::is_none")] pub script_tree_hex: Option<String>, }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/tbtc/signer/src/api.rs` around lines 196 - 202, The BuildTaprootTxRequest struct's script_tree_hex Option field lacks the serde attributes used elsewhere; update the declaration of BuildTaprootTxRequest so the script_tree_hex field is annotated with #[serde(default, skip_serializing_if = "Option::is_none")] to match other Option<T> fields (preserving Clone/Debug/Deserialize/Serialize behavior) so it is omitted from serialized output when None instead of serializing as null.pkg/tbtc/signer/src/lib.rs (1)
391-421: 💤 Low value
std::env::set_varandstd::env::remove_varare not thread-safe.These functions are unsound in multi-threaded contexts and deprecated since Rust 1.66. While the tests appear to serialize access via
lock_test_state(), this guard must be held across all env mutations and checks within a test to prevent races with parallel test threads.Current usage appears safe given the locking pattern, but this is fragile. Consider using a dedicated test configuration mechanism that doesn't rely on process-wide environment mutation.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/tbtc/signer/src/lib.rs` around lines 391 - 421, EnvVarGuard's methods (EnvVarGuard::set, EnvVarGuard::unset) and its Drop rely on std::env::set_var/remove_var which are process-wide and not thread-safe; replace this pattern with a test-scoped, non-global solution such as using a crate that provides scoped environment variables (e.g., temp_env or similar) or refactor tests to accept an injected configuration object instead of mutating process env; update usages to acquire and hold the new scoped guard for the entire duration of tests that need env changes (or pass a Config struct into functions under test) and remove direct calls to std::env::set_var/remove_var and the EnvVarGuard Drop behavior to avoid races.pkg/tbtc/signer/src/bin/admission_checker.rs (1)
257-276: 💤 Low valueConsider cleaning up the temp file if rename fails.
If
fs::renamefails (e.g., cross-filesystem move or permissions issue), the temp file remains on disk. Adding a cleanup attempt in the error path would improve robustness.♻️ Proposed cleanup on error
fs::write(&tmp_path, serialized).map_err(|error| { format!( "failed to write override replay registry temp file [{}]: {error}", tmp_path.display() ) })?; - fs::rename(&tmp_path, path).map_err(|error| { - format!( - "failed to persist override replay registry [{}]: {error}", - path.display() - ) - }) + fs::rename(&tmp_path, path).map_err(|error| { + let _ = fs::remove_file(&tmp_path); // Best-effort cleanup + format!( + "failed to persist override replay registry [{}]: {error}", + path.display() + ) + }) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/tbtc/signer/src/bin/admission_checker.rs` around lines 257 - 276, persist_override_replay_registry currently leaves the temporary file (tmp_path) if fs::rename fails; modify the rename error path to attempt cleanup of tmp_path before returning the error. Specifically, call fs::remove_file(&tmp_path) (ignoring or logging its result) inside the Err branch that handles the rename failure so the function still returns the original formatted error for fs::rename but also tries to remove the leftover tmp file; reference persist_override_replay_registry, path, tmp_path, and fs::rename when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/tbtc/signer/docs/formal/models/README.md`:
- Around line 32-51: Update the incorrect repository paths in the traceability
matrix of pkg/tbtc/signer/docs/formal/models/README.md so links point to the
actual implementation and docs in this repo: change references to
tools/tbtc-signer/src/engine.rs to pkg/tbtc/signer/src/engine.rs for the entries
mentioning RoastAttemptStateMachine.tla (validate_attempt_context, replay
guards) and StateKeyProviderPolicy.tla (decode_encrypted_state_envelope,
encode_encrypted_state_envelope); change
docs/frost-migration/tee-whitelisted-signer-enforcement-plan.md to
pkg/tbtc/signer/docs/tee-whitelisted-signer-enforcement-plan.md for
TeeEnforcementModes.tla; and change
docs/frost-migration/roast-phase-5-security-rollout-gates.md to
pkg/tbtc/signer/docs/roast-phase-5-security-rollout-gates.md for
RoastRolloutPolicy.tla so readers can locate the referenced code and policy
docs.
In `@pkg/tbtc/signer/docs/roast-implementation-plan.md`:
- Line 93: Update the broken doc links in
pkg/tbtc/signer/docs/roast-implementation-plan.md by replacing repo-external
paths like `docs/frost-migration/roast-phase-0-spec-freeze.md` and any
`tools/tbtc-signer/...` references with the correct repo-local paths under
pkg/tbtc/signer/docs (or use correct relative paths from this markdown file);
search the file for all occurrences of `docs/frost-migration/...` and
`tools/tbtc-signer/...` (including the instances similar to the shown
`docs/frost-migration/roast-phase-0-spec-freeze.md`) and normalize each link so
it points to the new location in this package, preserving anchor fragments and
updating any link text as needed.
In `@pkg/tbtc/signer/docs/roast-phase-5-rollout-runbook.md`:
- Around line 26-29: Update the incorrect crate path used in the runbook
commands: replace occurrences of "cd tools/tbtc-signer" with "cd
pkg/tbtc/signer" for the benchmark command (`cargo bench --features
bench-restart-hook --bench phase5_roast`) and the chaos suite script invocation
(`./scripts/run_phase5_chaos_suite.sh`) so the commands run from the correct
crate directory.
In `@pkg/tbtc/signer/docs/roast-phase-5-security-rollout-gates.md`:
- Around line 92-99: Update stale repository paths in
roast-phase-5-security-rollout-gates.md: replace any occurrences of the old
tooling path "tools/tbtc-signer" with the new location "pkg/tbtc/signer" (e.g.,
update the run command `cd tools/tbtc-signer && cargo bench --features
bench-restart-hook --bench phase5_roast` to `cd pkg/tbtc/signer ...`), and
update any links referencing
`docs/frost-migration/roast-phase-5-baseline-calibration.md` to the document’s
new location in the repo (search for the exact link text and swap to the correct
path). Ensure all instances at the reported locations (around the run command
and the listed links) are changed consistently so operators following the
runbook hit the correct files and commands.
In `@pkg/tbtc/signer/docs/rust-rewrite-bootstrap.md`:
- Around line 10-13: The documentation still references the old crate path
"tools/tbtc-signer" and the validation command using that path; update every
occurrence to "pkg/tbtc/signer" in rust-rewrite-bootstrap.md (including the
header lines that list the crate, the C ABI include path `include/frost_tbtc.h`,
and any validation/build commands) so links and commands point to the colocated
pkg/tbtc/signer location; search for "tools/tbtc-signer" and replace with
"pkg/tbtc/signer" and verify the validation command and any examples reference
the new path.
In `@pkg/tbtc/signer/docs/signer-api-contract-decision-brief.md`:
- Around line 43-47: Update the two referenced paths in
signer-api-contract-decision-brief.md so they point to the mirrored crate
locations: replace `docs/frost-migration/rust-rewrite-bootstrap.md` with
`pkg/tbtc/signer/docs/frost-migration/rust-rewrite-bootstrap.md` and replace
`tools/tbtc-signer/src/lib.rs` with `pkg/tbtc/signer/src/lib.rs` (look for the
occurrences shown around the paragraph mentioning the bootstrap Rust crate and
file: `tools/tbtc-signer/src/lib.rs` and update those strings accordingly).
In `@pkg/tbtc/signer/docs/tbtc-signer-secret-material-hardening-plan.md`:
- Around line 6-7: The doc still uses the old crate path string
`tools/tbtc-signer`; update that scope reference to `pkg/tbtc/signer` throughout
the file (tbtc-signer-secret-material-hardening-plan.md) so the plan points to
the mirrored crate location, and scan for any other occurrences of
`tools/tbtc-signer` in this document to replace with `pkg/tbtc/signer`.
In `@pkg/tbtc/signer/docs/tee-whitelisted-signer-enforcement-plan.md`:
- Around line 296-298: Update the two broken cross-doc links in
pkg/tbtc/signer/docs/tee-whitelisted-signer-enforcement-plan.md (currently
referencing docs/frost-migration/roast-phase-5-security-rollout-gates.md and
docs/frost-migration/roast-phase-5-rollout-runbook.md on lines ~296–297) to
point to their correct locations inside this PR:
pkg/tbtc/signer/docs/roast-phase-5-security-rollout-gates.md and
pkg/tbtc/signer/docs/roast-phase-5-rollout-runbook.md so cross-document
navigation resolves correctly.
In `@pkg/tbtc/signer/README.md`:
- Around line 53-54: Update the README.md occurrence(s) that reference the old
path string "tools/tbtc-signer" to the new crate location "pkg/tbtc/signer":
search for and replace that path in all command snippets, code blocks, and file
references (e.g., cargo build/cd commands and any path bullets) so every
instance uses "pkg/tbtc/signer" consistently; ensure both shell commands and
prose file paths are updated, and run a quick grep for "tools/tbtc-signer" to
confirm no remaining references.
In `@pkg/tbtc/signer/scripts/admission-policy-v1.sample.json`:
- Line 8: Replace the non-hex placeholder value for the JSON key
dao_override_trust_root_pubkey_hex with a syntactically valid 32-byte hex string
(64 lowercase hex characters) so sample files and copy/paste validation don't
break; update the sample value to something like a 64-character hex placeholder
and leave replacement guidance in the README or nearby comment explaining it
must be replaced with the real x-only pubkey hex.
In `@pkg/tbtc/signer/scripts/formal/check_roast_attempt_context_vectors.mjs`:
- Around line 15-18: vectorsPath currently resolves to
"docs/frost-migration/test-vectors/roast-attempt-context-v1.json" and will fail
because the vectors live under "test/vectors"; update the path.join call that
constructs vectorsPath (using rootDir and the filename) to point to
"test/vectors/roast-attempt-context-v1.json" so the script loads the correct
file; ensure the change is made where vectorsPath is declared and used in this
module.
In `@pkg/tbtc/signer/scripts/formal/run_tla_models.sh`:
- Around line 4-5: The MODEL_DIR assignment in run_tla_models.sh is pointing to
the wrong path; update the MODEL_DIR variable (currently computed relative to
ROOT_DIR) to "$ROOT_DIR/docs/formal/models" so the directory existence check in
the script (around the directory existence test near line 20) succeeds; modify
the MODEL_DIR definition in the script (look for the MODEL_DIR variable
assignment and references) to use the corrected path and ensure any subsequent
uses of MODEL_DIR still reference this updated variable.
In `@pkg/tbtc/signer/tests/p2tr_signature_fraud_vectors.rs`:
- Around line 375-376: The test constructs vectors_path using the vectors_path
variable in pkg/tbtc/signer/tests/p2tr_signature_fraud_vectors.rs which
currently joins
"../../docs/frost-migration/test-vectors/p2tr-signature-fraud-v0.json" relative
to CARGO_MANIFEST_DIR and points to a non-existent file; fix by either adding
the missing p2tr-signature-fraud-v0.json to the repo at that path or update the
vectors_path join to the correct relative path where the JSON actually lives
(adjust the "../" segments or point to the canonical test-vectors location),
ensuring the variable name vectors_path and its usage remain unchanged.
---
Nitpick comments:
In `@pkg/tbtc/signer/src/api.rs`:
- Around line 196-202: The BuildTaprootTxRequest struct's script_tree_hex Option
field lacks the serde attributes used elsewhere; update the declaration of
BuildTaprootTxRequest so the script_tree_hex field is annotated with
#[serde(default, skip_serializing_if = "Option::is_none")] to match other
Option<T> fields (preserving Clone/Debug/Deserialize/Serialize behavior) so it
is omitted from serialized output when None instead of serializing as null.
In `@pkg/tbtc/signer/src/bin/admission_checker.rs`:
- Around line 257-276: persist_override_replay_registry currently leaves the
temporary file (tmp_path) if fs::rename fails; modify the rename error path to
attempt cleanup of tmp_path before returning the error. Specifically, call
fs::remove_file(&tmp_path) (ignoring or logging its result) inside the Err
branch that handles the rename failure so the function still returns the
original formatted error for fs::rename but also tries to remove the leftover
tmp file; reference persist_override_replay_registry, path, tmp_path, and
fs::rename when making the change.
In `@pkg/tbtc/signer/src/lib.rs`:
- Around line 391-421: EnvVarGuard's methods (EnvVarGuard::set,
EnvVarGuard::unset) and its Drop rely on std::env::set_var/remove_var which are
process-wide and not thread-safe; replace this pattern with a test-scoped,
non-global solution such as using a crate that provides scoped environment
variables (e.g., temp_env or similar) or refactor tests to accept an injected
configuration object instead of mutating process env; update usages to acquire
and hold the new scoped guard for the entire duration of tests that need env
changes (or pass a Config struct into functions under test) and remove direct
calls to std::env::set_var/remove_var and the EnvVarGuard Drop behavior to avoid
races.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: c0224eef-499a-42a2-a346-f06ef353278b
⛔ Files ignored due to path filters (1)
pkg/tbtc/signer/Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (48)
pkg/tbtc/signer/.gitignorepkg/tbtc/signer/Cargo.tomlpkg/tbtc/signer/README.mdpkg/tbtc/signer/benches/phase5_roast.rspkg/tbtc/signer/build.shpkg/tbtc/signer/docs/formal/models/README.mdpkg/tbtc/signer/docs/formal/models/RoastAttemptStateMachine.cfgpkg/tbtc/signer/docs/formal/models/RoastAttemptStateMachine.tlapkg/tbtc/signer/docs/formal/models/RoastRolloutPolicy.cfgpkg/tbtc/signer/docs/formal/models/RoastRolloutPolicy.tlapkg/tbtc/signer/docs/formal/models/StateKeyProviderPolicy.cfgpkg/tbtc/signer/docs/formal/models/StateKeyProviderPolicy.production.cfgpkg/tbtc/signer/docs/formal/models/StateKeyProviderPolicy.tlapkg/tbtc/signer/docs/formal/models/TeeEnforcementModes.cfgpkg/tbtc/signer/docs/formal/models/TeeEnforcementModes.tlapkg/tbtc/signer/docs/permissioned-signer-hardening-rfc.mdpkg/tbtc/signer/docs/roast-implementation-plan.mdpkg/tbtc/signer/docs/roast-phase-0-spec-freeze.mdpkg/tbtc/signer/docs/roast-phase-1.5-consumed-registry-integration.mdpkg/tbtc/signer/docs/roast-phase-2-coordinator-policy-enforcement.mdpkg/tbtc/signer/docs/roast-phase-3-attempt-transcript-replay-hardening.mdpkg/tbtc/signer/docs/roast-phase-4-liveness-policy-recovery.mdpkg/tbtc/signer/docs/roast-phase-5-baseline-calibration.mdpkg/tbtc/signer/docs/roast-phase-5-rollout-runbook.mdpkg/tbtc/signer/docs/roast-phase-5-security-rollout-gates.mdpkg/tbtc/signer/docs/rust-rewrite-bootstrap.mdpkg/tbtc/signer/docs/signer-api-contract-decision-brief.mdpkg/tbtc/signer/docs/tbtc-signer-secret-material-hardening-plan.mdpkg/tbtc/signer/docs/tee-whitelisted-signer-enforcement-plan.mdpkg/tbtc/signer/docs/true-late-t-of-n-finalize-considerations.mdpkg/tbtc/signer/include/frost_tbtc.hpkg/tbtc/signer/scripts/admission-candidate.sample.jsonpkg/tbtc/signer/scripts/admission-existing.sample.jsonpkg/tbtc/signer/scripts/admission-override-registry.sample.jsonpkg/tbtc/signer/scripts/admission-override.sample.jsonpkg/tbtc/signer/scripts/admission-policy-v1.sample.jsonpkg/tbtc/signer/scripts/formal/check_roast_attempt_context_vectors.mjspkg/tbtc/signer/scripts/formal/run_tla_models.shpkg/tbtc/signer/scripts/run_phase5_chaos_suite.shpkg/tbtc/signer/src/api.rspkg/tbtc/signer/src/bin/admission_checker.rspkg/tbtc/signer/src/engine.rspkg/tbtc/signer/src/errors.rspkg/tbtc/signer/src/ffi.rspkg/tbtc/signer/src/go_math_rand.rspkg/tbtc/signer/src/lib.rspkg/tbtc/signer/test/vectors/roast-attempt-context-v1.jsonpkg/tbtc/signer/tests/p2tr_signature_fraud_vectors.rs
Adds a focused workflow that runs the Rust signer's formal-invariant test suite + TLA model checks. Moved from threshold-network/tbtc-v2/.github/workflows/ci-formal-verification.yml (jobs `signer-formal-invariants` + `tla-model-checks`) per extraction plan v38 §3.1 — the signer code lives here at pkg/tbtc/signer/, not in tbtc-v2, so the CI jobs that exercise it belong here too. Jobs - signer-formal-invariants: cargo test --manifest-path pkg/tbtc/signer/ Cargo.toml formal_verification_ (filter to formal-only cases) - tla-model-checks: pkg/tbtc/signer/scripts/formal/run_tla_models.sh (iterates over .cfg files in pkg/tbtc/signer/docs/formal/models/ and runs TLC against each; MODELS_PATH env var allows override per the path-normalization commit b84b574c on this branch) Triggers - pull_request on pkg/tbtc/signer/** changes + this workflow file - schedule nightly at 05:23 UTC (mirrors monorepo's pattern of running formal invariants both on PRs and nightly) - workflow_dispatch for manual runs Related changes in companion PR threshold-network/tbtc-v2#971: - Removed these jobs from canonical tbtc-v2's ci-formal-verification.yml - Added a comment in that file pointing here Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/tbtc-signer-formal.yml:
- Around line 25-26: The checkout steps using actions/checkout@v4 persist git
credentials by default; update each Checkout step (the uses: actions/checkout@v4
entries) to add with: persist-credentials: false so credentials are not stored
in the runner after checkout. Ensure both occurrences of actions/checkout@v4 in
the workflow are modified accordingly.
- Line 26: Update the GitHub Actions workflow to pin action versions and harden
checkout credentials: replace occurrences of actions/checkout@v4 with
actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 and add with:
persist-credentials: false to both checkout steps that use checkout, replace
dtolnay/rust-toolchain@stable with
dtolnay/rust-toolchain@29eef336d9b2848a0b548edc03f92a220660cdb8, and replace
actions/setup-java@v4 with
actions/setup-java@c1e323688fd81a25caa38c78aa6df2d33d3e20d9 so the workflow pins
SHA-based commits and disables persisting credentials on checkout.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 88553e79-6db4-4e95-98fe-00c56e1bc640
📒 Files selected for processing (1)
.github/workflows/tbtc-signer-formal.yml
| - name: Checkout | ||
| uses: actions/checkout@v4 |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE=".github/workflows/tbtc-signer-formal.yml"
echo "== File exists? =="
ls -l "$FILE"
echo "== Lines 1-120 with numbers =="
nl -ba "$FILE" | sed -n '1,120p'Repository: threshold-network/keep-core
Length of output: 272
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE=".github/workflows/tbtc-signer-formal.yml"
echo "== File exists? =="
ls -l "$FILE"
echo "== Lines 1-120 with numbers =="
nl -ba "$FILE" | sed -n '1,120p'Repository: threshold-network/keep-core
Length of output: 272
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE=".github/workflows/tbtc-signer-formal.yml"
echo "== Showing lines 1-80 =="
awk 'NR>=1 && NR<=80 {printf "%4d: %s\n", NR, $0}' "$FILE"
echo "== Showing checkout-related lines (search) =="
rg -n "name:\s*Checkout|uses:\s*actions/checkout|persist-credentials" "$FILE" || trueRepository: threshold-network/keep-core
Length of output: 2302
Set persist-credentials: false on both actions/checkout@v4 steps
The current workflow checks out without disabling persisted git credentials; add with: persist-credentials: false to reduce credential exposure.
Suggested patch
- name: Checkout
uses: actions/checkout@v4
+ with:
+ persist-credentials: false
@@
- name: Checkout
uses: actions/checkout@v4
+ with:
+ persist-credentials: false📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - name: Checkout | |
| uses: actions/checkout@v4 | |
| - name: Checkout | |
| uses: actions/checkout@v4 | |
| with: | |
| persist-credentials: false |
🧰 Tools
🪛 zizmor (1.25.2)
[warning] 25-26: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false
(artipacked)
[error] 26-26: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)
(unpinned-uses)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/workflows/tbtc-signer-formal.yml around lines 25 - 26, The checkout
steps using actions/checkout@v4 persist git credentials by default; update each
Checkout step (the uses: actions/checkout@v4 entries) to add with:
persist-credentials: false so credentials are not stored in the runner after
checkout. Ensure both occurrences of actions/checkout@v4 in the workflow are
modified accordingly.
| timeout-minutes: 30 | ||
| steps: | ||
| - name: Checkout | ||
| uses: actions/checkout@v4 |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
for repo in actions/checkout actions/setup-java dtolnay/rust-toolchain; do
echo "=== $repo ==="
git ls-remote "https://github.com/${repo}.git" | head -n 5
echo
doneRepository: threshold-network/keep-core
Length of output: 535
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Inspect workflow around the referenced lines
echo "=== Workflow file context ==="
nl -ba .github/workflows/tbtc-signer-formal.yml | sed -n '1,120p'
# 2) Resolve refs to commit SHAs (avoid SIGPIPE by capturing then slicing)
resolve_ref() {
local repo="$1" ref="$2"
local url="https://github.com/${repo}.git"
local tmp="/tmp/lsremote_${repo//\//_}_${ref//\//_}"
git ls-remote "$url" "$ref" > "$tmp" 2>/dev/null || true
echo "--- ${repo} ${ref} ---"
cat "$tmp" || true
echo
rm -f "$tmp" || true
}
echo "=== Resolve action refs to SHAs ==="
resolve_ref "actions/checkout" "refs/tags/v4"
resolve_ref "actions/checkout" "refs/heads/v4"
resolve_ref "actions/setup-java" "refs/tags/v4"
resolve_ref "actions/setup-java" "refs/heads/v4"
resolve_ref "dtolnay/rust-toolchain" "refs/heads/stable"
resolve_ref "dtolnay/rust-toolchain" "refs/tags/stable"Repository: threshold-network/keep-core
Length of output: 162
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Workflow file context (numbered) ==="
python3 - <<'PY'
from pathlib import Path
p = Path(".github/workflows/tbtc-signer-formal.yml")
lines = p.read_text().splitlines()
start, end = 1, min(len(lines), 140)
for i in range(start, end+1):
print(f"{i}\t{lines[i-1]}")
PY
echo
echo "=== Resolve action refs to commit SHAs ==="
resolve_ref() {
local repo="$1" ref="$2"
local url="https://github.com/${repo}.git"
echo "--- ${repo} ${ref} ---"
git ls-remote "$url" "$ref" || true
echo
}
resolve_ref "actions/checkout" "refs/tags/v4"
resolve_ref "actions/checkout" "refs/heads/v4"
resolve_ref "actions/setup-java" "refs/tags/v4"
resolve_ref "actions/setup-java" "refs/heads/v4"
resolve_ref "dtolnay/rust-toolchain" "refs/heads/stable"
resolve_ref "dtolnay/rust-toolchain" "refs/tags/stable"Repository: threshold-network/keep-core
Length of output: 2438
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Total lines: $(python3 - <<'PY'\nimport pathlib\np=pathlib.Path('.github/workflows/tbtc-signer-formal.yml')\nprint(len(p.read_text().splitlines()))\nPY)"
echo
echo "=== uses: occurrences ==="
rg -n "uses:" .github/workflows/tbtc-signer-formal.yml || true
echo
echo "=== persist-credentials occurrences ==="
rg -n "persist-credentials" .github/workflows/tbtc-signer-formal.yml || trueRepository: threshold-network/keep-core
Length of output: 133
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Total lines: $(python3 - <<'PY'
import pathlib
p=pathlib.Path('.github/workflows/tbtc-signer-formal.yml')
print(len(p.read_text().splitlines()))
PY)"
echo
echo "=== uses: occurrences ==="
rg -n "uses:" .github/workflows/tbtc-signer-formal.yml || true
echo
echo "=== persist-credentials occurrences ==="
rg -n "persist-credentials" .github/workflows/tbtc-signer-formal.yml || trueRepository: threshold-network/keep-core
Length of output: 424
Pin GitHub Actions to commit SHAs and harden checkout credentials.
actions/checkout@v4(lines 26, 44) ->actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5; also addwith: persist-credentials: falseto both checkout steps.dtolnay/rust-toolchain@stable(line 29) ->dtolnay/rust-toolchain@29eef336d9b2848a0b548edc03f92a220660cdb8.actions/setup-java@v4(line 47) ->actions/setup-java@c1e323688fd81a25caa38c78aa6df2d33d3e20d9.
🧰 Tools
🪛 zizmor (1.25.2)
[warning] 25-26: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false
(artipacked)
[error] 26-26: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)
(unpinned-uses)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/workflows/tbtc-signer-formal.yml at line 26, Update the GitHub
Actions workflow to pin action versions and harden checkout credentials: replace
occurrences of actions/checkout@v4 with
actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 and add with:
persist-credentials: false to both checkout steps that use checkout, replace
dtolnay/rust-toolchain@stable with
dtolnay/rust-toolchain@29eef336d9b2848a0b548edc03f92a220660cdb8, and replace
actions/setup-java@v4 with
actions/setup-java@c1e323688fd81a25caa38c78aa6df2d33d3e20d9 so the workflow pins
SHA-based commits and disables persisting credentials on checkout.
Resolves CI failures on PR #4005 (signer mirror): 1. TLA model checks: run_tla_models.sh lacked executable bit at canonical HEAD. CI ran the script directly (no `bash` prefix), which fails with `Permission denied`. Fixed via `git update-index --chmod=+x`. 2. Signer formal invariants: engine.rs's formal_verification_roast_attempt_context_shared_vectors_match_ expected_values test referenced vectors at a path stale from the umbrella's docs/frost-migration/test-vectors/ layout. The manifest places the vector at the canonical-signer test/vectors/ subdir (pkg/tbtc/signer/test/vectors/roast-attempt-context-v1.json per the source-to-target map). Updated the `PathBuf::from(env!("CARGO_MANIFEST_DIR")).join(...)` argument from `../../docs/frost-migration/test-vectors/roast-attempt-context-v1.json` (umbrella-relative) to `test/vectors/roast-attempt-context-v1.json` (signer-CARGO_MANIFEST_DIR-relative, where the vector actually lives at canonical HEAD). Verified locally: - ls -l shows executable bit set on run_tla_models.sh - engine.rs path now resolves to the correct mirror location - Vector exists at pkg/tbtc/signer/test/vectors/roast-attempt-context-v1.json Same fix needs to be applied to PR #4007 (stacked on #4005) in a follow-up commit on its branch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR #4005 signer formal invariants test formal_verification_p2tr_signature_fraud_vectors_match_bitcoin_crate was failing because: 1. The umbrella source manifest only mapped p2tr-signature-fraud-v0.json to the tbtc-v2 target (docs/test-vectors/). It was not mirrored to the keep-core (signer) target. 2. tests/p2tr_signature_fraud_vectors.rs referenced the vector at the umbrella-relative path `../../docs/frost-migration/test-vectors/p2tr-signature-fraud-v0.json` (CARGO_MANIFEST_DIR-relative -> repo-root + docs/frost-migration/...). That directory does not exist on canonical keep-core. This is a structural omission in the manifest, not a divergence: the cross-language vector test exists in both Solidity (tbtc-v2 side) and Rust (signer side), and the vector is genuinely needed in both places to verify cross-implementation consistency. Fix - Mirror p2tr-signature-fraud-v0.json (598 lines, byte-identical content from tbtc-v2 mirror at docs/test-vectors/) to pkg/tbtc/signer/test/vectors/p2tr-signature-fraud-v0.json. - Update the test path in tests/p2tr_signature_fraud_vectors.rs from `../../docs/frost-migration/test-vectors/p2tr-signature-fraud-v0.json` to `test/vectors/p2tr-signature-fraud-v0.json` (CARGO_MANIFEST_DIR = pkg/tbtc/signer/, so test/vectors/... resolves correctly). Two stale comment references remain in - pkg/tbtc/signer/docs/roast-implementation-plan.md:265 - pkg/tbtc/signer/scripts/formal/check_roast_attempt_context_vectors.mjs:19 Both are comment-only doc pointers to the source layout; they do not affect runtime. Left as-is to preserve the umbrella -> canonical provenance trail. Manifest update follows in a stacked PR on tlabs-xyz/tbtc#10: - Add p2tr-signature-fraud-v0.json -> signer target mapping (test/vectors/p2tr-signature-fraud-v0.json). - Reclassify tests/p2tr_signature_fraud_vectors.rs from mirror to allowlisted-divergence (path differs from umbrella). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Stacked on extraction/frost-signer-mirror-2026-05-26 / PR #4005. Adds optional taproot_merkle_root_hex to start/finalize signing rounds, binds it into request fingerprints and round IDs, signs and aggregates with frost-secp256k1-tr Taproot tweaks, and verifies tweaked aggregates in tests. Verification: cargo test in pkg/tbtc/signer.
…ge (re-review)
Codex re-review on the Round2 completion gate:
[P1] The gate trusted aggregated_interactive_attempt_markers keyed by attempt_id
alone, but interactive_aggregate never binds attempt_id to the aggregated package/
message - it only checks the shares aggregate under the session key. So a caller
with any valid aggregate could replay it under a DIFFERENT live attempt's id, set
that marker, and the new gate would then refuse the real Round2 before its nonce is
consumed (an availability/DoS regression the single-member engine never had).
Fix: make the completion marker MESSAGE-BOUND - interactive_aggregated_marker =
"{attempt_id}@{message_digest}". interactive_aggregate writes it from the package it
actually aggregated (all five marker sites: both completion pre-checks, the capacity
guard, the insert, and the persist-rollback). The Round2 gate recomputes it from the
message THIS member opened, so it preempts Round2 only for a genuine same-message
completion; a replayed aggregate carrying a different message sets an unrelated
marker that cannot match. Same-message re-aggregation is still rejected (existing
restart/repeat tests pass unchanged - they re-aggregate the same message).
[P2] On that genuine completion rejection, the unsigned sibling's entry is now dead,
so free it (zeroizing its nonces) instead of letting it hold a live-member cap slot
until the TTL sweep. Only the matching member's entry is removed.
Tests: the multi-seat refusal test now also asserts the marker is bound
(attempt_id@digest, not the bare id) and that the refused sibling's entry is freed.
All 294 lib tests pass; cargo fmt clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…review) Codex re-review (P2, valid): the previous commit made the aggregate completion marker message-bound (attempt_id@digest), but a completion persisted by the pre-binding engine is stored as the BARE attempt_id. The new checks only looked for the bound form, so after an upgrade a previously completed attempt looked unfinished - repeat InteractiveAggregate and the Round2 completion gate would no longer fail closed for it. Add interactive_attempt_aggregated(markers, attempt_id, digest) = bound form OR a legacy bare attempt_id marker (fail-closed on read), mirroring the consumed-marker helper. Use it at the three read sites (the Round2 completion gate and both interactive_aggregate pre-checks). New writes stay bound-only, so the durable record migrates forward on the next completion while legacy completions stay final. Test: interactive_honors_legacy_bare_aggregate_completion_marker injects a bare attempt_id marker (a pre-upgrade completion) and asserts Round2 then fails closed with InteractiveAttemptAlreadyAggregated. All 295 lib tests pass; cargo fmt clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…e-review)
Codex re-review (P1, valid): attempt_id is derived from session/message/attempt#/
coordinator/included - NOT the taproot root. So the message-bound completion marker
(attempt_id@message_digest) still collided across taproot tweaks: a completion
aggregated for one root (or key-path None) recorded the same marker the Round2 gate
checks for a live seat opened with a different root, wrongly preempting that seat's
Round2 (and zeroizing it) even though the signatures differ per tweak.
Bind the canonical taproot root into the marker too -
interactive_aggregated_marker(attempt_id, message_digest, taproot_root) =
"{attempt_id}@{message_digest}@{root}" (root = hex, or "keypath" for None, which
cannot collide with a 64-hex root). interactive_aggregate writes it from the root it
aggregated under; the Round2 gate recomputes it from the root THIS member opened
with. The completion marker now binds the full signing-task identity (attempt + msg
+ root); the legacy bare-id fallback is unchanged.
Test interactive_round2_completion_marker_binds_taproot_root: a completion recorded
for a different root does not preempt a key-path member's Round2, while the same-root
completion does. All 296 lib tests pass; cargo fmt clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…e-review) Codex re-review (P2, valid): when a multi-seat attempt aggregates with a threshold subset that EXCLUDES a local member which opened/Round1'd the same attempt + root, that member never calls Round2 (it is not a signer), so the Round2 completion gate never runs for it - its interactive_signing entry (nonces, key, message) and its live-member capacity slot stayed resident until the 1h TTL or an explicit abort. interactive_aggregate's success path now frees the LOCAL siblings finalized by the completion: after persisting the marker, remove + zeroize every interactive_signing entry on (attempt_id, taproot root) - the signers' entries were already removed at their Round2, and a sibling on a DIFFERENT root is a distinct signing task and is left untouched. The Round2 completion gate stays as the defense for a sibling that RE-OPENS the finalized attempt and tries to sign. Test interactive_round2_refused_after_aggregate_for_unsigned_sibling now asserts the non-signing sibling is freed at aggregation, then re-opens it and confirms the gate still refuses its Round2. All 296 lib tests pass; cargo fmt clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…o (re-review) Codex re-review (P2, valid): the finalized-sibling cleanup added in the previous commit matched entries on (attempt_id, taproot root) but NOT the message. Because attempt_id is provided by the caller separately from the package, a mismatched aggregate - a valid signing package for message B submitted under a live message-A attempt's id (same root) - would delete the message-A seats' live nonce/commitment state, forcing that unrelated attempt to restart, even though the stored marker (now message-bound) correctly would not match its Round2. Match the FULL finalized identity in the cleanup filter: attempt_id AND hash_hex(entry.message_bytes) == aggregated_message_digest AND taproot root - the same (attempt + message + root) identity the completion marker binds. Test interactive_aggregate_cleanup_is_message_bound: a valid stateless aggregate over message B under message A's attempt id leaves the live message-A seat intact. All 297 lib tests pass; cargo fmt clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…t operators (#4098) ## What Keys a session's interactive signing state by `member_identifier` (`interactive_signing: Option<InteractiveSigningState>` → `BTreeMap<u16, InteractiveSigningState>`) so a **multi-seat operator can run interactive signing for all its seats in one process**. ## Why A multi-seat operator runs concurrent interactive signing per seat, all sharing one member-independent `SessionID`, in one process (the engine state is a process-global singleton). Today that breaks three ways: the 2nd seat's `InteractiveSessionOpen(same session, different member)` → `SessionConflict`; `round2` frees the whole state (wiping siblings); and the per-attempt consumed-nonce marker makes a sibling's `round2` falsely trip `ConsumedNonceReplay`. Surfaced by the keep-core real-cgo e2e (#4097). ## Design (reviewed: Gemini sound, Codex approve-with-refinements) Both reviewers rejected a session-wide live-attempt model — it would let one seat erase another's nonces, the exact bug being removed. - **Per-member live attempt**: `open(M, strictly-newer attempt)` replaces only M's entry (zeroizing M's old nonces); a stale/equal open for M is rejected, but a sibling on a different attempt is untouched. Seats are independent — exactly as separate processes would be. - **round2** removes only the member's entry (siblings stay live); the durable marker carries replay protection. - **Consumed markers** keyed per-`(attempt_id, member_id)` (composite); a **legacy bare `attempt_id` marker is honored fail-closed** on read. `aggregated_interactive_attempt_markers` stay per-attempt (one signature per attempt over public data). - **abort** is session-level over the map (removes all matching entries); **TTL sweep** is per-entry by `opened_at_unix` (siblings survive); **capacity** counts live member entries (a new member is a slot; a same-member replacement isn't). - `interactive_state_for_attempt_mut` is a member-keyed lookup. Live state still never persists (empty map on reload). Zeroization preserved via `InteractiveRound1State::Drop`. ## Tests New `interactive_multi_seat_two_members_one_process_aggregate_bip340`: two seats open the same attempt in one process, Round1 independently, member 1's Round2 frees only its entry and writes only its marker (member 2 stays live and unblocked), and the two interactive shares **aggregate to a valid BIP-340 signature**. Existing single-member tests updated for the map. **All 292 lib tests pass; `cargo fmt` clean.** Follow-ups (additional edge tests Codex itemized — per-member attempt advance, abort-by-attempt over multiple members, capacity new-vs-replacement) can land in review. 🤖 Generated with [Claude Code](https://claude.com/claude-code)
…attempt edge cases Two follow-up edge tests Codex itemized for the multi-seat interactive engine fix (#4098), test-only: - interactive_capacity_counts_new_members_not_replacements: at a live-member cap of 1, a member advancing to a newer attempt replaces its own entry (no new slot, succeeds), while a DIFFERENT member is a new entry that trips the cap and fails closed - pinning that the cap counts member entries, not sessions. - interactive_abort_by_attempt_removes_all_members_on_that_attempt: abort with an attempt_id filter is session-level over the member map - it removes every local seat on that attempt while a sibling seat on a different attempt survives. All 299 lib tests pass; cargo fmt clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…attempt edge cases (#4100) ## What Two follow-up edge tests Codex itemized during the multi-seat interactive engine fix (#4098), test-only: - **`interactive_capacity_counts_new_members_not_replacements`** — at a live-member cap of 1: a member advancing to a newer attempt *replaces* its own entry (no new slot, succeeds), while a *different* member is a new entry that trips the cap and fails closed. Pins that the cap counts member entries, not sessions. - **`interactive_abort_by_attempt_removes_all_members_on_that_attempt`** — abort with an `attempt_id` filter is session-level over the member map: it removes every local seat on that attempt while a sibling seat on a *different* attempt survives. All 299 lib tests pass; `cargo fmt` clean. 🤖 Generated with [Claude Code](https://claude.com/claude-code)
…tc_abi_version)
The Go cgo bridge and libfrost_tbtc are compiled separately and linked at runtime via
dlsym; a symbol that resolves but changed MEANING (struct/JSON contract change) is
silent corruption. The existing frost_tbtc_version returns a human string
("tbtc-signer/0.1.0-bootstrap") that cannot be machine-checked.
Add frost_tbtc_abi_version() returning a structured FFI CONTRACT version
{abi_major, abi_minor} (api::FrostTbtcAbiVersionResult), so a consumer can fail closed
against an incompatible lib (the Go-side assertion lands separately). Starts at 1.0 -
NOT 0.x, to avoid semver pre-1.0 ambiguity. abi_major = any incompatible contract
change (C signatures, JSON field meaning, required fields, enum/status values,
serialization, memory ownership, crypto transcript semantics); abi_minor = a purely
additive, backward-compatible change old consumers safely ignore. A consumer requires
lib.abi_major == its_major AND lib.abi_minor >= the minor it needs. The human
frost_tbtc_version string stays informational and unchanged.
Design validated via Codex consult (major.minor; exact major, minimum minor; the
{abi_major, abi_minor} JSON is the frozen root compatibility surface).
300 lib tests pass (new: abi_version_reports_the_contract_version pins 1.0);
cargo fmt clean; the symbol is exported in the built cdylib.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Codex #4104 P2: the new #[no_mangle] frost_tbtc_abi_version is part of the public C ABI, but header-based consumers that compile against include/frost_tbtc.h had no prototype for it - the exported symbol and the advertised header could drift. Add the declaration alongside frost_tbtc_version (the header is hand-maintained; no cbindgen). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…tc_abi_version) (#4104) ## What Adds `frost_tbtc_abi_version()` — a structured **FFI contract version** `{abi_major, abi_minor}` — so the Go cgo bridge can **fail closed** against an incompatible `libfrost_tbtc` instead of silently misinterpreting a changed contract. The bridge and the lib are compiled separately and linked at runtime via `dlsym`; a symbol that still resolves but changed *meaning* (struct/JSON contract change) is silent corruption. The existing `frost_tbtc_version` returns a human string (`tbtc-signer/0.1.0-bootstrap`) that can't be machine-checked — this is the checkable companion. ## Scheme (Codex-validated) - **`abi_major`** — any *incompatible* change to the Go↔Rust contract: C signatures, JSON field meaning, required fields, enum/status values, serialization, memory ownership, crypto transcript semantics. - **`abi_minor`** — a purely *additive*, backward-compatible change old consumers safely ignore. - A consumer requires `lib.abi_major == its_major` **and** `lib.abi_minor >= the minor it needs`. - Starts at **1.0** (not 0.x, to avoid semver pre-1.0 ambiguity). The `{abi_major, abi_minor}` JSON is the **frozen root compatibility surface**. - `frost_tbtc_version` (human string) stays informational and unchanged. ## Notes - The **Go-side assertion** (fail-closed preflight + the `requiredMajor`/`requiredMinMinor` constants + a CI-gate check) lands as a separate keep-core PR that also bumps the signer pin to this commit. - 300 lib tests pass (new `abi_version_reports_the_contract_version` pins 1.0); `cargo fmt` clean; the symbol is exported in the built cdylib. 🤖 Generated with [Claude Code](https://claude.com/claude-code)
…ctive round start_sign_round cleared the active sign round on an authorized advance *before* validating the incoming attempt context against the deterministic RFC-21 coordinator selection. A malformed advance whose transition evidence was internally consistent but whose coordinator_identifier failed deterministic validation destroyed the in-memory round, then returned an error without persisting. Because the original attempt id stayed in consumed_attempt_ids, that attempt could never be re-signed in-memory, bricking the signing session until the durable (un-cleared) state was reloaded on restart. Run validate_attempt_context on the incoming context before clear_active_sign_round_for_attempt_transition, so a rejected advance leaves the active round intact. Add a regression test that forges the coordinator on an otherwise-valid advance and asserts the original attempt remains signable. This path is gated off in production by enforce_transitional_signing_disabled_in_production, so the impact is limited to the dev/staging transitional-nonce path. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ng window Two fail-open / DoS fixes in the signing-policy config surface: - signer_profile_is_production() panicked on any TBTC_SIGNER_PROFILE value other than production/development/empty. On the env-fallback path that value is unvalidated, so a single typo (e.g. "staging") aborted every profile-gated FFI operation as a generic internal error -- a process-wide denial of service. Treat an unrecognized profile as production (fail closed) and warn once instead of panicking. - load_signing_policy_firewall_config() accepted an equal-bounds UTC window (start == end). utc_hour_in_window treats start == end as "always in window", so the time-of-day control was silently disabled (fail open). Reject equal bounds, and out-of-range (>= 24) hours, at load time. Add regression tests for both. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The signer-dependency-audit job runs cargo-deny against the committed Cargo.lock, but the clippy/test jobs and build.sh resolved dependencies without --locked. A PR that edits Cargo.toml so the committed lock no longer satisfies it would let those jobs silently re-resolve to newer, unaudited crate versions (running their build scripts and proc-macros on the runner) while the advisory gate still audits the stale lock -- so the security gate and the code actually exercised could diverge. Add --locked to the clippy, test, formal-test, and release-build invocations so a lock/manifest mismatch fails loudly instead of silently re-resolving. cargo fmt is intentionally left unchanged (it does not accept --locked). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ctive round (#4111) ## Summary Follow-up to #4005 (FROST/ROAST signer). Fixes a session-bricking bug in the attempt-advance path of `start_sign_round`. On an authorized attempt advance, `start_sign_round` cleared the active sign round (`clear_active_sign_round_for_attempt_transition`) **before** validating the incoming attempt context against the deterministic RFC-21 coordinator selection. A malformed advance whose transition evidence was internally consistent but whose `coordinator_identifier` failed deterministic validation destroyed the in-memory round, then returned an error **without persisting**. Because the original attempt id stayed in `consumed_attempt_ids`, that attempt could never be re-signed in-memory — the signing session was bricked until the durable (un-cleared) state was reloaded on restart. ## Fix Run `validate_attempt_context` on the incoming context **before** clearing the active round, so a rejected advance leaves the round intact. The same validation already ran later on the fresh-attempt path, so no legitimate advance is newly rejected — the check is simply moved ahead of the destructive clear. ## Test Adds `rejected_forged_advance_preserves_active_sign_round`: it forges the coordinator on an otherwise-valid advance and asserts the original attempt remains signable. Verified to **fail** against the unfixed code (`ConsumedAttemptReplay`) and **pass** with the fix. ## Scope Gated off in production by `enforce_transitional_signing_disabled_in_production`, so impact is limited to the dev/staging transitional-nonce path. _Found during review of #4005._ 🤖 Generated with [Claude Code](https://claude.com/claude-code)
…ng window (#4112) ## Summary Follow-up to #4005 (FROST/ROAST signer). Two fail-open / DoS fixes in the signing-policy config surface. ### 1. `signer_profile_is_production()` panicked on an unknown profile value On the env-fallback path `TBTC_SIGNER_PROFILE` is unvalidated, so any value other than `production`/`development`/empty hit `panic!`. The panic is caught at the FFI boundary and redacted to a generic internal error, so a single typo (e.g. `staging`) aborted **every** profile-gated operation — a process-wide denial of service until the env var was corrected. Now an unrecognized profile is treated as `production` (fail-closed: the strictest gate set) with a one-time warning, instead of panicking. Production is the safe/strict direction for every caller (provenance gate, bootstrap-dealer-DKG disable, transitional-signing disable, ROAST strict mode, env state-key-provider rejection). The install path (`init_config`) still validates the profile, so this only affects the env-fallback path. ### 2. `load_signing_policy_firewall_config()` accepted a zero-width UTC window `utc_hour_in_window` treats `start == end` as "always in window", so an equal-bounds window silently disabled the time-of-day control (fail-open) rather than restricting it. The loader now rejects equal bounds — and out-of-range (`>= 24`) hours — at load time. ## Tests - `unknown_profile_value_fails_closed_to_production` - `signing_policy_firewall_rejects_equal_utc_window_bounds` _Found during review of #4005._ 🤖 Generated with [Claude Code](https://claude.com/claude-code)
## Summary Follow-up to #4005 (FROST/ROAST signer). Pins cargo dependency resolution in CI and the release build. The `signer-dependency-audit` job runs `cargo-deny` against the committed `Cargo.lock`, but the clippy/test jobs and `build.sh` resolved dependencies **without** `--locked`. A PR that edits `Cargo.toml` so the committed lock no longer satisfies it would let those jobs silently re-resolve to newer, unaudited crate versions (running their build scripts and proc-macros on the runner) while the advisory gate still audits the stale lock — so the security gate and the code actually exercised could diverge. ## Fix Adds `--locked` to the `clippy`, `test`, formal-`test`, and release-`build` invocations so a lock/manifest mismatch fails loudly instead of silently re-resolving. `cargo fmt` is intentionally left unchanged (it does not accept `--locked`). _Found during review of #4005._ 🤖 Generated with [Claude Code](https://claude.com/claude-code)
…he outcome persist_engine_state_to_storage resolved the state-encryption key inside encode_encrypted_state_envelope, i.e. while the caller held the global ENGINE_STATE mutex. For the `command` (KMS/HSM) provider that forks and waits on the key subprocess on every state mutation while the lock is held, so a slow or hung key command stalled every concurrent signer operation for up to the configured timeout (default 30s, max 300s). Split persist into a thin wrapper (cold paths and tests) and persist_engine_state_to_storage_with_key, which takes pre-resolved key material. Every hot per-operation path (run_dkg, start/finalize_sign_round, build_taproot_tx, trigger_emergency_rekey, promote/rollback_canary, refresh_shares, interactive round2/aggregate) now resolves the key with state_encryption_key_material() BEFORE acquiring the lock, so the subprocess never runs under the lock. Only the in-memory serialize + encrypt + atomic write stay under the lock (they must, to preserve write ordering and avoid lost updates between concurrent ops). The resolution outcome is DEFERRED, not propagated eagerly: the key Result is consulted (via require_resolved_state_key) only at a site that actually persists. Idempotent replays and pre-persist rejections return without ever consulting it, so a transient key-provider outage cannot turn a cached read (e.g. re-serving an already-computed signature on a ROAST retry) into a failure. In the interactive round2/aggregate paths, the key is resolved into a local BEFORE the consumption/aggregation marker is inserted, so a key failure returns cleanly (no marker written) rather than escaping the marker-rollback block. Semantics preserved: the key is still resolved fresh on every call (no caching), so external KMS rotation and trigger_emergency_rekey keep picking up the current key. Tests: idempotent_build_tx_replay_survives_state_key_outage (a replay returns its cached tx with a failing key command) and interactive_round2_state_key_failure_does_not_burn_attempt (a Round2 that fails at the key step leaves no consumption marker and the attempt still succeeds on retry). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… not before it #4114 resolved the state-encryption key OFF the ENGINE_STATE lock to keep the command-provider KMS subprocess from stalling the lock. But ENGINE_STATE already serializes the persisted-state WRITES, so resolving the key before the lock left key selection out-of-order with the writes: a caller that resolved an older key (before a provider key rotation) could win the LAST write after a concurrent caller already persisted the rotated key, tagging the single state file with the stale key id. decode_encrypted_state_envelope accepts only the currently-resolved key id, so a restart then rejects the file as a key-id mismatch and the persisted state is unreadable. Resolve the key UNDER the held ENGINE_STATE guard, at the persist site, so key selection is in the same serialized order as the writes -- the last writer always encrypts with the then-current key. Non-marker sites call persist_engine_state_to_storage(&guard) (resolves under the guard then persists); the two interactive marker sites resolve via state_encryption_key_material()? BEFORE inserting the durable marker (preserving fail-closed-before-marker and the marker-rollback-on-persist-failure), then persist_with_key. The deferred require_resolved_state_key helper is removed. Idempotent replays and pre-persist rejections still return before any key resolution, so a transient key-provider outage cannot fail a non-persisting call (the idempotent_build_tx_replay_survives_state_key_outage regression still holds). TRADEOFF: this reverses #4114's off-lock optimization -- under the `command` provider the KMS subprocess now runs under the guard during actual persists (a bounded head-of-line stall; a hung command is bounded by terminate_state_key_command). A future off-lock-preserving design would need a cheap rotation signal from the key provider; recovering that is left as a follow-up. Addresses the Codex P1 review on #4114. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…TATE guard, in write order (#4114) ## Summary Resolves the at-rest state-encryption key **under the held `ENGINE_STATE` guard, at the persist site**, so key selection is ordered with the writes the guard already serializes. This PR began as a perf change (resolve the key *off* the lock to keep the `command`-provider KMS subprocess from stalling `ENGINE_STATE`). A Codex review found that approach introduced a **P1 state-loss bug**, so the PR now lands the correct ordering instead. ## The bug the off-lock approach introduced `ENGINE_STATE` already serializes the persisted-state *writes*, but resolving the key *before* the lock left key selection out-of-order with the writes: a caller that resolved an older key (before a provider key-rotation) could win the **last** write after a concurrent caller already persisted the rotated key, tagging the single state file with the stale key-id. `decode_encrypted_state_envelope` accepts only the currently-resolved key-id, so a restart then rejects the file as a key-id mismatch and the persisted state is **unreadable**. ## Fix Resolve the key under the held guard, at the persist site — the last writer always encrypts with the then-current key. Non-marker sites call `persist_engine_state_to_storage(&guard)` (resolves under the guard, then persists). The two interactive marker sites resolve via `state_encryption_key_material()?` **before** inserting the durable consumed/aggregated marker (preserving fail-closed-before-marker + marker-rollback-on-persist-failure), then `persist_..._with_key`. The deferred `require_resolved_state_key` helper is removed. **Invariants preserved:** idempotent replays / pre-persist rejections still return *before* any key resolution, so a transient key-provider outage cannot fail a non-persisting call (`idempotent_build_tx_replay_survives_state_key_outage` still holds). ## Verification Clean `cargo build` (no warnings); **all 329 crate tests pass**. An independent adversarial review confirmed all four ordering invariants at every converted site (0 correctness findings). ## Tradeoff (tracked in #4121) This reverses the off-lock optimization: under the `command`/KMS provider the subprocess now runs under the guard during actual persists (a bounded head-of-line stall; `env` provider is unaffected). Because the `production` profile mandates the `command` provider, this is the production path. Recovering off-lock safely needs added structure (a cheap provider key-id signal, a coordinated rotation protocol, or a partial-recovery persist-lock) — analyzed and tracked in **#4121**, gated on measuring whether the stall actually bottlenecks throughput. 🤖 Generated with [Claude Code](https://claude.com/claude-code)
Fixes two Codex-flagged P2 state-consistency holes in start_sign_round. 1. Persist failure left in-memory state diverged from durable state. After the fresh-round path mutated the session (consumed-replay markers, round state), a failed persist returned an error but the canonical idempotent serve then returned signature shares WITHOUT persisting -- so a restart could replay the round with no durable consumed marker. The idempotent cached serve now persists before serving when the round is not yet durable, tracked by a process-local SIGN_ROUND_PERSIST_PENDING marker; when the original persist already succeeded it still serves cached without persisting, preserving the 'idempotent replay survives a state-key-provider outage' property build_taproot_tx relies on (rollback is impossible -- the transition clear zeroizes the prior round material). 2. Active attempt cleared before later validation could fail. On an authorized ROAST attempt advance, clear_active_sign_round_for_attempt_transition ran before the fresh-path checks (participant resolution, included-set equality, quarantine, consumed-replay, share construction). A malformed advance that passed authorization but failed a later check destroyed the in-memory active round with no validated/persisted replacement, so the next StartSignRound could start a fresh attempt without transition evidence until a restart. The clear is now deferred until every fallible check has passed, just before the replacement round is installed and persisted. Adds two regression tests, both verified to fail against the pre-fix code. Design validated with Codex. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Codex re-review: the SIGN_ROUND_PERSIST_PENDING marker was process-global but cleared only at start_sign_round's own persist sites. If a start_sign_round persist failed (marker set) and a later UNRELATED successful persist (e.g. a DKG for another session) then wrote the whole engine state -- making that round durable -- the marker stayed stale-true. A subsequent idempotent replay of the now-durable round during a state-key-provider outage would re-enter the persist branch, try to persist again, and fail instead of serving the cached round. Move the marker into the persistence module and clear it inside persist_engine_state_to_storage_with_key on any successful write, so any operation's successful persist clears it. start_sign_round sets it on a fresh-round mutation (mark_sign_round_persist_pending) and reads it in the idempotent serve (sign_round_persist_pending); the explicit clears at the start_sign_round persist sites are removed (the persist clears it now). Adds a regression test (verified to fail against the pre-fix code) covering the unrelated-persist-then-outage replay. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Codex re-review: the process-global SIGN_ROUND_PERSIST_PENDING bit conflated all sessions. After one session's StartSignRound established a round and failed to persist, the bit stayed set process-wide, so the idempotent cached-serve branch -- which consulted it for EVERY session -- would force an UNRELATED, already- durable session's replay to re-persist and fail during the same state-key outage instead of serving its durable cached shares. An availability regression. Replace the global AtomicBool with a per-session set (SIGN_ROUND_PERSIST_PENDING_SESSIONS: OnceLock<Mutex<BTreeSet<String>>>) keyed by session_id: mark the specific session on a fresh-round mutation, consult that session in the cached-serve branch, and clear the WHOLE set on any successful persist (a persist writes the entire engine state, so every in-memory round becomes durable at once -- preserving the cross-operation durability the prior commit established). reset_for_tests clears the set explicitly. Both invariants stay intact: a non-durable round's replay still re-persists before serving (durability); a durable round's replay still serves without persisting (availability); a different session's failed persist no longer drags down an unrelated durable session. Adds a regression test (verified to fail against the global bool) and corrects the marker doc comment (mutual exclusion is the inner mutex, not the ENGINE_STATE guard, since clear also runs off-guard at startup and in test reset). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…#4123) ## Summary Fixes two Codex-flagged P2 state-consistency holes in `start_sign_round` (`pkg/tbtc/signer/src/engine/signing.rs`), where a failed persist or a later validation error could leave in-memory state diverged from durable state and bypass the ROAST replay/transition protections. ### 1. Persist before serving the idempotent cached round (bug #1) The fresh-round path mutates the session (consumed-replay markers + round state) and then persists. If that persist failed (state-key-provider or disk error) it returned an error and served no shares — but the **canonical idempotent serve** then returned signature shares **without persisting**, so a restart could replay the round with no durable consumed marker. Fix: the idempotent cached serve now persists before serving **when the round is not yet durable**, tracked by a process-local `SIGN_ROUND_PERSIST_PENDING` marker (set on fresh-round mutation, cleared after a successful persist). When the original persist already succeeded — the common case — it still serves the cached round **without** persisting, preserving the "idempotent replay survives a state-key-provider outage" property that `build_taproot_tx` relies on. (Rollback is impossible here: the transition clear zeroizes the prior round material.) ### 2. Defer clearing the active attempt until validation finishes (bug #2) On an authorized ROAST attempt advance, `clear_active_sign_round_for_attempt_transition` ran **before** the fresh-path checks (participant resolution, included-set equality, quarantine, consumed-replay, share construction). A malformed advance that passed authorization + the RFC-21 coordinator check but failed a **later** check destroyed the in-memory active round with no validated/persisted replacement — so `active_attempt_context` was dropped and the next `StartSignRound` could start a fresh attempt **without transition evidence** until a restart reloaded durable state. Fix: the advance is authorized but the clear is **deferred** until every fallible fresh-path check has passed, immediately before the replacement round is installed and persisted (the idempotent/conflict branch is skipped for an authorized advance via a flag). ## Tests Two regression tests, both **verified to fail against the pre-fix code**: - `authorized_advance_failing_later_check_preserves_active_sign_round` — an authorized advance that fails the included-set check leaves attempt 1 idempotently signable (pre-fix: `ConsumedAttemptReplay`). - `idempotent_sign_round_replay_persists_before_serving_after_persist_outage` — a sign round whose persist fails does not serve shares on idempotent replay until the state-key provider recovers (pre-fix: served shares with the key down). Full lib suite: **308 passing** (`cargo test --lib`), `cargo fmt --check` clean. Design validated with Codex. _Found during the Codex review of #4005._ Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> 🤖 Generated with [Claude Code](https://claude.com/claude-code)
Summary
Lands the Rust FROST/ROAST signer at
pkg/tbtc/signer/alongside the existing Go DKG coordinator. The initial import was extracted fromtlabs-xyz/tbtc:feat/frost-schnorr-migrationat frozen signed tagfrost-extraction-source-v1, then this PR applied targeted canonical-path, CI, and reviewer hardening follow-ups in the canonical keep-core repo.This is no longer a pure byte-for-byte mirror PR. The post-freeze changes are intentional canonical keep-core changes (canonical-path normalization, CI wiring, and reviewer hardening). After this PR lands,
threshold-network/keep-coreis the source of truth for the signer; the extracted monorepo tag is provenance for the initial import only.Source-PR Provenance
tlabs-xyz/tbtcfeat/frost-schnorr-migrationfrost-extraction-source-v1(signed annotated tag by maclane)52389bd5cccb5daeef195671feb7ca46be6e2f37The monorepo provenance above is used only to audit the initial import. Ongoing signer development, fixes, CI, and reviews happen in
threshold-network/keep-core.Why This Lives In keep-core
Per extraction plan v38 section 3.1, the signer co-locates with keep-core because:
Layout Introduced
CI Coverage
This PR now runs:
cargo fmt --manifest-path pkg/tbtc/signer/Cargo.toml -- --checkcargo clippy --manifest-path pkg/tbtc/signer/Cargo.toml --all-targets -- -D warningscargo test --manifest-path pkg/tbtc/signer/Cargo.tomlcargo test --manifest-path pkg/tbtc/signer/Cargo.toml formal_verification_pkg/tbtc/signer/scripts/formal/run_tla_models.shVerification
Latest local verification after review fixes:
cargo fmt --manifest-path pkg/tbtc/signer/Cargo.toml -- --checkcargo clippy --manifest-path pkg/tbtc/signer/Cargo.toml --all-targets -- -D warningsTBTC_SIGNER_STATE_PATH=/tmp/tbtc-signer-ci-state-local-4005.json cargo test --manifest-path pkg/tbtc/signer/Cargo.tomlcargo test --manifest-path pkg/tbtc/signer/Cargo.toml formal_verification_Latest GitHub CI on this branch is green, including
Signer Rust checks,Signer formal invariants, andTLA model checks.Relationship To PR #3866
PR #3866 lands the Go-side DKG coordinator. This PR lands the Rust-side signer. They are complementary protocol implementations:
PR #3866 and this PR can land independently in either order.
Plan Context
This is one of the mergeable mirror PRs comprising the FROST extraction, alongside tbtc-v2, tbtc-subgraph, and tbtc-v3-indexer mirror PRs.