Skip to content

feat(frost/roast): attribute lost-sync triggers; document permissioned-set acceptance#4119

Merged
mswilkison merged 2 commits into
feat/frost-schnorr-migration-scaffoldfrom
fix/roast-lostsync-attribution-logging
Jun 27, 2026
Merged

feat(frost/roast): attribute lost-sync triggers; document permissioned-set acceptance#4119
mswilkison merged 2 commits into
feat/frost-schnorr-migration-scaffoldfrom
fix/roast-lostsync-attribution-logging

Conversation

@mswilkison

Copy link
Copy Markdown
Contributor

Summary

Follow-up to #3866. A small, behavior-preserving hardening + decision record for the ROAST transition-exchange lost-sync path (review finding #2).

Background. In onBundle, a seat that receives a transition bundle for an attempt it never observed trips markLostSync() — failing the wallet's signing retry loop closed — before any verification (a behind-seat lacks the observe handle full VerifyBundle needs). So an authenticated group member can broadcast a structurally-valid bundle with a bogus, never-committed attempt hash and halt every honest seat's signing. The blame bridge (PR2b-2) does not close this: a never-committed attempt produces no evidence to attribute the sender.

Decision (after Codex review + threat analysis). Under the permissioned operator set this residual is accepted: it's liveness-only (fail-closed — never an unsafe/divergent signature), the triggering seat is operator-authenticated (so attribution is immediate), and a misbehaving operator is governance-removable + economically deterred. The proper fix (f+1 snapshot-corroboration, or a resync state) is a real ROAST protocol change whose simple form can fracture the group on legitimate sparse-failure bundles — disproportionate while the set stays permissioned.

Change

  • Attribution logging. When a seat enters lost-sync from an unobserved-attempt bundle, log the sending seat and the claimed attempt-context hash — once per lost-sync episode — so the operational runbook can identify and remove/slash a member spamming bogus-attempt bundles.
  • markLostSync now uses CompareAndSwap and returns whether it transitioned, so the attribution is logged exactly once even while the listener keeps receiving such bundles. No change to the fail-closed semantics.
  • Decision record in-code: documents the permissioned-set acceptance and marks it as a hard item to revisit before any move to a permissionless operator set, where an anonymous/costless/non-attributable DoS would warrant the corroboration/resync fix.

Verification

gofmt + go vet clean; builds under -tags 'frost_native frost_roast_retry cgo frost_tbtc_signer'; the transition-exchange / lost-sync / bundle tests pass. The only caller of markLostSync is the updated site.

Found during review of #3866.

🤖 Generated with Claude Code

…d-set acceptance

When a seat enters lost-sync from a transition bundle for an attempt it never
observed, the triggering bundle is operator-authenticated -- log the sending seat
and the claimed attempt-context hash once per lost-sync episode, so the
operational runbook can attribute and remove/slash a member spamming
bogus-attempt bundles. (This is an authenticated-insider liveness halt the blame
bridge does NOT close, because a never-observed attempt yields no evidence to
attribute it.)

markLostSync now uses CompareAndSwap and returns whether it transitioned, so the
attribution is logged exactly once even while the listener keeps receiving such
bundles. No change to the fail-closed semantics.

Also document that this residual is accepted under the PERMISSIONED operator set
(attributable + liveness-only + governance-removable) and MUST be revisited
before any move to a permissionless set, where the f+1 snapshot-corroboration /
resync fix would be warranted instead of accepting it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 26, 2026

Copy link
Copy Markdown

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: d05e7fec-e214-4e67-b5ba-e807436a89af

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/roast-lostsync-attribution-logging

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@mswilkison mswilkison merged commit f823ad1 into feat/frost-schnorr-migration-scaffold Jun 27, 2026
17 checks passed
@mswilkison mswilkison deleted the fix/roast-lostsync-attribution-logging branch June 27, 2026 22:06
mswilkison added a commit that referenced this pull request Jun 27, 2026
…ror type (#4122)

## Summary

Addresses a Codex finding (relayed via the #4119 review):
`TbtcChain.GetWallet` derived a **legacy** wallet ID on **any** error
from the canonical `walletID` accessor. For a FROST wallet on a
canonical Bridge, a transient call failure would silently yield the
left-padded legacy ID — and callers use `WalletChainData.WalletID` to
choose **P2TR (FROST)** vs **P2WPKH (legacy)** scripts, so the node
would build or search the **wrong wallet script**.

## Why route by scheme (revised after Codex P1)

The first revision distinguished by error type (a sentinel for the
missing accessor, surface everything else). Codex correctly flagged a
**P1 regression**: a *legacy on-chain Bridge* built with the *current*
generated binding still satisfies the accessor interface, so its missing
`walletID` function returns a normal RPC/ABI error — not the sentinel —
and that revision would surface it and **break `GetWallet` on exactly
the legacy deployments the fallback exists for**. Error type cannot
reliably separate "function absent on-chain" from "transient."

So this routes by **scheme**, using the wallet's `EcdsaWalletID` (which
`GetWallet` already reads, and which the codebase already uses to infer
scheme — zero ⇒ FROST):

- **Legacy ECDSA wallet** (`EcdsaWalletID != 0`): its canonical wallet
ID *equals* its legacy derivation, so fall back on **any** accessor
error — and it's the only option on a legacy Bridge lacking the
accessor.
- **FROST wallet** (`EcdsaWalletID == 0`): requires the canonical ID;
**surface** the error rather than return a wrong legacy ID. A FROST
wallet only exists on a canonical-ID Bridge, so such an error is
genuinely transient.

Logic is extracted into `resolveWalletID(bridge, walletPublicKeyHash,
ecdsaWalletID)`.

## Tests

`TestResolveWalletID` covers all four cases: accessor success →
canonical; FROST + accessor error → surfaced; **legacy + accessor error
→ legacy fallback** (the P1 regression guard — verified to fail if the
routing surfaces errors for legacy wallets); legacy + missing-accessor
binding → legacy fallback. gofmt + `go vet` clean; full
`pkg/chain/ethereum` suite passes.

_Found during the Codex review batch on #4115#4120; revised per the
Codex P1 re-review._

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant