Skip to content

[DNM] feat(da): support fiber (not via c-node)#3244

Draft
julienrbrt wants to merge 17 commits intomainfrom
julien/fiber
Draft

[DNM] feat(da): support fiber (not via c-node)#3244
julienrbrt wants to merge 17 commits intomainfrom
julien/fiber

Conversation

@julienrbrt
Copy link
Copy Markdown
Member

@julienrbrt julienrbrt commented Apr 13, 2026

Overview

Support Fiber client (based on https://github.com/celestiaorg/celestia-app/blob/63fbf31cca216fc4e067a9e1b3a3431115c7009b/fibre), but not via celestia node or apex for this PoC
celestiaorg/celestia-node#4892

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 13, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 26972046-f093-4d23-a916-c630c53e181f

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch julien/fiber

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 13, 2026

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

BuildFormatLintBreakingUpdated (UTC)
✅ passed⏩ skipped✅ passed✅ passedApr 22, 2026, 8:08 PM

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Apr 13, 2026

Claude finished @julienrbrt's task in 3m 52s —— View job


Code Review: [DNM] feat(da): support fiber (not via c-node)

  • Read AGENTS.md and project guidelines
  • Review fiber client implementation (block/internal/da/fiber_client.go)
  • Review fiber types (block/internal/da/fiber/types.go)
  • Review fibremock (block/internal/da/fibremock/mock.go)
  • Review tests (fiber_client_test.go, fibremock/mock_test.go)
  • Review public API changes (block/public.go)
  • Review config changes (pkg/config/config.go, pkg/config/defaults.go)
  • Review sequencer changes (pkg/sequencers/solo/sequencer.go)
  • Review CLI changes (pkg/cmd/run_node.go)
  • Review local-fiber tool (tools/local-fiber/)
  • Compile and summarize findings

Overall Assessment

Good foundational PoC structure — the layered design (interface in fiber/types.go, adapter in tools/local-fiber/, internal client in da/fiber_client.go) is clean and the mock is well-implemented. A few correctness issues and one security concern need addressing before this can graduate from PoC.


Critical Issues

1. Subscribe — download hangs without a timeout

block/internal/da/fiber_client.go:252

// BUG: no timeout — if Download blocks, the goroutine hangs until ctx is cancelled
blobData, err := c.fiber.Download(ctx, event.BlobID)

Unlike Get and Submit, the Download call inside Subscribe doesn't wrap the context in context.WithTimeout. If the backing client stalls, the subscriber goroutine blocks indefinitely. Fix this →

2. Subscribe — nil blob silently propagated downstream

block/internal/da/fiber_client.go:253-256

if err != nil {
    c.logger.Error().Err(err).Bytes("blob_id", event.BlobID).Msg("failed to retrier blob id")
}
// blobData is nil here, but still sent downstream
select {
case out <- datypes.SubscriptionEvent{..., Blobs: [][]byte{blobData}}:

When Download fails, blobData is nil and the event is still sent with Blobs: [][]byte{nil}. Callers consuming this channel will receive a nil blob without any indication of failure. The event should be skipped (continue) on error. Also note the typo: "retrier" → "retrieve". Fix this →

3. fibremock — subscriber removal is racey with stale indices

block/internal/da/fibremock/mock.go:173-185

go func() {
    <-ctx.Done()
    m.mu.Lock()
    last := len(m.subscribers) - 1
    if idx <= last {              // idx was captured at subscription time
        m.subscribers[idx] = m.subscribers[last]  // but idx may have shifted!
    }
    m.subscribers = m.subscribers[:last]

The swap-with-last pattern breaks when another subscriber that was appended after the current one is removed first (by swapping it into an earlier slot), silently changing this subscriber's effective index. Example: subscribe A(idx=0), B(idx=1), C(idx=2). If B's context cancels first and B swaps with C (now C is at idx=1), when C's context cancels it tries to remove at idx=2 which is out of bounds (idx > last), so the swap is skipped, but m.subscribers[:last] truncates the wrong entry. Use a pointer-based or set-based subscriber map to avoid this. Fix this →

4. tools/local-fiber/keyring.go — test keyring backend in production code

tools/local-fiber/keyring.go:21

kr, err := keyring.New("ev-node", keyring.BackendTest, keyringPath, nil, nil)

keyring.BackendTest stores private keys unencrypted on disk. It is intended only for automated test environments. This file is in tools/local-fiber/ and will be wired into production nodes via the CLI flags added in pkg/cmd/run_node.go. The backend should be configurable (defaulting to keyring.BackendFile or keyring.BackendOS). Fix this →

5. tools/local-fiber/adapter.goListen is a no-op, sync path is broken

tools/local-fiber/adapter.go:63-67

func (a *Adapter) Listen(_ context.Context, _ []byte) (<-chan block.FiberBlobEvent, error) {
    ch := make(chan block.FiberBlobEvent)
    close(ch)   // immediately closed — no events will ever arrive
    return ch, nil
}

The fiberDAClient.Subscribe method depends entirely on Listen for its event stream. With this stub, full-node sync via Subscribe will never receive any blobs. Either wire in the real Fiber streaming API or return an explicit errors.ErrUnsupported so callers can handle it gracefully. The current silent close is especially dangerous because the Subscribe goroutine will see the closed channel and exit without logging anything.


High Severity

6. block/public.gopanic in a constructor

block/public.go:92-94

base, err := da.NewFiberClient(da.FiberConfig{...})
if err != nil {
    panic(err)
}

NewFiberClient only errors when cfg.Client == nil, which is checked by the caller, so this panic is unreachable in practice — but it's still bad API hygiene. Change NewFiberDAClient to return (FullDAClient, error) and propagate it. Fix this →

7. No config validation for Fiber fields

pkg/config/config.go:505-589

Config.Validate() has no check for FiberDAConfig fields when Fiber.Enabled = true. A node will panic/fail deep in startup if StateAddress or KeyringPath is empty. Add validation in Validate():

if c.DA.IsFiberEnabled() {
    if c.DA.Fiber.StateAddress == "" {
        return errors.New("da.fiber.state_address is required when fiber is enabled")
    }
    if c.DA.Fiber.KeyringPath == "" {
        return errors.New("da.fiber.keyring_path is required when fiber is enabled")
    }
    if c.DA.Fiber.KeyName == "" {
        return errors.New("da.fiber.key_name is required when fiber is enabled")
    }
}

Fix this →

8. latestHeight is an internal sequence counter, not a real DA height

block/internal/da/fiber_client.go:163

submitHeight := c.latestHeight.Add(1)

This counter starts at 0 per process restart and is not synchronized with the actual Celestia chain height. GetLatestDAHeight returns this counter. Any code that tries to correlate this with real block heights (e.g., DA sync logic) will get wrong results. This should either:

  • Be documented prominently in GetLatestDAHeight's godoc that it returns a local sequence number, not a chain height, or
  • Map to real heights via the event BlobEvent.Height which the Fiber protocol does provide

9. ExpiresAt is hardcoded in the adapter

tools/local-fiber/adapter.go:45

ExpiresAt: time.Now().Add(24 * time.Hour),

The Fiber protocol's actual retention period depends on validator policy and isn't guaranteed to be 24h. This should either come from the fibre.UploadResult if available, or be left as time.Time{} (zero = unknown).


Medium Severity

10. Duplicate blob data causes map collision in MockDA

block/internal/da/fibremock/mock.go:106-114

Because mockBlobID is deterministic (SHA256 of data), uploading the same data twice writes the same key twice to m.blobs (overwriting the first) but appends the key to m.order twice. This means eviction logic counts wrong and the same key can appear multiple times in m.order, causing incorrect eviction behavior. This is visible in TestMockDA_DeterministicBlobID which uploads the same data twice but doesn't verify BlobCount(). Consider deduplicating by checking m.blobs[key] != nil before appending to m.order.

11. Many important tests are skipped

block/internal/da/fiber_client_test.go

The following tests are t.Skip()'d with "not implemented":

  • TestFiberClient_Retrieve_Success / RetrieveBlobs_Success
  • TestFiberClient_Retrieve_NamespaceFiltering
  • TestFiberClient_Submit_MultipleBlobs
  • TestFiberClient_GetProofs_Success
  • TestFiberClient_Validate_Success / Validate_MismatchedLengths / Validate_WrongProof / Validate_EmptyProof
  • TestFiberClient_FullSubmitRetrieveCycle

Validate_MismatchedLengths and Validate_WrongProof are both skipped, but the current implementation returns true for all IDs regardless — these tests would fail if un-skipped. That means the skip masks a correctness issue, not just a missing feature.

12. GetProofs/Validate stubs return always-true

block/internal/da/fiber_client.go:277-298

// not implemented.
for i := range results {
    results[i] = true  // all proofs "valid"
}

These stubs are fine for a PoC, but should at least return an explicit sentinel error or use datypes.StatusNotImplemented so callers know not to rely on the result rather than silently trusting it.


Low / Style

13. Submitwg is never Wait()'d

block/internal/da/fiber_client.go:115-128

The sync.WaitGroup (wg) is used to spawn workers but wg.Wait() is never called. On an early error return, remaining in-flight uploads continue running against the buffered respCh and exit cleanly (no goroutine leak due to the buffered channel), but the wg itself is wasted. Either add defer wg.Wait() after the workers start, or remove wg entirely and rely on the taskCh close for coordination.

14. Inconsistent logging in tools/local-fiber/client.go

tools/local-fiber/client.go:40

fibreCfg.Log = slog.Default().WithGroup("fibre")

The rest of the codebase uses zerolog. Using slog.Default() means Fiber log output bypasses the configured log level and format. Thread the zerolog.Logger through Config and bridge it to slog via a zerolog-slog adapter.

15. fiberSubscribeChanSize = 42

block/internal/da/fiber_client.go:229

A buffer size of 42 is puzzling without a comment. If this is a deliberate balance between memory and backpressure, document it; otherwise use a round number or a named constant with a comment.


Sequencer Changes (pkg/sequencers/solo/sequencer.go)

The incidental changes to SoloSequencer are positive improvements:

  • Reusing pre-allocated emptyBatch, submitBatchResp, and verifyBatchOKResp avoids allocations on the hot path
  • The in-place partition loop (write cursor instead of append) is a good optimization
  • Early return on FilterTxs error with unfiltered txs is cleaner than the previous alloc-then-fill pattern
  • Removing the FilterRemove case keeps the switch exhaustive-by-default (was effectively a no-op before)

Summary

Severity Count Key items
Critical 5 Subscribe timeout/nil-blob, mock subscriber race, test keyring backend, Listen stub
High 4 panic in constructor, missing config validation, fake DA height, hardcoded TTL
Medium 3 mock deduplication bug, skipped tests masking correctness issues, always-true Validate
Low 3 WaitGroup unused, slog/zerolog mismatch, magic buffer size

The sequencer micro-optimizations are clean. The Fiber architecture is sound for a PoC, but the security issue with BackendTest and the broken Listen stub should be fixed before this is wired into any non-test deployment.

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 13, 2026

Codecov Report

❌ Patch coverage is 90.65657% with 37 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.16%. Comparing base (2865d6d) to head (4485d91).
⚠️ Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
block/public.go 0.00% 12 Missing ⚠️
block/internal/da/fibremock/mock.go 90.90% 5 Missing and 5 partials ⚠️
block/internal/da/fiber_client.go 96.74% 5 Missing and 3 partials ⚠️
pkg/sequencers/solo/sequencer.go 61.53% 5 Missing ⚠️
pkg/config/config.go 75.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3244      +/-   ##
==========================================
+ Coverage   62.33%   63.16%   +0.82%     
==========================================
  Files         122      124       +2     
  Lines       12873    13258     +385     
==========================================
+ Hits         8024     8374     +350     
- Misses       3968     3995      +27     
- Partials      881      889       +8     
Flag Coverage Δ
combined 63.16% <90.65%> (+0.82%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

julienrbrt and others added 7 commits April 14, 2026 15:12
Adds a fibremock package with:
- DA interface (Upload/Download/Listen) matching the fibre gRPC service
- In-memory MockDA implementation with LRU eviction and configurable retention
- Tests covering all paths

Migrated from celestiaorg/x402-risotto#16 as-is for integration.
@julienrbrt julienrbrt changed the title feat(da): support fiber (not via c-node) [DNM] feat(da): support fiber (not via c-node) Apr 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants