perf(tui): fix the five render/input hot paths (#3896–#3900)#3902
perf(tui): fix the five render/input hot paths (#3896–#3900)#3902Hmbown wants to merge 8 commits into
Conversation
task_panel_rows() and task_panel_hover_texts() each independently called active_tool_rows(), reasoning_task_rows(), background_task_rows() (a clone+sort of app.task_panel), and recent_tool_rows() on every frame the Tasks panel was visible — doubling the row-building work per frame. Introduce TaskPanelRowSets, computed once in render_sidebar_tasks and passed by reference to both consumers. The focus-conditional live-tool dedup argument to background_task_rows (Tasks focus dedups, Auto mode does not) is preserved and now evaluated in one place. As a side benefit, both consumers now see one snapshot, so hover rows can no longer disagree with rendered rows across an Instant::elapsed TTL boundary within a single frame. Also stop double-cloning every SidebarAgentRow in sort_sidebar_agent_rows_as_tree: emit an index order first, then materialize by move (the `seen` set guarantees each index is emitted exactly once), noted as a related win in the issue. New tests pin the focus-conditional background dedup, rows/hover snapshot alignment, and exactly-once emission of agent rows (including the rootless parent-cycle orphan sweep). Fixes #3898
Toggling a directory in the file-tree pane called rebuild(), which re-walked the workspace root plus every expanded subtree with synchronous read_dir on the UI thread — freezing the UI for the duration of the walk on large directories. Only the initial build was offloaded (#399 S3). Collapse now needs no I/O at all: the directory's visible descendants are spliced out of the flat entry list in memory. Descendant expansion state stays in expanded_dirs, so re-expanding the parent restores it, exactly as the full rebuild did. Expand marks the entry expanded immediately (the ▼ acknowledges the keypress), then walks only the newly expanded subtree on a background thread via the same spawn_blocking_supervised + poll-cell pattern as the initial build, splicing the children in when the walk lands. Per-directory sequence numbers plus a collapse-clears-pending rule guarantee a stale walk can never splice into a re-toggled or collapsed node. Without a tokio runtime (plain unit tests) both the initial build and expand walk run synchronously, preserving test semantics. One behavioral delta, deliberate: the old full rebuild re-read every expanded directory on each toggle, so unrelated filesystem changes were incidentally picked up; the splice path only reads the newly expanded subtree. Tests cover splice-vs-full-rebuild parity across expansion orders, I/O-free collapse, descendant-state restoration, stale-result discarding, cursor stability across splices, and the async end-to-end path. Fixes #3900
… path With tool-run auto-collapse active (the default once any run of >=3 successful low-risk tools exists), ChatWidget::new rebuilt a Vec<HistoryCell> of every non-collapsed cell on every frame — deep-cloning the entire visible transcript (assistant/user bodies, tool outputs, sub-agent cards) at frame rate even when nothing had changed. The per-cell revision cache underneath never touches an unchanged cell, so the clones were pure waste. The filtered path now borrows cells instead of cloning them: the few synthetic tool-run summary cells are materialized into a stable local Vec first, then the filter builds a Vec<&HistoryCell> over history, active entries, and those summaries. TranscriptViewCache gains an ensure_filtered(&[&HistoryCell], ...) entry point; ensure_split and the new variant both delegate to one private iterator-driven helper, so the render/reuse/invalidation logic stays a single code path and the fast path plus all existing callers are untouched. Filtered sequence, revision values, collapsed_cell_map, and folded-thinking resolution are unchanged, so output is identical — now with zero cell clones on every frame, changed or not. New tests: ensure_filtered/ensure_split output parity, per-cell reuse through ensure_filtered, frame-stability of the collapse path at the ChatWidget level, and a collapsed run spanning the history/active boundary. Fixes #3896
The completion cache keys on the exact partial, so every keystroke
that changed the needle missed and ran a fresh synchronous workspace
walk on the UI thread — a gitignore-aware walk plus, for path-like
needles ('.', '/', '\\'), the gitignore-disabled local-reference walk
capped at 4096 paths (the same walk behind the #1921 WSL2 freezes).
Typing a path-like @mention in a large repo stalled the input loop on
every keystroke.
The walk now runs on a background thread (spawn_blocking_supervised +
shared-cell poll, same shape as the file-tree build and
workspace-context refresh): the cache key is factored into
MentionCompletionKey, at most one walk is in flight, and the latest
keystroke wins — a completed walk for an older needle is kept only as
the transient popup fallback under its true key, never returned as
the live needle's result. While a walk is in flight the popup keeps
showing the previous keystroke's entries (same session context only)
so it never closes mid-word and Enter keeps routing to the popup
instead of submitting the message. poll_mention_completion drains
finished walks once per event-loop iteration, sets needs_redraw only
when a result actually landed, and drops orphaned walks when the
cursor leaves the mention token.
Without a tokio runtime the walk runs synchronously, so completion
results for a given input are byte-identical to before and the
existing plain #[test] suite (including the cursor-move cache-reuse
and freshness pins) passes unchanged. Tab completion
(try_autocomplete_file_mention) stays synchronous: a deliberate
one-shot action, unlike per-keystroke popup refreshes.
Fixes #3899
…ing the whole message While a message streams, every committed chunk bumped the cell's revision and re-rendered the entire growing content: a full markdown parse plus a full word-wrap/span build, O(N) per chunk and O(N²/chunk) over the life of the stream. During an active stream chunks commit on nearly every 24ms loop tick, so long answers got visibly slower to render as they grew. The Ctrl+T overlay doubled the cost at a second width. The parser makes prefix stability provable: classification is per line with a single carried bool (the code-fence state), so appended bytes can never re-classify an earlier complete line — only the partial trailing line is volatile. Rendering is per block except consecutive table rows, whose column widths derive from the whole group, so a trailing table run stays volatile until a non-table block closes it. render_markdown_tagged_streaming keeps a small thread-local LRU of StreamRenderState slots (main transcript + overlay widths + a thinking body can interleave): per call it verifies the committed prefix with one memcmp, parses only newly completed lines, freezes their rendered lines, and re-renders just the volatile tail. Parse and render go through the same helpers as the full path (parse_line_into, render_blocks_tagged), and a debug_assert differential guard compares against the full re-parse on every call in debug builds, so the whole existing suite doubles as a byte-identity test. Streaming assistant cells route through the new path via render_assistant_message*; the final render after streaming ends takes the plain full-parse path, so finished messages are a full re-parse by construction. New tests: per-chunk byte-identity across chunk sizes 1/3/7 and widths 24/40/80 over headings/lists/fences/tables/links/CJK, an O(delta) parsed-line-count guard, non-extension fallback, two-width interleaving, trailing-table volatility, and char-by-char unclosed fences. Fixes #3897
An adversarial review of the five perf commits confirmed four real regressions in the two fixes that moved work off the UI thread; this closes all of them. File tree (#3900): a finished background expand walk never triggered a redraw — with the loop idle, an expanded directory looked empty until the next unrelated input event. The event loop now drains file-tree walks (initial build and expands) each iteration via poll_background() and sets needs_redraw when results were applied, matching the @mention poll wiring. Mention completion (#3899), three fixes: - Enter could submit a half-typed @mention during the first walk's in-flight window (popup routing keyed only on non-empty entries; the pre-async walk finished inside the key event, making that impossible). Enter is now held with a "Scanning files for @…" status while a walk for the live token is pending and the popup has nothing to show; Esc hides the pending popup; forced Ctrl+Enter submit keeps priority. - The stale popup fallback was unbounded: cached results for an UNRELATED earlier mention token could be shown and Enter/Tab would splice the wrong file over the live token. The fallback now requires same token lineage (live partial extends the cached one or vice versa) in addition to same context. - Superseded in-flight walks were dropped and respawned per keystroke; since a blocking walk cannot be cancelled, fast typing stacked concurrent full-workspace walks. The in-flight walk is now retained until it lands (its stale result is discarded), so real walks are serialized — at most one running — and the fresh needle's walk starts on the next call after landing. Tests: pending-expand drain requests exactly one repaint; fallback lineage positive/negative; in-flight walk retained across a needle change; mention_walk_pending true only for the live, un-hidden token. Follow-up to #3899/#3900 (PR #3902).
Resolves the conflicts with the #3861 constitution-first-setup merge: - @mention completion: main landed the #3757 candidate cache (one needle-independent walk serves every keystroke of a non-path-like mention token, ranked in memory with a 4s TTL). Kept it as the fast path, adapted to the MentionCompletionKey cache form, and kept this branch's #3899 background walk for exactly the needles the candidate cache bypasses (path-like and browser) — the two fixes compose: non-path needles rarely walk at all, and when a walk is needed it no longer blocks the UI thread. - app.rs: kept both structs/fields (MentionCompletionPending from #3899, MentionCandidateCache from #3757). - sidebar.rs: kept both tests added at the same location (this branch's tree-sort exactly-once test and main's terminal-row collapse test). cargo test -p codewhale-tui --locked after the merge: 5715 passed, 1 failed (sandbox::tests::test_parity_linux_landlock_available — pre-existing container-environment failure, kernel lacks Landlock).
Review — perf hot paths (verifying output-equivalence, not just speed)Read-only review focused on the thing that matters for a perf change to render code: does each optimized path produce byte-identical output, and does every new cache invalidate correctly? Traced each fix against its issue. Overall: merge-ready, with two tracked follow-ups. 4/5 fixes are correct with strong equivalence tests; the one real gap is bounded and has an Esc escape hatch. Per-fix verdicts
Suggested follow-ups (neither blocks the perf win)
Test coverage is genuinely strong (not asserted-by-comment) — the one untested window is the #3899 fast-Enter-applies-stale case above. (Detailed read; happy to open a small follow-up PR for the #3899 items if useful.) |
1. Strict equivalence for the stale popup fallback: entries are now filtered to those still matching the LIVE needle (case-insensitive substring, mirroring the walk's own match rule), so a fast Enter after a keystroke can only apply an entry the fresh walk would also have returned — the reviewer's `docs/apple.md` vs `@docs/ab` repro is pinned by a test. When the filter empties the list the popup reads as pending and the existing Enter-hold takes over. 2. A panicking walk can no longer soft-lock Enter for its token: the walker closure now routes its result through a completion guard whose Drop writes an empty result on unwind, so mention_completion_pending always drains. Requested in review on PR #3902.
|
Both #3899 follow-ups are in as
On the #3897 caveat: agreed and leaving the differential guard debug-gated as you suggest — every Post-push suite: 5717 passed / 1 failed ( Generated by Claude Code |
Summary
Fixes all five open
performance-labeled issues, one commit per issue, plus a follow-up commit closing four regressions found by an adversarial multi-agent review of the branch, plus a merge ofmain(v0.8.67).TaskPanelRowSetsis computed once inrender_sidebar_tasksand shared by the row renderer and hover-text builder; focus-conditional live-tool dedup preserved. Also removes the second full clone of every agent row insort_sidebar_agent_rows_as_tree.TranscriptViewCache::ensure_filtered;ensure_splitand the new variant share one iterator-driven helper. Zero cell clones per frame, byte-identical output.debug_assertdifferential guard makes the entire suite double as a byte-identity test against the full re-parse.Adversarial review round (34 reviewer/verifier agents over the five commits — sidebar/transcript/markdown came back clean): confirmed and fixed four async-window regressions in commit
80e058a— file-tree expand results now trigger a redraw when they land; Enter can no longer submit a half-typed mention during the first walk's in-flight window; the stale popup fallback is bounded to the same token lineage (no wrong-file splice); superseded walks are serialized instead of stacking concurrently.Merge with v0.8.67 (
4504153): main's #3757 candidate cache (one needle-independent walk serves every keystroke of a non-path-like mention token, ranked in memory) is kept as the fast path, adapted to the key-based cache form; this branch's #3899 background walk now covers exactly the needles that cache bypasses (path-like and browser). The two fixes compose: non-path needles rarely walk at all, and when a walk is needed it never blocks the UI thread.Testing
cargo fmt --all -- --checkcargo clippy --workspace --all-targets --all-featurescargo test -p codewhale-tui --locked(post-merge) — 5715 passed / 1 failed:sandbox::tests::test_parity_linux_landlock_available, which asserts the host kernel exposes Landlock and fails in this container on any branch (environment-only, untouched by this PR)cargo test --workspace --all-featuresTargeted suites post-merge:
mention50,sidebar::80,file_tree8,transcript::31,widgets::188,markdown_render38,history148 — all green. Each commit message describes its new tests.Checklist
🤖 Generated with Claude Code
https://claude.ai/code/session_01Ffi9Tm4TfXhK5ngpA68xw3