feat: refine token usage display modes #2329
Open
Layau-code wants to merge 4 commits into bytedance:main
Conversation
Contributor
Pull request overview
Refines token usage UX in the workspace by introducing explicit display modes, aggregating per assistant turn by default, and enabling step-level “debug” attribution backed by structured metadata from the backend.
Changes:
- Add token usage view presets (Off/Summary/Per turn/Debug) and persist preferences in local settings.
- Aggregate inline token usage once per assistant turn, with optional step-level debug rendering and labels.
- Annotate AI steps on the backend with `token_usage_attribution` and preserve `additional_kwargs` through client serialization/streaming.
Reviewed changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| frontend/tests/unit/core/messages/utils.test.ts | Adds coverage for per-turn aggregation helper. |
| frontend/tests/unit/core/messages/usage-model.test.ts | Adds coverage for presets/preferences mapping and debug step labeling/fallback behavior. |
| frontend/src/core/settings/local.ts | Introduces persisted tokenUsage local setting defaults and merge behavior. |
| frontend/src/core/messages/utils.ts | Exposes getMessageGroups and adds getAssistantTurnUsageMessages for per-turn usage aggregation. |
| frontend/src/core/messages/usage-model.ts | New model for token usage presets/preferences and step-level debug labeling (incl. backend attribution parsing). |
| frontend/src/core/i18n/locales/zh-CN.ts | Adds new token usage strings (presets, descriptions, debug labels). |
| frontend/src/core/i18n/locales/types.ts | Extends Translations types for new token usage UI strings. |
| frontend/src/core/i18n/locales/en-US.ts | Adds new token usage strings (presets, descriptions, debug labels). |
| frontend/src/components/workspace/token-usage-indicator.tsx | Replaces tooltip indicator with dropdown selector + totals display and preference updates. |
| frontend/src/components/workspace/messages/message-token-usage.tsx | Switches to per-turn summary rendering and adds debug list renderer. |
| frontend/src/components/workspace/messages/message-list.tsx | Wires inline token usage modes, per-turn aggregation, debug step rendering, and assistant-turn copy behavior. |
| frontend/src/components/workspace/messages/message-list-item.tsx | Removes per-message token usage rendering and adjusts copy toolbar behavior/positioning. |
| frontend/src/components/workspace/messages/message-group.tsx | Integrates step-level token debug summaries into chain-of-thought/tool call rendering. |
| frontend/src/app/workspace/chats/[thread_id]/page.tsx | Loads/saves local token usage preferences and passes inline mode to message list. |
| frontend/src/app/workspace/agents/[agent_name]/chats/[thread_id]/page.tsx | Same as above for agent-specific chat route. |
| backend/tests/test_token_usage_middleware.py | Adds tests for structured attribution metadata emitted by middleware. |
| backend/tests/test_client_message_serialization.py | Ensures additional_kwargs are preserved during message serialization. |
| backend/tests/test_client.py | Tests streaming behavior for propagating additional_kwargs updates to clients. |
| backend/packages/harness/deerflow/client.py | Preserves/streams additional_kwargs (incl. attribution) for AI/tool/human/system messages. |
| backend/packages/harness/deerflow/agents/middlewares/token_usage_middleware.py | Adds step attribution annotation logic and attaches it to AI messages via additional_kwargs. |
Comments suppressed due to low confidence (1)
frontend/src/components/workspace/messages/message-list.tsx:298
- In the subagent rendering loop, the `subtask-count` element is pushed for every AI message but always uses the same React key (`"subtask-count"`). If `group.messages` contains more than one AI message, this produces duplicate keys and can cause unstable rendering. Consider moving the count element outside the loop, or including `message.id`/index in the key so each entry is unique.
```tsx
  <div
    key="subtask-count"
    className="text-muted-foreground pt-2 text-sm font-normal"
  >
    {t.subtasks.executing(tasks.size)}
  </div>,
);
```
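The suggested fix can be sketched with a small helper, assuming each AI message exposes an `id` (the helper name is hypothetical, not code from this PR):

```typescript
// Hypothetical helper: derive a unique React key for the subtask-count
// element from the AI message's id, so keys stay unique inside the loop.
function subtaskCountKey(messageId: string): string {
  return `subtask-count-${messageId}`;
}

// In the loop, each AI message then yields a distinct key:
//   key={subtaskCountKey(message.id)}
console.log(subtaskCountKey("msg-1")); // "subtask-count-msg-1"
```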
Fixes #2313
Summary
This PR refines how token usage is displayed in the workspace.
The previous implementation exposed token usage at too fine a granularity in multi-step responses. In tool-call, subagent, and planning-heavy turns, a single assistant reply could render multiple token usage entries, which made the UI noisy and hard to understand.
This follow-up makes the display granularity explicit and introduces selectable token usage modes, with a cleaner default experience.
Changes
Token usage display modes
Add selectable token usage display modes in the workspace header:
- Off: hide token usage
- Summary: show only the top-level token total
- Per turn: show one aggregated token usage entry for each assistant reply
- Debug: show step-level token attribution for inspection/debugging

The default inline experience is now Per turn, which better matches the original goal: one user request + one assistant response should feel like one token usage unit.
Per-turn aggregation
Instead of rendering token usage for each internal step in a grouped assistant response, the UI now aggregates usage across the assistant turn and renders a single inline token usage item.
This keeps normal conversations readable while preserving cost visibility.
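The aggregation idea can be sketched as follows. The types are assumptions for illustration; the real message shapes live in `frontend/src/core/messages`, and this is not the actual implementation of `getAssistantTurnUsageMessages`:

```typescript
// Hypothetical message/usage shapes (assumed, not the PR's real types).
interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
}

interface StepMessage {
  role: "ai" | "tool" | "human" | "system";
  usage?: TokenUsage;
}

// Sum usage across every step of one assistant turn into a single entry,
// so a tool-call-heavy reply renders one inline usage item instead of many.
function aggregateTurnUsage(steps: StepMessage[]): TokenUsage {
  return steps.reduce<TokenUsage>(
    (total, step) => ({
      inputTokens: total.inputTokens + (step.usage?.inputTokens ?? 0),
      outputTokens: total.outputTokens + (step.usage?.outputTokens ?? 0),
    }),
    { inputTokens: 0, outputTokens: 0 },
  );
}
```

Steps without usage metadata (e.g. tool results) simply contribute zero, so the turn total stays well-defined even for mixed message sequences.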
Debug mode for step-level attribution
Step-level token usage is still available, but it is now treated as a debug-oriented mode rather than the default experience.
In debug mode, token usage is attached to specific step labels where possible, instead of rendering as a detached list of raw token lines.
For example, when a single AI step covers multiple actions, the UI explicitly treats it as a shared step total instead of pretending to provide exact per-tool token splits.
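That labeling rule can be expressed as a small function. This is a sketch; the function name and label wording are assumptions, not the PR's actual i18n strings:

```typescript
// Hypothetical debug labeling: a single AI step that triggered several
// tool calls gets one shared-total label rather than per-tool splits.
function debugStepLabel(toolNames: string[]): string {
  if (toolNames.length === 0) return "model step";
  if (toolNames.length === 1) return toolNames[0];
  return `${toolNames.length} tool calls (shared step total)`;
}
```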
Backend attribution metadata
The backend now annotates AI steps with structured `token_usage_attribution` metadata. This gives the frontend a more reliable attribution source for debug mode, especially for:
- `write_todos`

The frontend still keeps a safe fallback path when attribution is missing or malformed.
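The validate-or-fall-back pattern can be sketched like this. The entry field names are assumptions; the point is strict validation with a `null` result signaling the safe fallback path:

```typescript
// Hypothetical shape of one attribution entry (field names assumed).
interface StepAttribution {
  label: string;
  inputTokens: number;
  outputTokens: number;
}

// Parse token_usage_attribution out of a message's additional_kwargs.
// Any missing or malformed data returns null, which tells the caller to
// use the fallback rendering instead of trusting partial attribution.
function parseAttribution(
  kwargs: Record<string, unknown>,
): StepAttribution[] | null {
  const raw = kwargs["token_usage_attribution"];
  if (!Array.isArray(raw)) return null;
  const result: StepAttribution[] = [];
  for (const entry of raw) {
    if (typeof entry !== "object" || entry === null) return null;
    const { label, inputTokens, outputTokens } = entry as Record<string, unknown>;
    if (
      typeof label !== "string" ||
      typeof inputTokens !== "number" ||
      typeof outputTokens !== "number"
    ) {
      return null;
    }
    result.push({ label, inputTokens, outputTokens });
  }
  return result;
}
```

Rejecting the whole payload on the first malformed entry keeps debug mode honest: it never shows a mix of validated and guessed numbers.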
Streaming/client consistency
Structured attribution metadata is now preserved in client serialization and streaming-related paths, so step-level token views do not depend only on history snapshots.
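Conceptually, the streaming side amounts to a shallow merge of each chunk's `additional_kwargs` into the client-side message (a simplified sketch, not the client's actual code):

```typescript
// Sketch: carry additional_kwargs (including attribution) forward as
// chunks stream in, so step-level token views work mid-stream rather
// than only after a history snapshot arrives.
function mergeAdditionalKwargs(
  existing: Record<string, unknown>,
  incoming?: Record<string, unknown>,
): Record<string, unknown> {
  return incoming ? { ...existing, ...incoming } : existing;
}
```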
Test Results
Backend
Commands:
Result:
Frontend
Commands:
pnpm format:write
pnpm check
pnpm test
Result:
Misc
Commands:
Result:
Checklist
- `make format`
- `make lint`
- `PYTHONPATH=/app/backend uv run pytest tests/test_client.py tests/test_client_message_serialization.py tests/test_token_usage_middleware.py -v`
- `pnpm format:write`
- `pnpm check`
- `pnpm test`
- `git diff --check` passes

Notes
Debug mode attribution remains at the AI-step level and does not attempt exact tool-level token accounting.