feat: replace placeholder token estimates with real counts from agentic-kit #1210

Merged: pyramation merged 3 commits into main from feat/real-token-counts on May 21, 2026

@pyramation pyramation commented May 21, 2026

Summary

Replaces all placeholder token estimates with real provider counts from @agentic-kit/ollama@2.0.0.

Chat tokens (from earlier commits): OllamaAdapter.stream() returns Usage with real prompt_tokens, completion_tokens, reasoning_tokens, cache_read_tokens, cache_write_tokens. ChatFunction now returns ChatResult { content, usage } instead of string.
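
For reviewers tracing the chat change, here is a minimal sketch of what the new return shape means for callers. The usage field names come from the summary above; which fields are optional, the `ChatMessage` type, the `ChatFunction` signature, and the `fakeChat` stub are assumptions made for illustration, not the actual graphile-llm definitions.

```typescript
// Hypothetical shapes: field names follow the PR summary, optionality is assumed.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  reasoning_tokens?: number;
  cache_read_tokens?: number;
  cache_write_tokens?: number;
}

interface ChatResult {
  content: string;
  usage: Usage;
}

// Assumed message and function signatures, for illustration only.
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };
type ChatFunction = (messages: ChatMessage[]) => Promise<ChatResult>;

// Stand-in for a real provider call so the sketch runs without Ollama.
const fakeChat: ChatFunction = async (messages) => ({
  content: `echo: ${messages[messages.length - 1]?.content ?? ''}`,
  usage: { prompt_tokens: 12, completion_tokens: 5 },
});

// Callers that used to treat the return value as a string now read .content
// and can hand .usage to whatever records token consumption.
async function answer(
  chat: ChatFunction,
  question: string,
  onUsage: (usage: Usage) => void
): Promise<string> {
  const result = await chat([{ role: 'user', content: question }]);
  onUsage(result.usage);
  return result.content;
}
```

The point of the sketch is the migration path: any code that previously did `const text = await chat(...)` needs to switch to `(await chat(...)).content`.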

Embedding tokens (new commit): generateEmbedding() now returns EmbeddingResult { embedding, promptTokens } from Ollama's /api/embed endpoint (prompt_eval_count). Removes the ~4 chars/token placeholder estimate in metering.ts.
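
As a rough illustration of where `promptTokens` comes from: Ollama's `POST /api/embed` takes a body like `{ model, input }` and responds with an `embeddings` array plus a `prompt_eval_count`. The sketch below only covers the mapping from that response to `EmbeddingResult`. The helper name `toEmbeddingResult` and the fallback to 0 when the count is missing are assumptions; the real mapping lives inside @agentic-kit/ollama and may differ.

```typescript
// Subset of the JSON body returned by Ollama's POST /api/embed.
interface OllamaEmbedResponse {
  model: string;
  embeddings: number[][];
  prompt_eval_count?: number;
}

// Shape described in this PR.
interface EmbeddingResult {
  embedding: number[];
  promptTokens: number;
}

// Hypothetical helper: take the first vector (single-input case) and carry
// the provider's token count along with it.
function toEmbeddingResult(res: OllamaEmbedResponse): EmbeddingResult {
  const embedding = res.embeddings[0];
  if (!embedding) {
    throw new Error('Ollama returned no embeddings');
  }
  // Assumption: treat a missing count as 0 rather than estimating it.
  return { embedding, promptTokens: res.prompt_eval_count ?? 0 };
}
```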

Changes

  • Bump @agentic-kit/ollama from ^1.2.1 to ^2.0.0 (breaking: generateEmbedding returns EmbeddingResult)
  • Add EmbeddingResult type to graphile-llm/types.ts, update EmbedderFunction return type
  • metering.ts: replace Math.ceil(text.length / 4) with real promptTokens from provider
  • metering.ts: record prompt_tokens in usage metadata and rawUsage
  • All embedding consumers updated to destructure { embedding, promptTokens }
  • CLI codegen template: extracts .embedding internally (keeps Promise<number[]> API for generated code)
  • Tests updated for new EmbeddingResult shape
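
The metering change in the list above can be pictured as a before/after pair. This is a sketch only: `estimateTokens` reproduces the removed `Math.ceil(text.length / 4)` heuristic, while `UsageRecord` and `usageFromEmbedding` are hypothetical names. Only the `prompt_tokens` keys in `metadata` and `rawUsage` are taken from the PR description.

```typescript
interface EmbeddingResult {
  embedding: number[];
  promptTokens: number;
}

// Before: the placeholder heuristic (~4 characters per token).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// After: hypothetical record shape built from the provider's real count.
interface UsageRecord {
  inputTokens: number;
  metadata: { prompt_tokens: number };
  rawUsage: { prompt_tokens: number };
}

function usageFromEmbedding(result: EmbeddingResult): UsageRecord {
  const { promptTokens } = result;
  return {
    inputTokens: promptTokens,
    metadata: { prompt_tokens: promptTokens },
    rawUsage: { prompt_tokens: promptTokens },
  };
}
```

The estimate depends only on string length, so it can differ from the provider's count for any given input. The provider count is what the model actually tokenized.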

Review & Testing Checklist for Human

  • Verify EmbeddingResult type flows correctly through metering → billing pipeline (check record_usage receives real promptTokens not estimated values)
  • Test with a real Ollama instance: buildEmbedder({ provider: 'ollama' }) should return { embedding: number[], promptTokens: number } with promptTokens > 0
  • Verify CLI --auto-embed still works (codegen template returns number[] not EmbeddingResult)
  • Check that quota-exceeded path in metering.ts logs inputTokens: 0 (no estimate since embedding wasn't called)
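
To make the quota-exceeded item easier to check, here is a sketch of the behaviour it describes, under stated assumptions. `meterEmbedder`, `UsageLog`, and the `hasQuota` callback are hypothetical names, and returning `null` on the quota path is an assumption based on the `number[] | null` return type mentioned in the notes.

```typescript
interface EmbeddingResult {
  embedding: number[];
  promptTokens: number;
}
type EmbedderFunction = (text: string) => Promise<EmbeddingResult>;

// Hypothetical log entry; the real metering.ts record has more fields.
interface UsageLog {
  inputTokens: number;
  status: 'ok' | 'quota_exceeded';
}

function meterEmbedder(
  embedder: EmbedderFunction,
  hasQuota: () => boolean,
  log: (entry: UsageLog) => void
): (text: string) => Promise<number[] | null> {
  return async (text) => {
    if (!hasQuota()) {
      // The provider was never called, so there is no real count to report.
      log({ inputTokens: 0, status: 'quota_exceeded' });
      return null;
    }
    const { embedding, promptTokens } = await embedder(text);
    log({ inputTokens: promptTokens, status: 'ok' });
    // The token count is recorded above and stripped from the return value.
    return embedding;
  };
}
```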

Notes

  • Depends on @agentic-kit/ollama@2.0.0 (published, PR Feat/plan #11 merged in agentic-kit)
  • The as unknown as EmbedderFunction cast in metering-plugin.ts is intentional — the metered wrapper returns number[] | null (stripping token count after recording it), while llmEmbedder on Build is typed as EmbedderFunction. Downstream plugins (text-search, text-mutation) already cast to (text: string) => Promise<number[] | null>.
  • Some files include trailing-comma lint fixes from a previous session (no functional changes)
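
The cast described in the notes can be sketched as follows. `Build` here is a hypothetical stand-in for the real build object, `PlainEmbedder` is an illustrative alias, and the `metered` stub replaces the actual metered wrapper. The exact form of the downstream casts in text-search and text-mutation is not shown in this PR, so the last line is an assumption.

```typescript
interface EmbeddingResult {
  embedding: number[];
  promptTokens: number;
}
type EmbedderFunction = (text: string) => Promise<EmbeddingResult>;
type PlainEmbedder = (text: string) => Promise<number[] | null>;

// Hypothetical stand-in for the Build object that exposes llmEmbedder.
interface Build {
  llmEmbedder: EmbedderFunction;
}

// Stub for the metered wrapper: it has already recorded usage and now
// returns only the vector (or null).
const metered: PlainEmbedder = async (text) => (text.length > 0 ? [0.1, 0.2] : null);

// The declared type and the runtime shape disagree on purpose, so the
// assignment goes through `unknown` to skip the structural check.
const build: Build = { llmEmbedder: metered as unknown as EmbedderFunction };

// Downstream consumers cast back to the shape they actually receive.
const embed = build.llmEmbedder as unknown as PlainEmbedder;
```

The trade-off is that the type system no longer protects anyone who reads `build.llmEmbedder` without casting, which is why the checklist asks reviewers to look at this path.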

Link to Devin session: https://app.devin.ai/sessions/2b5a29d83d3f478e8d3d972653b4879c
Requested by: @pyramation

…ic-kit v1.2.1

- Bump @agentic-kit/ollama from ^1.0.3 to ^1.2.1
- Add LlmUsage and ChatResult types to graphile-llm
- Update ChatFunction to return ChatResult (content + usage) instead of string
- Rewrite chat.ts to use OllamaAdapter.stream() for real token counts
- Update metering.ts to extract real usage from ChatResult
- Rewrite llm-api.ts streaming + non-streaming paths to use OllamaAdapter
- Update rag-plugin to use chatResult.content and chatResult.usage.totalTokens
- Remove all placeholderAmountTokens usage
- Update test mocks to match new OllamaAdapter interface

Embeddings keep ~4 chars/token estimation (Ollama embedding API has no token counts).
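
For context on the `chatResult.usage.totalTokens` mentioned in this commit, one plausible way to derive it from the provider's snake_case counters is sketched below. The `LlmUsage` field names other than `totalTokens`, the `toLlmUsage` helper, and the choice to sum prompt and completion tokens are all assumptions. The actual type in graphile-llm may count reasoning or cache tokens differently.

```typescript
// Provider-side counters as listed in the PR summary (optionality assumed).
interface ProviderUsage {
  prompt_tokens: number;
  completion_tokens: number;
  reasoning_tokens?: number;
  cache_read_tokens?: number;
  cache_write_tokens?: number;
}

// Hypothetical normalized shape; only totalTokens is named in the commit.
interface LlmUsage {
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
}

// Assumption: total = prompt + completion.
function toLlmUsage(usage: ProviderUsage): LlmUsage {
  return {
    promptTokens: usage.prompt_tokens,
    completionTokens: usage.completion_tokens,
    totalTokens: usage.prompt_tokens + usage.completion_tokens,
  };
}
```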

- Bump @agentic-kit/ollama to ^2.0.0 (breaking: generateEmbedding returns EmbeddingResult)
- Add EmbeddingResult type { embedding: number[], promptTokens: number }
- Update EmbedderFunction return type to Promise<EmbeddingResult>
- Replace ~4 chars/token placeholder in metering.ts with real promptTokens
- Update all embedding consumers to destructure { embedding, promptTokens }
- CLI codegen template extracts .embedding internally (keeps Promise<number[]> API)
- Update tests for new return type shape
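
The codegen item above keeps generated code on the old `Promise<number[]>` contract. A minimal sketch of that adapter idea, assuming a hypothetical helper name `toVectorEmbedder` (the real template may inline this differently):

```typescript
interface EmbeddingResult {
  embedding: number[];
  promptTokens: number;
}
type EmbedderFunction = (text: string) => Promise<EmbeddingResult>;

// Hypothetical adapter: unwrap .embedding so generated code that expects a
// bare vector does not need to know about EmbeddingResult.
function toVectorEmbedder(embedder: EmbedderFunction): (text: string) => Promise<number[]> {
  return async (text) => {
    const { embedding } = await embedder(text);
    return embedding;
  };
}
```

Note that the token count is dropped in this path, so anything that needs metering has to sit in front of the adapter rather than behind it.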
devin-ai-integration (bot) changed the title from "feat: replace placeholder token estimates with real counts from agentic-kit v1.2.1" to "feat: replace placeholder token estimates with real counts from agentic-kit" on May 21, 2026
pyramation merged commit 4700a08 into main on May 21, 2026
38 checks passed
pyramation deleted the feat/real-token-counts branch on May 21, 2026 at 23:57