feat(mcp): local semantic + hybrid search over saved transcripts by r3dbars · Pull Request #1326 · r3dbars/transcripted

r3dbars · 2026-06-25T18:10:31Z

What

Adds local, on-device semantic search to the read-only MCP server so paraphrase queries hit: for example, search("pricing pushback") now finds "they balked at the cost". Today search is SQLite FTS5 lexical-only (TranscriptIndex.swift). This complements FTS; it does not replace it.

This is NEXT_WORK.md item #6 (L), the one area where cloud RAG still beat us.

Model choice — lightest viable, zero bundle cost

The search/index lives in the standalone Tools/TranscriptedMCP SwiftPM package, which intentionally has no dependency on the app's MLX/FluidAudio model infra and builds via plain swift build. So I picked the lightest viable on-device option:

Apple NaturalLanguage — NLEmbedding.sentenceEmbedding(for: .english) (512-dim).

- Bundled model: none. Download: none. It's built into macOS, so negligible incremental app size, modest RAM, fully on-device/offline.
- Builds in the standalone package with just import NaturalLanguage: no prebuilt frameworks, no CoreML packaging, no tokenizer to ship, no CI surface.
- Behind an EmbeddingProvider protocol, so a bundled CoreML model (e.g. quantized MiniLM) can replace it later for higher quality without touching the store or search path.
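
A minimal sketch of what the provider seam could look like. The protocol members, the `NLSentenceProvider` type, the `modelID` string, and `makeDefaultProvider` are illustrative assumptions, not the names used in EmbeddingProvider.swift; only the `NLEmbedding` calls are real NaturalLanguage API.

```swift
import Foundation
#if canImport(NaturalLanguage)
import NaturalLanguage
#endif

/// Hypothetical shape of the provider seam; the real protocol may differ.
protocol EmbeddingProvider {
    /// Identifies the model so stored vectors can be invalidated when it changes.
    var modelID: String { get }
    var dimension: Int { get }
    /// Returns nil when the text cannot be embedded.
    func embed(_ text: String) -> [Float]?
}

#if canImport(NaturalLanguage)
@available(macOS 11.0, iOS 14.0, tvOS 14.0, watchOS 7.0, *)
struct NLSentenceProvider: EmbeddingProvider {
    let modelID = "nl-sentence-en"  // illustrative identifier
    private let embedding: NLEmbedding

    /// Fails when the OS has no English sentence-embedding assets.
    init?() {
        guard let loaded = NLEmbedding.sentenceEmbedding(for: .english) else { return nil }
        embedding = loaded
    }

    var dimension: Int { embedding.dimension }

    func embed(_ text: String) -> [Float]? {
        // NLEmbedding returns [Double]; narrow to Float for compact storage.
        embedding.vector(for: text)?.map { Float($0) }
    }
}
#endif

/// Picks the built-in backend when present; nil means "stay lexical-only".
func makeDefaultProvider() -> EmbeddingProvider? {
    #if canImport(NaturalLanguage)
    if #available(macOS 11.0, iOS 14.0, tvOS 14.0, watchOS 7.0, *) {
        if let provider = NLSentenceProvider() { return provider }
    }
    #endif
    return nil
}
```

Under this shape, a future CoreML-backed provider would only need to conform to the same protocol and report a different `modelID` and `dimension`.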

App-size / RAM proof on a real signed build is Justin's to confirm, but by construction this adds no bundled assets — the only on-disk growth is the vector index (Float32 per utterance/entry) in the existing SQLite cache file, not the app bundle.
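
To put a rough number on the index growth, here is a sketch of packing a vector into a BLOB, assuming each 512-dim vector is stored as uncompressed Float32 in native byte order. The `encodeBlob` / `decodeBlob` names are hypothetical; the actual VectorMath blob helpers may use a different layout.

```swift
import Foundation

/// Packs a vector as raw Float32 bytes (native byte order), suitable for a BLOB column.
func encodeBlob(_ vector: [Float]) -> Data {
    vector.withUnsafeBufferPointer { Data(buffer: $0) }
}

/// Inverse of encodeBlob; returns nil if the byte count is not a multiple of 4.
func decodeBlob(_ data: Data) -> [Float]? {
    let width = MemoryLayout<Float>.stride
    guard data.count % width == 0 else { return nil }
    var out = [Float](repeating: 0, count: data.count / width)
    _ = out.withUnsafeMutableBufferPointer { data.copyBytes(to: $0) }
    return out
}

let sampleVector = [Float](repeating: 0.5, count: 512)
let sampleBlob = encodeBlob(sampleVector)
print(sampleBlob.count)  // 2048 bytes per 512-dim row
```

Under that assumption, 100,000 embedded rows would carry about 204.8 MB (roughly 195 MiB) of raw vector payload, before any SQLite overhead.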

How it works

- EmbeddingProvider.swift: protocol, NLEmbeddingProvider, SearchMode, VectorMath (normalize / cosine / blob).
- EmbeddingStore.swift: vector store on its own SQLite connection to the same mcp_index.sqlite. Writes only additive tables (embedding_meta, utterance_vectors, dictation_entry_vectors), reads the lexical tables. Embeds new/changed rows lazily after each reconcile, drops orphaned vectors, re-embeds everything on a model-id/dimension change, and runs a streaming cosine scan honoring the same speaker/date filters as FTS.
- SemanticSearchFusion.swift: Reciprocal Rank Fusion (k=60) merging FTS and semantic lists. Hybrid is a strict superset of FTS recall: exact hits are preserved and only reordered; paraphrase-only matches are appended.
- TranscriptIndex.swift: additive vector tables + lexical / semantic / hybrid routing on searchUtterances / searchDictationEntries / searchContext. The lexical path is byte-for-byte unchanged (mode defaults to .lexical at the index layer).
- search / search_context MCP tools: new optional mode arg, default hybrid.
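
The fusion step can be sketched as textbook Reciprocal Rank Fusion with k=60. This is a generic illustration, not the code in SemanticSearchFusion.swift: the function name, the use of `Int64` row IDs, and the tie-break rule are assumptions, and each input list is assumed to contain unique IDs.

```swift
/// Reciprocal Rank Fusion: each list contributes 1 / (k + rank) per item,
/// with rank starting at 1. Items missing from a list contribute nothing for it.
func reciprocalRankFusion(_ rankings: [[Int64]], k: Double = 60) -> [Int64] {
    var scores: [Int64: Double] = [:]
    for ranking in rankings {
        for (index, id) in ranking.enumerated() {
            scores[id, default: 0] += 1.0 / (k + Double(index + 1))
        }
    }
    // Sort by fused score; break ties by id so the order is deterministic.
    return scores.sorted { a, b in
        a.value != b.value ? a.value > b.value : a.key < b.key
    }.map { $0.key }
}

let ftsHits: [Int64] = [10, 11, 12]       // lexical ranking
let semanticHits: [Int64] = [12, 10, 20]  // semantic ranking
let fused = reciprocalRankFusion([ftsHits, semanticHits])
print(fused)  // [10, 12, 11, 20]
```

Every FTS hit survives into the fused list, which is the superset property described above; IDs found by both lists rise to the top, and ID 20, found only semantically, is added to the results.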

Graceful degradation: if the embedding backend is unavailable (e.g. missing OS language assets in a headless image), the store is never created and every mode runs lexical-only — no errors.
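
The fallback rule can be sketched as a small mode resolver. `parseMode` and `resolveMode` are hypothetical helpers, and treating an unrecognized `mode` string as hybrid is an assumption of this sketch; the PR only states that the default is hybrid and that every mode runs lexical-only when no store exists.

```swift
enum SearchMode: String {
    case lexical, semantic, hybrid
}

/// Parses the optional `mode` tool argument. A missing value means hybrid;
/// mapping an unrecognized value to hybrid as well is an assumption here.
func parseMode(_ raw: String?) -> SearchMode {
    guard let raw = raw, let mode = SearchMode(rawValue: raw.lowercased()) else {
        return .hybrid
    }
    return mode
}

/// Resolves the mode that actually runs. Without a vector store
/// (embedding backend unavailable), every request falls back to lexical.
func resolveMode(_ requested: SearchMode, hasVectorStore: Bool) -> SearchMode {
    hasVectorStore ? requested : .lexical
}

print(resolveMode(parseMode(nil), hasVectorStore: false))  // lexical
```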

Local proof (this Mac)

Real NLEmbedding works here: 512-dim, and the headline example ranks correctly — cosine("pricing pushback", "they balked at the cost") = 0.387 vs unrelated = 0.307. Note NLEmbedding's similarity floor is high (~0.30 even for unrelated short sentences), so pure semantic mode is best-effort; hybrid (default) is rank-based and stays robust because exact FTS hits anchor precision. Similarity threshold set to 0.30 to trim the obvious tail. A bundled MiniLM via the provider seam is the quality upgrade path.
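
For illustration, a sketch of a thresholded cosine scan over unit-normalized vectors, using the 0.30 cutoff mentioned above. The helper names and the in-memory candidate array are assumptions; the real store streams rows from SQLite and applies the speaker/date filters.

```swift
/// Scales a vector to unit length; a zero vector is returned unchanged.
func normalize(_ v: [Float]) -> [Float] {
    let norm = v.reduce(Float(0)) { $0 + $1 * $1 }.squareRoot()
    return norm > 0 ? v.map { $0 / norm } : v
}

/// Dot product; equals cosine similarity when both inputs are unit length.
func dot(_ a: [Float], _ b: [Float]) -> Float {
    precondition(a.count == b.count, "dimension mismatch")
    return zip(a, b).reduce(Float(0)) { $0 + $1.0 * $1.1 }
}

/// Scores every candidate against the query and keeps those at or above the threshold.
func semanticHits(
    query: [Float],
    candidates: [(id: Int64, vector: [Float])],
    threshold: Float = 0.30
) -> [(id: Int64, score: Float)] {
    let q = normalize(query)
    return candidates
        .map { (id: $0.id, score: dot(q, normalize($0.vector))) }
        .filter { $0.score >= threshold }
        .sorted { $0.score > $1.score }
}

let hits = semanticHits(
    query: [1, 0],
    candidates: [(id: 1, vector: [2, 0]), (id: 2, vector: [3, 4]), (id: 3, vector: [0, 5])]
)
print(hits.map { $0.id })  // [1, 2]; id 3 scores 0 and is trimmed
```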

Tests

SemanticSearchTests.swift — a deterministic stub provider (concept-axis embeddings) drives: paraphrase-that-lexical-misses, hybrid superset, semantic date-filter, unrelated-concept rejection, dictation semantic, provider-unavailable fallback, no-provider disabled, and model-change re-embed. Plus VectorMath round-trip/normalize/dot and RRF fusion unit tests. Full MCP suite green: 90 tests, 0 failures. (Tests use the stub so they don't depend on OS NLEmbedding assets in CI.)
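
One way a deterministic concept-axis stub could work, sketched under assumptions: each axis is a hand-picked keyword group, and a text lights up an axis when it contains any keyword from that group. The type name, keyword lists, and nil-on-no-match behavior are hypothetical and not taken from SemanticSearchTests.swift.

```swift
/// Hypothetical test stub: one embedding dimension per hand-picked concept.
struct ConceptAxisStub {
    let axes: [[String]] = [
        ["pricing", "price", "cost"],      // axis 0: money
        ["schedule", "deadline", "late"],  // axis 1: timing
    ]

    var dimension: Int { axes.count }

    /// Returns a 0/1 vector per concept, or nil when no concept matches.
    func embed(_ text: String) -> [Float]? {
        let words = Set(
            text.lowercased()
                .split(whereSeparator: { !$0.isLetter })
                .map { String($0) }
        )
        let vector = axes.map { axis -> Float in
            axis.contains(where: { words.contains($0) }) ? 1 : 0
        }
        return vector.contains(where: { $0 != 0 }) ? vector : nil
    }
}

/// Plain cosine similarity; returns 0 when either vector has zero length.
func cosine(_ a: [Float], _ b: [Float]) -> Float {
    let dotProduct = zip(a, b).reduce(Float(0)) { $0 + $1.0 * $1.1 }
    let normA = a.reduce(Float(0)) { $0 + $1 * $1 }.squareRoot()
    let normB = b.reduce(Float(0)) { $0 + $1 * $1 }.squareRoot()
    return normA > 0 && normB > 0 ? dotProduct / (normA * normB) : 0
}

let stub = ConceptAxisStub()
let queryVec = stub.embed("pricing pushback")!
let paraphraseVec = stub.embed("they balked at the cost")!
let unrelatedVec = stub.embed("the schedule slipped")!
print(cosine(queryVec, paraphraseVec), cosine(queryVec, unrelatedVec))  // 1.0 0.0
```

The two money sentences share no words, so a lexical query misses, yet they land on the same axis; that is the paraphrase case a stub like this can exercise without OS embedding assets.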

Conflict / coordination

Touches the same area as the in-flight Moat #1 work (#1323, summary fields into search). Checked the diff: #1323 does not touch TranscriptIndex.swift (my core change). The only overlap is ToolHandlers.swift, and the edits are in different regions (they refactor recap/recent_context; I add mode to search/search_context). Additive, resolvable on rebase — merge-room can sequence these.

Verification run

Per .agents/test-matrix.yml (touched Tools/TranscriptedMCP/**):

swift build ✅
swift test --package-path Tools/TranscriptedMCP ✅ (90 tests)

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Add on-device semantic search to the read-only MCP server so paraphrase queries hit (e.g. "pricing pushback" finds "they balked at the cost"), complementing, not replacing, the existing SQLite FTS5 lexical path.

Model choice: Apple's NaturalLanguage NLEmbedding.sentenceEmbedding. It is built into macOS, so there is no bundled model and no download; app size impact is negligible and RAM use is modest. It builds in this standalone SwiftPM package with no dependency on the app's MLX/FluidAudio infra. The backend sits behind an EmbeddingProvider protocol so a bundled CoreML model can replace it later without touching the store or search path.

- EmbeddingProvider.swift: protocol, NLEmbeddingProvider, SearchMode, VectorMath
- EmbeddingStore.swift: vector store on its own SQLite connection; embeds rows lazily after each reconcile, re-embeds on model-id/dimension change, runs a streaming cosine scan with the same speaker/date filters as FTS
- SemanticSearchFusion.swift: reciprocal-rank fusion for hybrid search
- TranscriptIndex: additive vector tables + lexical/semantic/hybrid routing; the lexical path is unchanged
- search / search_context expose `mode` (default hybrid)

All modes degrade gracefully: if the embedding backend is unavailable the store is never created and search runs lexical-only. Changes are additive (new tables + new search path), keeping the lexical write path untouched.

Tests: deterministic stub provider drives the paraphrase, hybrid-superset, date-filter, dictation, fallback, and model-change cases; plus vector-math and RRF fusion unit tests. Full MCP suite green (90 tests).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

r3dbars and others added 2 commits June 25, 2026 06:09

docs: Transcripted next-work shortlist

7ace253

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

r3dbars wants to merge 2 commits into main from claude/zealous-cori-f04800
