feat(mcp): local semantic + hybrid search over saved transcripts#1326
Draft
r3dbars wants to merge 2 commits into
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add on-device semantic search to the read-only MCP server so paraphrase queries hit (e.g. "pricing pushback" finds "they balked at the cost"), complementing, not replacing, the existing SQLite FTS5 lexical path.

Model choice: Apple's NaturalLanguage NLEmbedding.sentenceEmbedding. It is built into macOS, so there is no bundled model and no download (negligible incremental app size and modest RAM), and it builds in this standalone SwiftPM package with no dependency on the app's MLX/FluidAudio infra. The backend sits behind an EmbeddingProvider protocol so a bundled CoreML model can replace it later without touching the store or search path.

- EmbeddingProvider.swift: protocol, NLEmbeddingProvider, SearchMode, VectorMath
- EmbeddingStore.swift: vector store on its own SQLite connection; embeds rows lazily after each reconcile, re-embeds on model-id/dimension change, runs a streaming cosine scan with the same speaker/date filters as FTS
- SemanticSearchFusion.swift: reciprocal-rank fusion for hybrid search
- TranscriptIndex: additive vector tables + lexical/semantic/hybrid routing; the lexical path is unchanged
- search / search_context expose `mode` (default hybrid)

All modes degrade gracefully: if the embedding backend is unavailable the store is never created and search runs lexical-only. Changes are additive (new tables + new search path), keeping the lexical write path untouched.

Tests: deterministic stub provider drives the paraphrase, hybrid-superset, date-filter, dictation, fallback, and model-change cases; plus vector-math and RRF fusion unit tests. Full MCP suite green (90 tests).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
What
Adds local, on-device semantic search to the read-only MCP server so paraphrase queries hit, e.g. `search("pricing pushback")` now finds "they balked at the cost". Today search is SQLite FTS5 lexical-only (`TranscriptIndex.swift`). This complements FTS; it does not replace it.

This is NEXT_WORK.md item #6 (L). The one thing cloud RAG still beat us on.
Model choice — lightest viable, zero bundle cost
The search/index lives in the standalone `Tools/TranscriptedMCP` SwiftPM package, which intentionally has no dependency on the app's MLX/FluidAudio model infra and builds via plain `swift build`. So I picked the lightest viable on-device option:

Apple `NaturalLanguage`: `NLEmbedding.sentenceEmbedding(for: .english)` (512-dim).

- `import NaturalLanguage`: no prebuilt frameworks, no CoreML packaging, no tokenizer to ship, no CI surface.
- Sits behind an `EmbeddingProvider` protocol, so a bundled CoreML model (e.g. quantized MiniLM) can replace it later for higher quality without touching the store or search path.

How it works
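The provider seam from the model-choice section might look roughly like the sketch below. The protocol members, the `NLSentenceProvider` type, the `"nl-sentence-en"` id, and the `makeDefaultProvider` / `effectiveMode` helpers are illustrative assumptions, not the names used in this PR; only the `NLEmbedding` calls are Apple API.

```swift
import Foundation
#if canImport(NaturalLanguage)
import NaturalLanguage
#endif

/// Hypothetical shape of the provider seam; the PR's real protocol may differ.
protocol EmbeddingProvider {
    var modelID: String { get }   // a change here would trigger a re-embed
    var dimension: Int { get }
    /// Returns nil when the text cannot be embedded.
    func embed(_ text: String) -> [Float]?
}

#if canImport(NaturalLanguage)
@available(macOS 11.0, iOS 14.0, tvOS 14.0, watchOS 7.0, *)
struct NLSentenceProvider: EmbeddingProvider {
    private let embedding: NLEmbedding
    let modelID = "nl-sentence-en"            // illustrative id, not from the PR
    var dimension: Int { embedding.dimension }

    /// Fails when the OS has no English sentence-embedding assets.
    init?() {
        guard let e = NLEmbedding.sentenceEmbedding(for: .english) else { return nil }
        embedding = e
    }

    func embed(_ text: String) -> [Float]? {
        // NLEmbedding returns [Double]?; narrow to Float for compact storage.
        embedding.vector(for: text)?.map { Float($0) }
    }
}
#endif

/// Picks the semantic backend, or nil so callers stay lexical-only.
func makeDefaultProvider() -> EmbeddingProvider? {
    #if canImport(NaturalLanguage)
    if #available(macOS 11.0, iOS 14.0, tvOS 14.0, watchOS 7.0, *) {
        return NLSentenceProvider()
    }
    #endif
    return nil
}

enum SearchMode: String { case lexical, semantic, hybrid }

/// Without a provider, every requested mode resolves to lexical.
func effectiveMode(_ requested: SearchMode, provider: EmbeddingProvider?) -> SearchMode {
    provider == nil ? .lexical : requested
}
```

Because callers only see the optional `EmbeddingProvider`, a CoreML-backed type could be returned from the factory later without changes on the caller side.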
- `EmbeddingProvider.swift`: protocol, `NLEmbeddingProvider`, `SearchMode`, `VectorMath` (normalize / cosine / blob).
- `EmbeddingStore.swift`: vector store on its own SQLite connection to the same `mcp_index.sqlite`. Writes only additive tables (`embedding_meta`, `utterance_vectors`, `dictation_entry_vectors`), reads the lexical tables. Embeds new/changed rows lazily after each reconcile, drops orphaned vectors, re-embeds everything on a model-id/dimension change, and runs a streaming cosine scan honoring the same speaker/date filters as FTS.
- `SemanticSearchFusion.swift`: Reciprocal Rank Fusion (k=60) merging FTS and semantic lists. Hybrid is a strict superset of FTS recall: exact hits are preserved and only reordered; paraphrase-only matches are appended.
- `TranscriptIndex.swift`: additive vector tables + `lexical`/`semantic`/`hybrid` routing on `searchUtterances` / `searchDictationEntries` / `searchContext`. The lexical path is byte-for-byte unchanged (mode defaults to `.lexical` at the index layer).
- `search` / `search_context` MCP tools: new optional `mode` arg, default `hybrid`.

Graceful degradation: if the embedding backend is unavailable (e.g. missing OS language assets in a headless image), the store is never created and every mode runs lexical-only, with no errors.
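For readers unfamiliar with Reciprocal Rank Fusion, here is a minimal sketch of the standard formula with k=60, where each ID scores the sum of 1 / (k + rank) over the lists it appears in. The function name, the string IDs, and the first-seen tie-break are assumptions for illustration; the PR's `SemanticSearchFusion.swift` may order ties and paraphrase-only matches differently.

```swift
/// Standard RRF over ranked ID lists (best first, IDs unique within a list).
/// score(id) = sum over lists of 1 / (k + rank), with rank starting at 1.
func reciprocalRankFusion(_ lists: [[String]], k: Double = 60) -> [String] {
    var scores: [String: Double] = [:]
    var firstSeen: [String: Int] = [:]   // insertion order, used only to break ties
    for list in lists {
        for (index, id) in list.enumerated() {
            scores[id, default: 0] += 1.0 / (k + Double(index + 1))
            if firstSeen[id] == nil { firstSeen[id] = firstSeen.count }
        }
    }
    return scores.keys.sorted { a, b in
        let (sa, sb) = (scores[a]!, scores[b]!)
        return sa != sb ? sa > sb : firstSeen[a]! < firstSeen[b]!
    }
}

// Hypothetical utterance IDs: "u2" is found by both paths, "u3" only semantically.
let lexicalHits = ["u1", "u2"]
let semanticHits = ["u2", "u3"]
let fused = reciprocalRankFusion([lexicalHits, semanticHits])
print(fused)   // ["u2", "u1", "u3"]
```

Every ID from either input list appears in the output, which is what makes the fused result a superset of the lexical hits; items found by both paths accumulate score and rise to the top.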
Local proof (this Mac)
Real `NLEmbedding` works here: 512-dim, and the headline example ranks correctly, with cosine("pricing pushback", "they balked at the cost") = 0.387 vs unrelated = 0.307. Note NLEmbedding's similarity floor is high (~0.30 even for unrelated short sentences), so pure `semantic` mode is best-effort; `hybrid` (default) is rank-based and stays robust because exact FTS hits anchor precision. Similarity threshold set to 0.30 to trim the obvious tail. A bundled MiniLM via the provider seam is the quality upgrade path.

Tests
`SemanticSearchTests.swift` uses a deterministic stub provider (concept-axis embeddings) to drive these cases: paraphrase-that-lexical-misses, hybrid superset, semantic date-filter, unrelated-concept rejection, dictation semantic, provider-unavailable fallback, no-provider disabled, and model-change re-embed. Plus `VectorMath` round-trip/normalize/dot and RRF fusion unit tests. Full MCP suite green: 90 tests, 0 failures. (Tests use the stub so they don't depend on OS NLEmbedding assets in CI.)

Conflict / coordination
Touches the same area as the in-flight Moat #1 work (#1323, summary fields into search). Checked the diff: #1323 does not touch `TranscriptIndex.swift` (my core change). The only overlap is `ToolHandlers.swift`, and the edits are in different regions (they refactor recap/recent_context; I add `mode` to search/search_context). Additive, resolvable on rebase; merge-room can sequence these.

Verification run
Per `.agents/test-matrix.yml` (touched `Tools/TranscriptedMCP/**`):

- `swift build` ✅
- `swift test --package-path Tools/TranscriptedMCP` ✅ (90 tests)

🤖 Generated with Claude Code