feat(meeting): always-on cheap field extraction at save time #1327
Closed
r3dbars wants to merge 2 commits into
The heavy local summarizer (Gemma/Apple beta) only writes Decisions / Action Items / Open Questions when a user opts into the ~12GB path, so the "ask my history" index only ever covered a subset of meetings. Add an always-on, dependency-free heuristic extraction that runs on every meeting save (live + imported) and writes a baseline version of the same logical fields into a parallel `auto_summary_*` frontmatter namespace.

- MeetingQuickSummaryExtractor: precision-leaning rule pass over the styled transcript (curated cue lists, sentence-level matching, dedupe, per-section caps) producing a LocalMeetingSummarySections.
- MeetingQuickSummaryWriter: idempotent, frontmatter-only writer. Deliberately does NOT reuse the heavy `local_summary_*` keys (those gate Home UI and the "Run AI summary" affordance), so the heavy summarizer stays the high-quality path and overwrites nothing.
- Wired into MeetingSessionController.restyleSavedTranscriptInBackground, after restyle, on the chained background task.

Coverage of the index fields is now 100% of meetings instead of the beta opt-in subset. Value format mirrors `local_summary_*` exactly so the Moat #1 indexer can fall back to `auto_summary_*` (heavy taking precedence).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
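For readers skimming the commit message, the rule pass it describes (cue lists, sentence-level matching, dedupe, per-section caps) could look roughly like the sketch below. This is an illustration only, not the code in this PR: `Bucket`, `classify`, `extract`, the cue phrases, and the default cap of 5 are all hypothetical, and speaker-turn parsing and small-talk filtering are left out.

```swift
import Foundation

// Hypothetical bucket type; the PR's real output type is LocalMeetingSummarySections.
enum Bucket {
    case decisions, actionItems, openQuestions
}

/// Assigns a sentence to at most one bucket, checking cue lists in priority order.
func classify(_ sentence: String) -> Bucket? {
    // Illustrative cue phrases only; the curated lists in the PR are not shown in this thread.
    let orderedCues: [(bucket: Bucket, phrases: [String])] = [
        (bucket: .decisions, phrases: ["we decided", "let's go with", "agreed to"]),
        (bucket: .actionItems, phrases: ["i'll ", "i will ", "can you "]),
    ]
    let lowered = sentence.lowercased()
    for entry in orderedCues {
        if entry.phrases.contains(where: { lowered.contains($0) }) {
            return entry.bucket
        }
    }
    // Assumption: a trailing question mark is the only open-question cue.
    return lowered.hasSuffix("?") ? .openQuestions : nil
}

/// Sentence-level pass with case-insensitive dedupe and a per-section cap.
func extract(sentences: [String], cap: Int = 5) -> [Bucket: [String]] {
    var result: [Bucket: [String]] = [:]
    var seen = Set<String>()
    for sentence in sentences {
        let trimmed = sentence.trimmingCharacters(in: .whitespacesAndNewlines)
        guard let bucket = classify(trimmed) else { continue }
        // Skip sentences already recorded, ignoring case.
        guard seen.insert(trimmed.lowercased()).inserted else { continue }
        // Enforce the per-section cap.
        guard result[bucket, default: []].count < cap else { continue }
        result[bucket, default: []].append(trimmed)
    }
    return result
}
```

Checking the cue lists in a fixed order is what keeps each sentence in at most one bucket: a sentence such as "Can you send the deck?" lands in action items and never reaches the question check.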
Closing as superseded by #1331, which merged the combined ask-meeting-history path covering this draft's scope.
What & why
From `docs/NEXT_WORK.md` #5. The heavy local meeting summarizer (`LocalMeetingSummarizer.swift`, the ~12GB Gemma / Apple beta path) is the only thing that writes the structured fields Decisions / Action Items / Open Questions. It only runs when a user opts into the beta, so the "ask my history" moat (the search index) only ever covered a subset of meetings.

This adds an always-on, cheap, dependency-free extraction at save time so the index can cover 100% of meetings (live + imported), not just the beta opt-in. The heavy summarizer stays the high-quality path; this is the baseline that guarantees coverage.
Method
A precision-leaning heuristic pass (no model, no 12GB dependency):
- `MeetingQuickSummaryExtractor` parses the already-styled transcript into speaker turns, splits into sentences, and classifies each into at most one bucket (Decisions › Action Items › Open Questions) via curated cue lists, with small-talk filtering, dedupe, and per-section caps. Action items are owner-prefixed from the speaker label. Produces a `LocalMeetingSummarySections` (same shape the heavy path uses).
- `MeetingQuickSummaryWriter` writes those into the saved transcript's YAML frontmatter under a parallel `auto_summary_*` namespace. It's idempotent (skips once `auto_summary_version` is set), skips non-meetings, and is frontmatter-only (transcript body preserved verbatim).

Why a separate `auto_summary_*` namespace (not `local_summary_*`)
The heavy summarizer's `local_summary_*` keys also drive Home UI gating: the inline summary card and the "Run AI summary" affordance key off `local_summary_version`. Reusing them would make the cheap heuristic masquerade as the AI summary and hide the upgrade affordance. The parallel namespace means: the index gets the same logical fields for every meeting, the heavy summarizer overwrites nothing, and existing Home/Settings UI is untouched.

Wiring
Runs in `MeetingSessionController.restyleSavedTranscriptInBackground`, immediately after `restyleTranscript` on the same chained detached task, so it's off the main actor, never races the next restyle, and the body is already in canonical styled form. Covers both live captures and imported audio (both flow through `taskManager.lastSavedTranscriptURL`).

Sequence vs Moat #1 (index summary fields)
Moat #1 is not merged in this branch. This PR builds to the same field shape so #1's indexer can consume it directly. Field keys + value format (bullet lines flattened with `" | "`, mirroring `local_summary_*`): `auto_summary_version`, `auto_summary_generated_at`, `auto_summary_method` (`heuristic-v1`), `auto_summary_participants`, `auto_summary`, `auto_summary_decisions`, `auto_summary_action_items`, `auto_summary_open_questions`, `auto_summary_risks_or_followups`, `auto_summary_accuracy_notes`.

Index precedence the #1 reader should adopt: prefer `local_summary_*` when present (heavy, higher quality), else fall back to `auto_summary_*`. Until #1 (or #4's cross-meeting tools) reads these keys, the fields are written but not yet queried; that's the intended ordering, not a regression.

Tests
`Tests/MeetingQuickSummaryExtractorTests.swift` (registered in `Tests/FastTests.manifest` + `run-tests.sh` `APP_SOURCES`): owner-prefixed action items, decision/action de-conflation, substantive vs rhetorical questions, tiny/empty input, inline transcript form, the frontmatter writer (keys present, body preserved, no `local_summary_*` leakage), and idempotency / non-meeting skip.

- `bash build.sh --no-open` ✅
- `bash run-tests.sh` ✅ (10177/10177)
- `bash run-integration-smoke.sh` ✅

🤖 Generated with Claude Code
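As a reference for whoever wires up the #1 reader, the value format and index precedence described above might be consumed along these lines. This is a minimal sketch under stated assumptions, not code from this branch: `flatten` and `indexedField` are hypothetical names, the frontmatter is modelled as an already-parsed `[String: String]`, and "present" is taken to mean the `local_summary_*` key exists for that field.

```swift
import Foundation

/// Flattens bullet lines into a single frontmatter value, joined with " | ".
func flatten(_ bullets: [String]) -> String {
    bullets
        .map { $0.trimmingCharacters(in: .whitespaces) }
        .filter { !$0.isEmpty }
        .joined(separator: " | ")
}

/// Hypothetical reader-side lookup: prefer the heavy `local_summary_*` value
/// when the key is present, otherwise fall back to `auto_summary_*`.
func indexedField(_ suffix: String, in frontmatter: [String: String]) -> [String] {
    let raw = frontmatter["local_summary_\(suffix)"]
        ?? frontmatter["auto_summary_\(suffix)"]
        ?? ""
    // An absent or empty value yields no bullets.
    guard !raw.isEmpty else { return [] }
    return raw.components(separatedBy: " | ")
}
```

For example, `indexedField("decisions", in: frontmatter)` returns the heavy summarizer's bullets when `local_summary_decisions` exists and the heuristic ones otherwise. Note that splitting on `" | "` only round-trips when no bullet itself contains that separator.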