perf: agent-runtime hot-path hardening (behavior-preserving) #62
Merged
Eliminate confirmed super-linear and serialization hot paths in the agent
runtime, measured with Benchee first (bench/marshalling_bench.exs,
bench/memory_search_bench.exs). All changes preserve observable behavior;
full suite green (1896 passed, stable across seeds).
Core loop
- Tool-schema conversion memoized once per run via a runtime-only
Context.tool_schema_cache keyed on {provider, tool-name set} (Anthropic
conversion was ~12.6us + ~90KB allocated every iteration on a static set).
Stripped from Context.serialize/1.
- Phase-0 bench showed per-iteration message marshalling is <4us even at 100
msgs, so the message-marshalling cache was gated out (not worth the
stale-history risk).
Persistence / OTP
- agent_server: context save on the response + clear_history paths is now
fire-and-forget via Task.Supervisor (off the GenServer mailbox); the
explicit :save_context call stays synchronous.
- teams/rate_limiter: running window counters replace per-acquire O(n) folds;
rate_limited?/2 is now O(1).
- teams/shared_state: ETS row-per-entry (discoveries + claims) replaces a
single growing list term copied on every insert; claim conflict checks use
a file-scoped matchspec; release/expire are O(1) deletes.
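The running-window-counter idea from the rate-limiter bullet can be sketched as follows. This is a minimal illustration, not the code in teams/rate_limiter: the module name, struct fields, and function names are hypothetical. Events sit oldest-first in a queue next to a running total, so the limit check is a single comparison and pruning subtracts each expired entry instead of re-folding the window.

```elixir
defmodule WindowCounterSketch do
  # Hypothetical state: events oldest-first in a queue plus a running total,
  # so the limit check never folds over the window.
  defstruct events: :queue.new(), total: 0, window_ms: 60_000, limit: 100

  # O(1): compares the running total against the limit.
  def limited?(%__MODULE__{total: total, limit: limit}, cost), do: total + cost > limit

  # Records an acquire: prunes expired events first, then bumps the running total.
  def record(%__MODULE__{} = state, now_ms, cost) do
    state = prune(state, now_ms)
    %{state | events: :queue.in({now_ms, cost}, state.events), total: state.total + cost}
  end

  # Drops expired events from the front, subtracting each one from the total.
  def prune(%__MODULE__{window_ms: window_ms} = state, now_ms) do
    case :queue.peek(state.events) do
      {:value, {ts, cost}} when now_ms - ts >= window_ms ->
        prune(%{state | events: :queue.drop(state.events), total: state.total - cost}, now_ms)

      _empty_or_still_fresh ->
        state
    end
  end
end
```

Each event is enqueued once and dropped once, so pruning is amortized O(1) per acquire under these assumptions.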
Context updates
- context_update_to_map + ContextUpdate.apply: O(n^2) `++ [item]` appends
replaced with prepend + per-key reverse (reversal-tracking preserves exact
order, including set-list-then-append).
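A minimal sketch of prepend + per-key reverse with reversal tracking, assuming hypothetical `{:set, key, value}` and `{:append, key, item}` operations (the real ContextUpdate shapes may differ). A set of "currently reversed" keys records which lists need flipping back at the end; a set-list-then-append sequence flips the freshly set list once before prepending. The `reference/2` function is the quadratic `++ [item]` behavior it must agree with.

```elixir
defmodule PrependReverseSketch do
  # ops: [{:set, key, value} | {:append, key, item}] (hypothetical op shapes)
  def apply_ops(map, ops) do
    {acc, reversed} = Enum.reduce(ops, {map, MapSet.new()}, &step/2)
    # Restore insertion order once per key that is still held reversed.
    Enum.reduce(reversed, acc, fn key, m -> Map.update!(m, key, &Enum.reverse/1) end)
  end

  # A set stores the value as given, so the key is no longer tracked as reversed.
  defp step({:set, key, value}, {acc, reversed}) do
    {Map.put(acc, key, value), MapSet.delete(reversed, key)}
  end

  # The first append to a forward list flips it once; later appends are O(1) prepends.
  defp step({:append, key, item}, {acc, reversed}) do
    current = Map.get(acc, key, [])
    base = if MapSet.member?(reversed, key), do: current, else: Enum.reverse(current)
    {Map.put(acc, key, [item | base]), MapSet.put(reversed, key)}
  end

  # Reference semantics: the quadratic `++ [item]` version.
  def reference(map, ops) do
    Enum.reduce(ops, map, fn
      {:set, key, value}, acc -> Map.put(acc, key, value)
      {:append, key, item}, acc -> Map.update(acc, key, [item], &(&1 ++ [item]))
    end)
  end
end
```

Comparing `apply_ops/2` against `reference/2` on the same operations is the same kind of differential check the new ordering tests describe.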
Memory / search
- memory/store/ets + knowledge_base + decisions stores: push scope/kb_id/type
filters into ETS via partial-map matchspecs instead of tab2list-copying the
whole table (scoped search ~2.7x faster at 10k entries).
- memory/search: single-pass filter + folded normalize_relevance.
- memory/store/sqlite: query L2 norm hoisted out of the cosine loop.
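The query-norm hoist can be illustrated with a small, self-contained ranking function. This is a sketch under assumptions: the module and function names are hypothetical and candidates are plain `{id, vector}` tuples rather than SQLite rows. The point is only that the query's L2 norm is computed once before the loop instead of once per candidate.

```elixir
defmodule CosineSketch do
  # Scores each {id, vector} candidate against the query. The query's L2 norm
  # is computed once, outside the per-candidate loop.
  def rank(query, candidates) do
    query_norm = l2_norm(query)

    candidates
    |> Enum.map(fn {id, vec} -> {id, cosine(query, query_norm, vec)} end)
    |> Enum.sort_by(fn {_id, score} -> score end, :desc)
  end

  # Only the candidate norm is computed per call; zero vectors score 0.0.
  defp cosine(query, query_norm, vec) do
    denom = query_norm * l2_norm(vec)
    if denom == 0.0, do: 0.0, else: dot(query, vec) / denom
  end

  defp dot(a, b) do
    a |> Enum.zip(b) |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)
  end

  defp l2_norm(v), do: :math.sqrt(Enum.reduce(v, 0.0, fn x, acc -> acc + x * x end))
end
```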
Tests: +context_update_test (ordering), slow-backend non-blocking +
async-save tests, rate-limiter window-invariant + prune, shared_state
concurrency, tool-cache golden master.
nyo16 added a commit that referenced this pull request on Jun 22, 2026
The audit content for releases 0.16.2-0.16.5 had piled up under [Unreleased] and was never versioned (releases were tagged without splitting the changelog). Split it into dated sections using the tag-snapshot delta as ground truth (the [Unreleased] content at each tag = that release's cumulative content):
- [0.16.2] 2026-05-16: provider marshalling, ETS lifecycle, OTP hygiene, telemetry (#58)
- [0.16.3] 2026-05-29: security pass: RCE gate, SSRF, sandbox + audit findings (#59)
- [0.16.4] 2026-06-05: security/OTP/test hardening (#60); summarized from the commit, since this release added nothing to the changelog at the time
- [0.16.5] 2026-06-12: InputGuard fail-closed, permissive execute gate, policy bypass (#61)
- [Unreleased] (-> 0.16.6): perf hot-path hardening (#62) + the docs overhaul (#63)
Bullets were moved verbatim, not rewritten; dates match the git tags exactly. Added compare-links for each new version.
Verified: no [Unreleased] bullet was dropped, mix docs is 0 warnings, mix format clean.
nyo16 added a commit that referenced this pull request on Jun 22, 2026
…#64)
* docs: follow-ups - CHANGELOG entries, silent mix docs, 4 new examples
Post-overhaul cleanup (follow-up to #63).
CHANGELOG:
- Add the missing 0.12.13 (custom: provider, #34) and 0.12.12 (memory/context/AgentServer fixes) entries, both of which had release tags but no changelog sections, with matching compare-links.
mix docs, now 0 warnings:
- Qualify `Agent.new/2` -> `Nous.Agent.new/2` (resolves & links) in CHANGELOG.
- De-link historical references to since-private/hidden APIs in CHANGELOG and AGENTS.md (run_with_tools/6, Gemini.parse_content/1, Model.default_receive_timeout/1, Provider.request/3, Plugins.Memory.init/2, Nous.Application, Persistence.ETS.TableOwner).
New examples (all run or degrade gracefully without a provider):
- examples/llm_oneshot.exs: bare Nous.LLM API (generate_text/3, /3 bang, stream_text/3).
- examples/knowledge_base.exs: KB store add/search + KB agent plugin.
- examples/advanced/summarization.exs: auto-compaction via the Summarization plugin.
- examples/advanced/web_tools.exs: WebFetch/SearchScrape/Tavily/Brave tools.
- Indexed all four in examples/README.md.
Docs:
- Clarify the three LiveView examples' distinct roles (patterns reference vs complete chat app vs multi-agent dashboard).
Verified: mix format, compile --warnings-as-errors, docs (0 warnings), credo --strict (clean); new examples run green offline.
Note: v0.16.2-v0.16.5 are tagged but lack CHANGELOG entries and mix.exs @version (0.16.1) is behind the latest tag; deferred to a separate version/release reconciliation pass rather than guessing release notes.
* chore: bump @version to 0.16.6
mix.exs @version had been left at 0.16.1 across the v0.16.2-v0.16.5 release tags (each tagged without bumping it). master is 3 commits past v0.16.5, so the next release is 0.16.6.
Note: the matching CHANGELOG entries for 0.16.2-0.16.5 are still outstanding (separate version/release reconciliation).
* docs(CHANGELOG): split accumulated [Unreleased] into 0.16.2-0.16.5
The audit content for releases 0.16.2-0.16.5 had piled up under [Unreleased] and was never versioned (releases were tagged without splitting the changelog). Split it into dated sections using the tag-snapshot delta as ground truth (the [Unreleased] content at each tag = that release's cumulative content):
- [0.16.2] 2026-05-16: provider marshalling, ETS lifecycle, OTP hygiene, telemetry (#58)
- [0.16.3] 2026-05-29: security pass: RCE gate, SSRF, sandbox + audit findings (#59)
- [0.16.4] 2026-06-05: security/OTP/test hardening (#60); summarized from the commit, since this release added nothing to the changelog at the time
- [0.16.5] 2026-06-12: InputGuard fail-closed, permissive execute gate, policy bypass (#61)
- [Unreleased] (-> 0.16.6): perf hot-path hardening (#62) + the docs overhaul (#63)
Bullets were moved verbatim, not rewritten; dates match the git tags exactly. Added compare-links for each new version.
Verified: no [Unreleased] bullet was dropped, mix docs is 0 warnings, mix format clean.
Summary
Eliminates confirmed super-linear and serialization hot paths in the agent
runtime — core loop, persistence, team coordination, and memory search.
Every change is behavior-preserving (provider payloads, search results,
and claim semantics are byte/semantically identical) and was measured first
with Benchee before refactoring.
Full suite green: 1896 passed, 0 failed, stable across 3 seeds.
mix compile --warnings-as-errors, mix format --check-formatted, and mix credo (372 files) all clean.
Changes
Core loop: tool-schema conversion cache
- Tool-schema conversion is memoized once per run via a runtime-only Context.tool_schema_cache keyed on {provider, tool-name set}, re-converting only when the set changes. Anthropic conversion alone was ~12.6µs + ~90KB allocated every iteration on an otherwise-static tool set.
- The cache is stripped from Context.serialize/1 (derived, never persisted).
- The message-marshalling cache was gated out: the phase-0 bench showed per-iteration message marshalling is <4µs even at 100 messages (≪1ms), not worth the stale-history-to-LLM risk.
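As a rough sketch of the memoization described above (not the project's actual Context module): the struct, `schemas_for/3`, and the `convert_fun` argument below are hypothetical stand-ins. The cache key pairs the provider with the sorted, de-duplicated tool names, a miss or an absent cache falls back to converting, and the cache field is dropped before serialization.

```elixir
defmodule ToolSchemaCacheSketch do
  # Hypothetical stand-in for the runtime context; only the cache field matters here.
  defstruct tools: [], tool_schema_cache: nil

  # Returns {schemas, ctx}. Reconverts only when the {provider, tool-name set}
  # key changes; an absent cache (nil) simply takes the conversion path.
  def schemas_for(%__MODULE__{} = ctx, provider, convert_fun) do
    names = ctx.tools |> Enum.map(& &1.name) |> Enum.uniq() |> Enum.sort()
    key = {provider, names}

    case ctx.tool_schema_cache do
      {^key, schemas} ->
        {schemas, ctx}

      _miss_or_absent ->
        schemas = Enum.map(ctx.tools, convert_fun)
        {schemas, %{ctx | tool_schema_cache: {key, schemas}}}
    end
  end

  # The cache is derived data, so it is removed before the context is persisted.
  def serialize(%__MODULE__{} = ctx) do
    ctx |> Map.from_struct() |> Map.delete(:tool_schema_cache)
  end
end
```

Because the key covers only tool names, a cache like this assumes a tool's definition does not change under the same name during a run.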
Persistence / OTP
- agent_server: context save on the response and clear_history paths is now fire-and-forget via Task.Supervisor (off the GenServer mailbox) so a slow backend can't block the loop; the explicit :save_context call stays synchronous. Ordering tradeoff documented in code.
- teams/rate_limiter: running window counters replace the two per-acquire O(n) folds; rate_limited?/2 is now O(1).
- teams/shared_state: ETS row-per-entry for discoveries and claims replaces a single growing list term that was copied on every :ets.insert; claim conflict checks use a file-scoped matchspec; release/expire are O(1) key deletes; dedup is automatic via the {:claim, agent, file} key.
Context updates
- context_update_to_map + public ContextUpdate.apply/2: the O(n²) existing ++ [item] appends are replaced with prepend + per-key reverse, with reversal-tracking that preserves exact insertion order (including the set [list] → append case).
Memory / search
- memory/store/ets, knowledge_base/store/ets, decisions/store/ets: push scope/kb_id/type+status filters into ETS via partial-map matchspecs instead of tab2list-copying the whole table then filtering in Elixir.
- memory/search: single-pass filter + folded normalize_relevance (single reduce for the max, no intermediate list).
- memory/store/sqlite: query L2 norm hoisted out of the cosine loop (computed once, not per candidate).
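The filter pushdown in the first bullet can be sketched against a toy table. The table name, row shape `{id, entry_map}`, and field values are assumptions for illustration only. A map in an ETS match head matches any stored map that contains the listed keys with the listed values, so `:ets.select/2` copies out only the matching rows, whereas the `tab2list` baseline copies every row before filtering.

```elixir
table = :ets.new(:memory_sketch, [:set, :private])

:ets.insert(table, [
  {1, %{scope: :team_a, type: :note, text: "alpha"}},
  {2, %{scope: :team_b, type: :note, text: "beta"}},
  {3, %{scope: :team_a, type: :fact, text: "gamma"}}
])

# Baseline: copy every row out of ETS, then filter in Elixir.
copied =
  table
  |> :ets.tab2list()
  |> Enum.filter(fn {_id, entry} -> entry.scope == :team_a end)

# Pushed down: the partial map in the match head is checked inside ETS,
# so only rows whose map has scope: :team_a are copied to the caller.
match_spec = [{{:_, %{scope: :team_a}}, [], [:"$_"]}]
selected = :ets.select(table, match_spec)

# Both approaches return the same rows; a :set table gives no ordering
# guarantee, so compare after sorting.
same_rows? = Enum.sort(copied) == Enum.sort(selected)
```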
Benchmarks (dev env, M4 Max)
New dev-only scripts: bench/marshalling_bench.exs, bench/memory_search_bench.exs.
Tests
- test/nous/tool/context_update_test.exs: 9 ordering cases + differential check vs a reference ++ implementation.
- agent_server: slow-backend non-blocking test + async-save awaits.
- rate_limiter: window-counter invariant + prune-subtract tests.
- shared_state: 2 concurrency tests (race for one region → exactly one wins; non-overlapping concurrent claims all succeed).
- agent_runner: tool-cache golden-master (cached iteration payload == uncached iteration payload).
Deferred (documented)
- needs an Exqlite schema migration + backfill; exqlite is an optional dep and is the stub in this build, so it's unverifiable here. The query-norm hoist (schema-free) is in.
- SharedState :public/lock-free direct reads: get_discoveries/get_claims have no non-test callers and aren't hot; direct reads would need an API change. Kept the (pid) API + :private table.
- index: not exercised by the baseline; graph sizes bounded.
Risk / rollback
Each phase is independent. The tool-schema cache is the only stateful addition
and falls back to per-iteration conversion if the field is absent. No public
API or schema changes.