perf: agent-runtime hot-path hardening (behavior-preserving) by nyo16 · Pull Request #62 · nyo16/nous

nyo16 · 2026-06-19T20:13:06Z

Summary

Eliminates confirmed super-linear and serialization hot paths in the agent
runtime: core loop, persistence, team coordination, and memory search.
Every change is behavior-preserving (provider payloads, search results,
and claim semantics are byte-for-byte or semantically identical) and was
benchmarked with Benchee before refactoring.

Full suite green: 1896 passed, 0 failed, stable across 3 seeds.
mix compile --warnings-as-errors, mix format --check-formatted, and
mix credo (372 files) all clean.

Changes

Core loop — tool-schema conversion cache

- Tool→provider schema conversion is now memoized once per run via a
  runtime-only Context.tool_schema_cache keyed on {provider, tool-name set};
  conversion is repeated only when the set changes. Anthropic conversion
  alone cost ~12.6µs and ~90KB of allocation on every iteration, even with
  an otherwise-static tool set. The cache is stripped in
  Context.serialize/1 (derived, never persisted).
- Gated out: the incremental message-marshalling cache. The Phase-0 bench
  showed message marshalling takes <4µs even at 100 messages (≪1ms), which
  is not worth the risk of sending stale history to the LLM.
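A minimal sketch of the memoization idea, not the code from this PR. The module name, the `convert/2` stand-in, and the tool map shape are hypothetical; the only part taken from the description is the cache key of provider plus the set of tool names (represented here as a sorted list of names).

```elixir
defmodule ToolSchemaCacheSketch do
  # Hypothetical stand-in for the expensive tool -> provider schema conversion.
  def convert(:anthropic, tools) do
    Enum.map(tools, fn %{name: name, description: desc} ->
      %{"name" => name, "description" => desc, "input_schema" => %{"type" => "object"}}
    end)
  end

  # Returns {schemas, cache}. The cache is a plain map keyed on
  # {provider, sorted tool names}; sorting gives a canonical form of the set.
  def schemas_for(cache, provider, tools) do
    key = {provider, tools |> Enum.map(& &1.name) |> Enum.sort()}

    case cache do
      %{^key => schemas} ->
        # Same provider and same tool-name set: reuse the converted schemas.
        {schemas, cache}

      _ ->
        # First call, or the tool-name set changed: convert and remember.
        schemas = convert(provider, tools)
        {schemas, Map.put(cache, key, schemas)}
    end
  end
end
```

Because the key contains only names, a sketch like this would keep serving the old schema if a tool's definition changed while its name stayed the same; that is acceptable only when tool definitions are static within a run.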

Persistence / OTP

- agent_server: context save on the response and clear_history paths is
  now fire-and-forget via Task.Supervisor (off the GenServer mailbox) so a
  slow backend can't block the loop; the explicit :save_context call stays
  synchronous. Ordering tradeoff documented in code.
- teams/rate_limiter: running window counters replace the two per-acquire
  O(n) folds, so rate_limited?/2 is now O(1).
- teams/shared_state: ETS row-per-entry for discoveries and claims
  replaces a single growing list term that was copied on every :ets.insert;
  claim conflict checks use a file-scoped matchspec; release/expire are O(1)
  key deletes; dedup is automatic via the {:claim, agent, file} key.
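The row-per-entry claim layout can be sketched roughly as follows. This is an illustration under assumptions, not the shared_state implementation: the module and function names are hypothetical, claims cover whole files rather than regions, and the stored value is just a timestamp. Only the `{:claim, agent, file}` key shape and the file-scoped match spec come from the description.

```elixir
defmodule ClaimTableSketch do
  # One ETS row per claim, keyed on {:claim, agent, file}.
  def new, do: :ets.new(:claims_sketch, [:set, :private])

  def claim(table, agent, file) do
    # File-scoped match spec: select agents other than `agent` that hold `file`.
    # {:const, agent} keeps the agent term literal inside the guard.
    spec = [{{{:claim, :"$1", file}, :_}, [{:"=/=", :"$1", {:const, agent}}], [:"$1"]}]

    case :ets.select(table, spec) do
      [] ->
        # Re-claiming the same file overwrites the same key, so there are no duplicates.
        :ets.insert(table, {{:claim, agent, file}, System.monotonic_time()})
        :ok

      [holder | _] ->
        {:error, {:claimed_by, holder}}
    end
  end

  # Release is a single key delete.
  def release(table, agent, file) do
    :ets.delete(table, {:claim, agent, file})
    :ok
  end

  # Lists all claims as {agent, file} tuples.
  def claims(table) do
    :ets.select(table, [{{{:claim, :"$1", :"$2"}, :_}, [], [{{:"$1", :"$2"}}]}])
  end
end
```

The check-then-insert in `claim/3` is only race-free when a single process owns the table and serializes calls, which is why the sketch uses a `:private` table.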

Context updates

context_update_to_map and the public ContextUpdate.apply/2: the existing
O(n²) `++ [item]` appends are replaced with prepend plus a per-key reverse.
Reversal tracking preserves the exact insertion order, including the case
where a key is set to a list and then appended to (set [list] → append).
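The prepend-then-reverse technique can be illustrated with a small sketch. The operation shapes (`{:set, key, value}` and `{:append, key, item}`), the module name, and the tagging scheme are assumptions made for the example; the real ContextUpdate API may differ. A reference implementation using `++` is included so the two can be compared, in the spirit of the differential test listed under Tests.

```elixir
defmodule ContextUpdateSketch do
  # Fast path: each key holds either {:plain, value} or {:rev, reversed_list}.
  def apply_ops(initial, ops) when is_map(initial) do
    tagged = Map.new(initial, fn {k, v} -> {k, {:plain, v}} end)
    reduced = Enum.reduce(ops, tagged, &step/2)

    # Reverse each prepended list exactly once at the end.
    Map.new(reduced, fn
      {k, {:rev, items}} -> {k, Enum.reverse(items)}
      {k, {:plain, v}} -> {k, v}
    end)
  end

  defp step({:set, key, value}, acc), do: Map.put(acc, key, {:plain, value})

  defp step({:append, key, item}, acc) do
    Map.update(acc, key, {:rev, [item]}, fn
      # Already reversed: prepending is O(1).
      {:rev, items} -> {:rev, [item | items]}
      # A list that was set directly: reverse it once, then track it as reversed.
      # Appending to a non-list value is not handled in this sketch and raises.
      {:plain, list} when is_list(list) -> {:rev, [item | Enum.reverse(list)]}
    end)
  end

  # Reference: the quadratic `++ [item]` version, used only for comparison.
  def reference(initial, ops) do
    Enum.reduce(ops, initial, fn
      {:set, key, value}, acc -> Map.put(acc, key, value)
      {:append, key, item}, acc -> Map.update(acc, key, [item], &(&1 ++ [item]))
    end)
  end
end
```

Tagging each value records whether the stored list is currently reversed, so a list that arrives through a set is flipped exactly once before further prepends.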

Memory / search

- memory/store/ets, knowledge_base/store/ets, decisions/store/ets:
  push scope / kb_id / type+status filters into ETS via partial-map
  matchspecs instead of copying the whole table with tab2list and then
  filtering in Elixir.
- memory/search: single-pass filter, and normalize_relevance now uses a
  single reduce for the max (no intermediate list).
- memory/store/sqlite: query L2 norm hoisted out of the cosine loop
  (computed once, not per candidate).
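As a rough illustration of pushing a filter into ETS, the sketch below compares a tab2list-then-filter lookup with a match spec whose head contains a partial map pattern. The table layout (`{id, entry_map}`), the module name, and the function names are assumptions for the example, not the store modules from this PR.

```elixir
defmodule ScopedSearchSketch do
  def new, do: :ets.new(:memories_sketch, [:set, :public])

  # Rows are stored as {id, entry}, where entry is a map with a :scope key.
  def put(table, id, entry) when is_map(entry), do: :ets.insert(table, {id, entry})

  # Before: copy every row out of the table, then filter in Elixir.
  def by_scope_naive(table, scope) do
    table
    |> :ets.tab2list()
    |> Enum.filter(fn {_id, entry} -> Map.get(entry, :scope) == scope end)
    |> Enum.map(fn {_id, entry} -> entry end)
  end

  # After: a partial-map pattern in the match head. Only rows whose entry map
  # contains scope => `scope` are copied out; other keys in the map are ignored.
  def by_scope(table, scope) do
    table
    |> :ets.select([{{:_, %{scope: scope}}, [], [:"$_"]}])
    |> Enum.map(fn {_id, entry} -> entry end)
  end
end
```

Both functions should return the same entries (in unspecified order); the difference is how much data leaves the table before filtering.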

Benchmarks (dev env, M4 Max)

| Path | Before | After |
| --- | --- | --- |
| ETS scoped search @ 10k entries | 17.5 ms | 6.47 ms (2.7×) |
| ETS scoped search @ 1k entries | 1.22 ms | 0.62 ms (~2×) |
| Tool conversion / iteration (Anthropic, ×20) | every iteration | once per run |
| Cosine 2k×768 (qnorm hoisted + stored norms, isolated) | 69 ms | 32 ms (2.16×) |

New dev-only scripts: bench/marshalling_bench.exs, bench/memory_search_bench.exs.
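For readers unfamiliar with the query-norm hoist behind the cosine measurement, here is a small sketch assuming embeddings are plain lists of floats. The module and function names are hypothetical, and this version still computes each candidate's norm per call, which is the part the deferred stored-norm column would remove.

```elixir
defmodule CosineSketch do
  # L2 norm of a vector given as a list of floats.
  def norm(v), do: :math.sqrt(Enum.reduce(v, 0.0, fn x, acc -> acc + x * x end))

  def dot(a, b), do: Enum.zip_reduce(a, b, 0.0, fn x, y, acc -> acc + x * y end)

  # Ranks {id, vector} candidates by cosine similarity to `query`, best first.
  def rank(query, candidates) do
    # Hoisted: the query norm is computed once, not once per candidate.
    qnorm = norm(query)

    candidates
    |> Enum.map(fn {id, vec} -> {id, similarity(query, qnorm, vec)} end)
    |> Enum.sort_by(fn {_id, score} -> score end, :desc)
  end

  defp similarity(query, qnorm, vec) do
    denom = qnorm * norm(vec)
    # Treat zero-length vectors as having no similarity.
    if denom == 0.0, do: 0.0, else: dot(query, vec) / denom
  end
end
```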

Tests

- New test/nous/tool/context_update_test.exs: 9 ordering cases plus a
  differential check against a reference `++` implementation.
- agent_server: slow-backend non-blocking test + async-save awaits.
- rate_limiter: window-counter invariant + prune-subtract tests.
- shared_state: 2 concurrency tests (race for one region → exactly one wins;
  non-overlapping concurrent claims all succeed).
- agent_runner: tool-cache golden-master (cached iteration payload ==
  uncached iteration payload).
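The window-counter invariant and prune-subtract behavior named in the rate_limiter tests can be pictured with a sketch like the one below. It is a hypothetical model, not the teams/rate_limiter module: the struct fields, `record/3`, `prune/2`, `limited?/3`, and `consistent?/1` are all invented for illustration. The idea is that running totals are adjusted on insert and on prune, so the limit check never folds over the event list.

```elixir
defmodule WindowCounterSketch do
  defstruct events: :queue.new(), requests: 0, tokens: 0, window_ms: 60_000

  def new(window_ms \\ 60_000), do: %__MODULE__{window_ms: window_ms}

  # Records one request at `now_ms`, updating the running totals in O(1).
  def record(%__MODULE__{} = s, now_ms, tokens) do
    s = prune(s, now_ms)

    %{s |
      events: :queue.in({now_ms, tokens}, s.events),
      requests: s.requests + 1,
      tokens: s.tokens + tokens}
  end

  # Drops events that fell out of the window, subtracting them from the totals.
  def prune(%__MODULE__{} = s, now_ms) do
    cutoff = now_ms - s.window_ms

    case :queue.out(s.events) do
      {{:value, {ts, tokens}}, rest} when ts <= cutoff ->
        prune(%{s | events: rest, requests: s.requests - 1, tokens: s.tokens - tokens}, now_ms)

      _ ->
        s
    end
  end

  # O(1) limit check against the running totals.
  def limited?(%__MODULE__{} = s, max_requests, max_tokens) do
    s.requests >= max_requests or s.tokens >= max_tokens
  end

  # Invariant: the running totals equal a full fold over the stored events.
  def consistent?(%__MODULE__{} = s) do
    {reqs, toks} =
      s.events
      |> :queue.to_list()
      |> Enum.reduce({0, 0}, fn {_ts, t}, {r, k} -> {r + 1, k + t} end)

    reqs == s.requests and toks == s.tokens
  end
end
```

A test in this style would call `consistent?/1` after every `record/3` and `prune/2` to confirm that the counters never drift from the underlying events.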

Deferred (documented)

- SQLite stored-norm column (precompute candidate L2 norms at insert):
  needs an Exqlite schema migration + backfill; exqlite is an optional dep and
  is the stub in this build, so it's unverifiable here. The query-norm hoist
  (schema-free) is in.
- SharedState :public/lock-free direct reads: get_discoveries/get_claims
  have no non-test callers and aren't hot; direct reads would need
  an API change. Kept the (pid) API + :private table.
- Decisions edge-direction reads and KB/Decisions backlink/outlink secondary
  index: not exercised by the baseline; graph sizes bounded.

Risk / rollback

Each phase is independent. The tool-schema cache is the only stateful addition
and falls back to per-iteration conversion if the field is absent. No public
API or schema changes.

nyo16 merged commit 35603b1 into master Jun 19, 2026
6 checks passed

nyo16 deleted the perf/agent-runtime-hardening branch June 19, 2026 20:18

nyo16 mentioned this pull request Jun 22, 2026

docs: follow-ups (CHANGELOG entries, silent mix docs, 4 new examples) #64

Merged
