cache binaries that have nothing to instrument#325
Open
gecube wants to merge 1 commit into
Open
Conversation
AcquireGlobalUprobe only cached successful attachments. When attach() returned no links (e.g. a stripped Go binary such as kubectl, which has no symbol table), nothing was remembered, so the binary was re-opened and its symbols re-parsed for every new process that ran it. On nodes that run such binaries frequently this is pure overhead. Cache the negative result in a separate map keyed by dev+inode. The binary's size and mtime are stored in the entry and re-checked on every lookup, so a binary replaced in place or whose inode is recycled is re-evaluated rather than served a stale verdict. Entries expire after a TTL; expired ones are swept on insertion, but at most once per TTL, so a burst of new binaries does not turn every insertion into a full-map scan under the lock. Live (refcounted) attachments are unchanged.
d1bc865 to
3bbab9f
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Background
This follows up on #316 / #324. #324 stops the agent from logging an error for stripped Go binaries (
failed to get symbol crypto/tls.(*Conn).Write: no symbols found). This PR addresses the underlying inefficiency that produced that log line on every process in the first place.Problem
AcquireGlobalUprobecaches only successful attachments:When
attach()finds nothing to instrument — most commonly a stripped Go binary such askubectl(built with-ldflags "-s -w", so no symbol table) — the negative result is discarded. The per-processgoTlsUprobesCheckedguard only suppresses re-checks within a single process, so every new PID of the same binary re-opens the ELF and re-parses its symbol tables only to fail again. On nodes where such binaries run frequently (CI runners, operators invokingkubectl/helmin loops, cron) this is steady, pointless CPU and I/O — and, before #324, steady log noise.(The work that precedes
AcquireGlobalUprobe— reading build info,readlink, the version check — still runs per PID; it is needed to classify the process as a Go app and to resolve the cache key. The symbol-table parse insideattach()is the dominant cost and is what this cache eliminates.)Why not hash the binary
The obvious correctness worry with caching a negative result is inode reuse: a file is deleted and its inode number is later reused by a different, possibly instrumentable, binary. A content hash would detect that, but hashing a 50–100 MB binary on every process spawn means reading the whole file from disk — far more expensive than the symbol parse we are trying to avoid. It trades a cheap operation for an expensive one.
(For the existing positive cache this race is effectively impossible: the entry is refcounted and held by live processes, so the inode cannot be freed and reused while the entry exists. It only matters once we start caching negatives, which nothing holds.)
Approach
Negatives live in their own map, separate from the refcounted positive cache, keyed by
dev+inode. Thesizeandmtimefrom thestatthe function already performs are stored in the entry and re-checked on every lookup: a binary replaced in place, or one whose inode was recycled, almost always differs in size or mtime, so it is re-evaluated rather than served a stale verdict. This costs no extra I/O.Keying by
dev+inode(rather than folding size+mtime into the key) keeps the map bounded by the number of distinct inodes seen: a path that is repeatedly overwritten in place reuses a single entry instead of accumulating one per spawn.size+mtimeis a heuristic, not tamper-proof. A replacement that preserves both within the filesystem's timestamp granularity (cp -p,tar -xp, a snapshot restore) is served the stale verdict until the entry's TTL elapses; the TTL bounds that window. Deliberate tampering that preserves size and mtime is out of scope.Expired entries are swept when a new negative entry is added, but at most once per TTL, so a burst of distinct stripped binaries does not turn every insertion into a full-map scan under the lock. Stale entries are additionally dropped lazily on lookup. Live (refcounted) attachments are never touched by the sweep.
Genuine attachments are unchanged.
Tests
ebpftracer/tracer_cache_test.gocovers: an empty result is parsed once and then cached; a replaced binary (changed size/mtime) is re-evaluated; repeated in-place overwrite keeps a single cache entry; an expired negative entry is re-evaluated and replaced rather than duplicated; expired entries are swept while live entries survive; and concurrent acquisition of the same binary is race-clean. Run withgo test -race ./ebpftracer/.