Skip to content

Linux support for file-mapped binary analysis (PE/Mach-O)#14

Merged
praydog merged 17 commits into
mainfrom
linux-support
Jun 19, 2026
Merged

Linux support for file-mapped binary analysis (PE/Mach-O)#14
praydog merged 17 commits into
mainfrom
linux-support

Conversation

@praydog

@praydog praydog commented Jun 19, 2026

Copy link
Copy Markdown
Collaborator

Summary

Builds and runs kananlib on Linux (clang) for its file-mapped binary analysis path. You can map a PE32/PE32+ image or x86-64 Mach-O from disk with utility::map_view_of_file and run the scanners, disassembly, emulation, import/export parsing, and patch / pointer-hook / vtable-hook primitives against the mapped image. For x64 PE images, .pdata function discovery and MSVC RTTI vtable lookup are also parsed directly from the mapped image. The CLI works unchanged. Windows behavior and ABI are untouched.

This PR is intentionally scoped to binary analysis on file-mapped images. It does not add ELF support and does not implement Linux live-process introspection (see Non-goals).

What works on Linux

  • map_view_of_file / map_view_of_pe lay PE32/PE32+ sections out at their RVAs via mmap and apply base relocations where representable, so the mapped image behaves like an analysis module (RVA == offset-from-base, absolute pointers valid when the relocation type can represent the mapped host address).
  • Pattern/string scanning, displacement/relative references, disassembly (bddisasm), exhaustive decode / CFG walks, import/export tables, emulation (bdshemu), and patch / pointer-hook / vtable-hook on mapped memory.
  • x64 PE-specific analysis: AVX2 scan fast path, .pdata function-start and bounds discovery, and MSVC RTTI vtable lookup.
  • PE32 static-analysis mapping: PE32 optional headers are parsed correctly and a committed 32-bit sample image maps and scans on Linux. PE32 HIGHLOW relocations fail loudly if the mapped host address cannot be represented in 32 bits.

Windows-only (degrade to no-ops on Linux)

Features that introspect the live process or OS: loader-module walk (get_executable, GetModuleHandle), PDB/DIA symbol resolution, the registry, keyboard state, and thread suspension.

How

  • include/compat/: a Win32 shim used only on non-Windows (PE/COFF structs, memory/module/registry/toolhelp decls, ppl/intrin/vcruntime/rttidata).
  • src/compat/WinCompat.cpp: mmap/mprotect-backed VirtualAlloc/VirtualProtect/VirtualQuery, /proc/self/maps-backed IsBad*Ptr, and a SIGSEGV/SIGBUS + sigsetjmp emulation of the scanners structured-exception guards.
  • utility/Seh.hpp: portable KANANLIB_SEH_TRY / KANANLIB_AV_TRY guards (native SEH / C++ try-catch on Windows, signal-based on Linux).
  • Module.cpp: Linux map_view_of_pe maps PE32/PE32+ images, copies sections to their RVAs, and applies base relocations where safe; PEB/loader paths are guarded to Windows.
  • RTTI.cpp: reads the MSVC TypeDescriptor directly for mapped x64 PE images (the host std::type_info ABI differs), range-gated and fault-safe.
  • PDB.cpp is compiled Windows-only (DIA SDK / urlmon).
  • cmake.toml: conditional shlwapi/threads and a compat include path on non-Windows (cmkr-regenerated).

Non-goals (future, separate branches)

  • ELF support (different format, .eh_frame / Itanium RTTI; would reuse this mapping/scan infrastructure).
  • Linux-native live-process primitives: module enumeration (dl_iterate_phdr / /proc/*/maps) and real thread suspension (tgkill / ptrace).
  • Portable PDB/DIA replacement.

Testing

  • Linux (clang 18): 15 test suites, 0 failures. Coverage includes the committed MSVC-built PE32+ sample DLL, a committed contrived PE32 sample DLL (test/samples/kananlib_sample32.dll), Linux audit regressions, and the new cross-platform kananlib-scan-stress-test.
  • AVX2 proof: kananlib-scan-stress-test runs a 1GB scan_relative_reference stress case with 512 random alignments, then explicitly checks scalar and byte-by-byte agreement. Test-only instrumentation records the public dispatcher path; observed Linux and Windows runs both printed Dispatcher implementation: AVX2.
  • Windows (MSVC, VS 2022, DIA enabled): full suite still green, 24 suites / 0 failures. The scan stress moved out of the Windows-only kananlib-test target into the shared target, so Windows still runs it and Linux now runs it too.

Known limitations on Linux

  • RTTI matches by the decorated type name (.?AV...); there is no MSVC name undecorator, so undecorated-name lookups are unavailable.
  • PE32 HIGHLOW relocations can only be applied when the mapped base fits in a 32-bit VA. If mmap returns an address above 4GB for a PE32 image that needs HIGHLOW, mapping fails loudly rather than wrapping pointers.
  • The mapper applies base relocations (Windows SEC_IMAGE does not), so mapped bytes at relocated sites differ from the on-disk file.
  • The SIGSEGV/SIGBUS handler is process-global; embedding in a Linux app that installs its own handler should chain it (relevant only to the future live-process work).

See docs/LINUX_PORT_FOLLOWUPS.md for intentionally out-of-scope follow-ups tracked for future branches.

praydog added 2 commits June 18, 2026 21:56
Port kananlib to build and run on Linux with clang. The core capability --
map a PE (or x86-64 Mach-O) from disk with map_view_of_file and run the scan,
disassembly, RTTI, and module utilities on the mapped image -- works on Linux.
Live-process / OS-introspection features (loader walk, PDB/DIA, registry,
input, thread suspension) degrade to no-ops.

- include/compat/: Win32 shim used only on non-Windows -- PE/COFF structs,
  memory/module/registry/toolhelp decls, ppl/intrin/vcruntime/rttidata.
- src/compat/WinCompat.cpp: mmap/mprotect-backed VirtualAlloc/Protect/Query,
  /proc/self/maps-backed IsBad*Ptr, SIGSEGV/SIGBUS-based SEH emulation.
- utility/Seh.hpp: portable KANANLIB_SEH_TRY / KANANLIB_AV_TRY guards
  (native SEH / C++ try-catch on Windows, signal-based on Linux).
- Module.cpp: Linux map_view_of_pe lays sections at their RVAs via mmap and
  applies base relocations; PEB/loader paths guarded to Windows.
- RTTI.cpp: read MSVC TypeDescriptor directly (host std::type_info ABI differs);
  range-gated fault-safe vtable scan.
- Scan/Pattern/String/Config/etc.: ppl shim, intrinsics, SEH/AV guards,
  UTF-8<->wchar narrow/widen; enable .pdata function analysis on x86-64.
- PDB.cpp compiled Windows-only (DIA SDK / urlmon).
- cmake.toml: conditional shlwapi/threads + compat include path (cmkr regen).
- tests: cross-platform build (symlink, conditional targets), Windows-only
  fixtures gated, new TestLinuxPE.cpp maps a committed MSVC sample DLL and
  asserts exports/imports/sections/string/displacement/.pdata/RTTI.
  13 suites / 176 tests pass on Linux.
Adds build-linux (kananlib / kananlib-nolog / kananlib-cli) and test-linux
(configure, build, ctest) jobs on ubuntu-latest using clang + ninja, mirroring
the existing Windows lanes so the Linux support added in this branch is covered
by CI. cmkr skips generation under CI, so the committed CMakeLists.txt is used.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds Linux (clang) support for kananlib’s file-mapped binary analysis path by introducing a non-Windows Win32 compatibility surface (types + VM APIs + SEH-like fault guards), adjusting platform-conditional behavior, and adding a Linux-focused PE mapping/scanning test suite.

Changes:

  • Add include/compat/* shims + src/compat/WinCompat.cpp to emulate the Win32 subset kananlib relies on (Virtual* APIs, pointer probes, AV/SEH guards).
  • Implement non-Windows PE “SEC_IMAGE-like” mapping in map_view_of_pe (section layout at RVAs + base relocations + per-section protections).
  • Update tests/build/CI for Linux: add TestLinuxPE.cpp, gate Windows-only tests, and add Linux GitHub Actions jobs.

Reviewed changes

Copilot reviewed 60 out of 61 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
test/TestVtableHook.cpp Normalize <windows.h> include casing
test/TestUtils.cpp Normalize <windows.h> include casing
test/TestScanResolve.cpp Normalize <windows.h> include casing
test/TestScanPath.cpp Normalize <windows.h> include casing
test/TestScanCoverage.cpp Normalize <windows.h> include casing
test/TestScanBugRegression.cpp Gate Win32-host-only assertions on Linux
test/TestScanBounds.cpp Normalize <windows.h> include casing
test/TestScan.cpp Normalize <windows.h> include casing
test/TestRTTICoverage.cpp Normalize <windows.h> include casing
test/TestRegistry.cpp Normalize <windows.h> include casing
test/TestPDBRTTI.cpp Normalize <windows.h> include casing
test/TestPDBEdge.cpp Normalize <windows.h> include casing
test/TestModuleMachO.cpp Normalize <windows.h> include casing
test/TestModuleCoverage.cpp Normalize <windows.h> include casing
test/TestModule.cpp Normalize <windows.h> include casing
test/TestMiscCoverage.cpp Skip live-PE-host emulation tests on Linux
test/TestLinuxPE.cpp New Linux PE mapping + scan/RTTI coverage test
test/TestInput.cpp Normalize <windows.h> include casing
test/TestHelpers.hpp Add TEST_SKIP macro for platform gating
test/TestEmulation.cpp Skip HMODULE-based test when no Win32 host
test/TestBugRegression.cpp Normalize <windows.h> include casing
test/TestBehavior.cpp Skip live-host/thread-suspension tests on Linux
test/TestAdvanced.cpp Normalize <windows.h> include casing
test/StressTest.cpp Normalize <windows.h> include casing
test/samples/kananlib_sample.cpp New MSVC-ABI sample source for committed DLL
test/CMakeLists.txt Split Windows-only vs non-Windows test targets
test/cmake.toml Add platform conditions + Linux PE test target
src/VtableHook.cpp Add missing <cstring> include
src/Thread.cpp Normalize <windows.h>/<tlhelp32.h> casing
src/String.cpp Add non-Windows UTF-8⇄wstring conversion path
src/Scan.cpp Replace SEH uses with portable KANANLIB_* guards; x64 macro broaden
src/RTTI.cpp Add non-Windows MSVC TypeDescriptor view + AV guards
src/PDB.cpp Make translation unit Windows-only (#if _WIN32)
src/Pattern.cpp Replace AV guards + fix isspace UB with unsigned char
src/Patch.cpp Normalize <windows.h> include casing
src/Module.cpp Implement non-Windows PE mapping/relocations + guard loader paths
src/Memory.cpp Normalize <windows.h> include casing
src/Config.cpp Use narrow paths on non-Windows for fstreams
src/compat/WinCompat.cpp New: mmap/mprotect-backed Virtual* + signal-based AV/SEH emulation
README.md Document Linux file-mapped analysis support + build steps
include/utility/VtableHook.hpp Normalize <windows.h> include casing
include/utility/String.hpp Add <cstdint> include for new conversion code
include/utility/Seh.hpp New portable SEH/AV guard macros + Linux implementation
include/utility/Scan.hpp Normalize include casing + include portable SEH header
include/utility/RTTI.hpp Normalize <windows.h> include casing
include/utility/Patch.hpp Normalize <windows.h> include casing
include/utility/Module.hpp Normalize <windows.h> include casing
include/compat/winternl.h New: non-Windows PEB/LDR type shims
include/compat/winnt_compat.h New: non-Windows WinNT/PE structures + constants
include/compat/windows.h New: non-Windows windows.h shim declarations
include/compat/vcruntime.h New: stub for non-Windows builds
include/compat/tlhelp32.h New: stub toolhelp API declarations
include/compat/shlwapi.h New: PathRemoveFileSpecW declaration
include/compat/seh_compat.h New: redirect to utility/Seh.hpp
include/compat/rttidata.h New: MSVC RTTI on-disk structure definitions
include/compat/ppl.h New: serial parallel_for shim
include/compat/intrin.h New: MSVC intrinsics shim for Linux clang
CMakeLists.txt Add compat sources/headers + platform-dependent linking/includes
cmake.toml Expand sources + platform-conditional shlwapi/threads/dl
.github/workflows/dev-release.yml Add Linux build + test jobs

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/compat/WinCompat.cpp
Comment thread src/Module.cpp
Comment thread src/Module.cpp
Comment thread src/RTTI.cpp
Comment thread README.md Outdated
- WinCompat: use gettid() for GetCurrentThreadId instead of pthread_self()
  (kernel TID, avoids pthread_t->DWORD truncation and reliance on a transitive
  <pthread.h> include).
- Module.cpp (Linux PE mapper): bound the header copy by image_size and clamp
  the base-relocation directory extent to image_size, so a malformed PE with
  SizeOfHeaders or reloc Size larger than SizeOfImage cannot overrun the mmap.
- RTTI.hpp: document that the std::type_info* returned by get_type_info aliases
  the in-image MSVC TypeDescriptor off Windows (valid for identity comparison,
  not for std::type_info member calls).
- README: scope the support claim to Linux specifically (the compat layer reads
  /proc/self/maps), not "other POSIX systems".
@praydog

praydog commented Jun 19, 2026

Copy link
Copy Markdown
Collaborator Author

Thanks for the review. Addressed in a follow-up commit; point by point:

1. WinCompat.cpp -- pthread_self() / missing <pthread.h> (valid). Switched GetCurrentThreadId to gettid() (syscall(SYS_gettid)), the second option suggested. It returns the kernel TID (matching the Win32 GetCurrentThreadId notion of a small per-thread id), avoids the pthread_t->DWORD truncation, and drops the <pthread.h> dependency entirely. (It did compile before, via a transitive include, but relying on that is fragile -- agreed.)

2. Module.cpp -- header copy not bounded by image_size (valid). Clamped the header memcpy to min(headers_size, file_size, image_size). This maps untrusted binaries, so a malformed SizeOfHeaders > SizeOfImage overrunning the VirtualAlloc(image_size) region was a real memory-safety bug.

3. Module.cpp -- relocation extent not bounded by image_size (valid). Clamped reloc_end = min(VirtualAddress + Size, image_size). With the existing SizeOfBlock > (reloc_end - reloc_rva) break, the per-block entry array now stays inside the mapping even with a corrupt reloc directory size.

5. README -- "other POSIX systems" overclaims (valid). Scoped the wording to Linux specifically and called out the /proc/self/maps dependency. macOS/*BSD are not supported as-is.

4. RTTI.cpp -- get_type_info returns std::type_info* off Windows (partially). Real footgun, agreed, and I documented it on the API: off Windows the returned pointer aliases the in-image MSVC TypeDescriptor, valid for pointer-identity comparison but not for std::type_info member calls.

I did not make the std::type_info*-returning overloads Windows-only, for two reasons: (a) for_each_uncached (the core vtable scan that runs on Linux) calls get_type_info(const void*) internally, so gating it off would break Linux RTTI; and (b) the pointer is genuinely useful on Linux for identity -- derives_from(obj, std::type_info*) is pure pointer comparison and works. The unsafe operation is specifically ->name()/->raw_name() via the host vtable, which the new doc-comment warns against; the safe cross-platform entry points (find_vtable / find_vtables / derives_from(obj, type_name), matching the decorated .?AV... name) already cover Linux callers. A dedicated cross-platform RTTI return type would be a larger public-API change -- happy to do that as a follow-up if you prefer it over the documented contract.

Linux and Windows test suites both remain green.

@praydog

praydog commented Jun 19, 2026

Copy link
Copy Markdown
Collaborator Author

@copilot please re-review commit f2e6916 ("Address PR review (Copilot)"), which implements your five review comments, and confirm each fix is valid:

  1. GetCurrentThreadId now uses gettid() (syscall(SYS_gettid)) instead of pthread_self(), so there is no dependency on <pthread.h> and no pthread_t -> DWORD truncation.
  2. The non-Windows PE header copy is clamped to min(headers_size, file_size, image_size).
  3. Base-relocation parsing clamps reloc_end to image_size, so a corrupt reloc-directory size cannot read entries past the mapping.
  4. The off-Windows aliasing of get_type_info is now documented in RTTI.hpp.
  5. The README support claim is scoped to Linux and notes the /proc/self/maps dependency.

On point 4 I need a decision rather than an open-ended offer. Please pick one:

  • (a) Accept the documented contract: the std::type_info*-returning overloads stay on both platforms, with the doc-comment warning that off Windows the pointer aliases the in-image MSVC TypeDescriptor (valid for pointer-identity comparison via derives_from(obj, ti), not for std::type_info member calls); name queries go through find_vtable / find_vtables / derives_from(obj, type_name).
  • (b) Require a dedicated cross-platform RTTI return type and make the raw std::type_info* overloads Windows-only.

If you choose (b), say so and I will implement it in this PR. If (a), please confirm the documented contract is acceptable. Note that for_each_uncached (the Linux vtable scan) calls get_type_info(const void*) internally, so any gating must keep an internal accessor available.

Copilot AI commented Jun 19, 2026

Copy link
Copy Markdown

@copilot please re-review commit f2e6916 ("Address PR review (Copilot)"), which implements your five review comments, and confirm each fix is valid:

  1. GetCurrentThreadId now uses gettid() (`sysc...

Re-reviewed f2e6916 and all five fixes are valid:

  1. GetCurrentThreadId now uses syscall(SYS_gettid) in src/compat/WinCompat.cpp, removing pthread_self() usage and any <pthread.h> dependency/truncation risk.
  2. Non-Windows PE header copy is now bounded by min(headers_size, file_size, image_size) in src/Module.cpp.
  3. Relocation parsing now clamps reloc_end to image_size; with the existing block-size guard against (reloc_end - reloc_rva), block entry iteration stays within the mapped image.
  4. RTTI.hpp now documents the non-Windows aliasing contract for get_type_info and the safe usage boundaries.
  5. README.md now scopes support to Linux and explicitly notes the /proc/self/maps dependency.

Decision on point 4: (a) accept the documented contract. Keeping the std::type_info* overloads cross-platform is acceptable with the current warning and safe name-based alternatives.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 60 out of 61 changed files in this pull request and generated 4 comments.

Comment thread src/compat/WinCompat.cpp Outdated
Comment thread src/String.cpp
Comment thread test/TestLinuxPE.cpp
Comment thread test/TestLinuxPE.cpp
- WinCompat VirtualQuery: report a one-page free region instead of a
  near-SIZE_MAX span when an address falls in a gap past the last mapping, so
  callers advancing by BaseAddress + RegionSize cannot overflow/wrap.
- String narrow(): clamp invalid scalar values (surrogates, > U+10FFFF, and
  negative wchar_t which casts to a huge value) to U+FFFD so the UTF-8 output
  is never malformed.
- TestLinuxPE: assert get_module_size() has a value before dereferencing it.
@praydog

praydog commented Jun 19, 2026

Copy link
Copy Markdown
Collaborator Author

@copilot all four comments from the latest review are valid and are addressed in da2f9f3:

  1. VirtualQuery gap RegionSize — when an address falls in a gap past the last mapping, the shim now reports a one-page free region instead of next_start = (uintptr_t)-1, so callers advancing by BaseAddress + RegionSize cannot overflow/wrap. (In practice the existing region-walk callers already terminated, because the span landed exactly on UINTPTR_MAX, which is > end; the bounded value removes the latent hazard for any future caller.)
  2. narrow() UTF-8 validation — invalid scalar values (UTF-16 surrogates U+D800..U+DFFF, anything > U+10FFFF, and negative wchar_t which casts to a huge value) are now clamped to U+FFFD before encoding, so the output is never malformed UTF-8.
    3 & 4. TestLinuxPE optional dereference — both *get_module_size(...) sites now assert has_value() before dereferencing.

Linux test suite remains green (13/13). Please confirm.

Copilot AI commented Jun 19, 2026

Copy link
Copy Markdown

@copilot all four comments from the latest review are valid and are addressed in da2f9f3:

  1. VirtualQuery gap RegionSize — when an address falls in a gap past the last mapping, the shim now rep...

Confirmed on da2f9f3.

  1. VirtualQuery now uses a bounded one-page fallback when there is no later mapping, so BaseAddress + RegionSize cannot wrap.
  2. narrow() now replaces invalid scalar values with U+FFFD before UTF-8 encoding.
  3. and 4. Both get_module_size() call sites in TestLinuxPE.cpp now assert has_value() before dereferencing.

I also rebuilt and ran the Linux test workflow locally (cmake -S test -B test/build -G Ninja ... && ctest --test-dir test/build --output-on-failure): 13/13 tests passed. The current GitHub Actions run for da2f9f3 has no failed jobs so far.

praydog added 6 commits June 18, 2026 23:09
Prefer honest failure/warnings over papering over unimplemented behavior:

- VirtualAlloc no longer upgrades PAGE_NOACCESS (PROT_NONE) to PROT_READ|
  PROT_WRITE. The protection is honored faithfully, so a reservation reads as
  inaccessible via VirtualQuery/IsBadReadPtr (matching Windows); callers commit
  it later via VirtualProtect.
- win_to_posix_prot warns on an unmodeled page protection and denies
  (PROT_NONE) instead of guessing a permissive mapping.
- The Linux PE mapper warns when it encounters base-relocation types it does
  not apply (anything other than DIR64/HIGHLOW/ABSOLUTE), noting the image may
  be incompletely relocated, rather than silently skipping them.
- Document the compat-layer scope policy in compat/windows.h.
- Seh.hpp: document that non-Windows fault recovery uses siglongjmp and does
  NOT unwind / run C++ destructors (Windows /EHa does), so KANANLIB_*_TRY block
  bodies must hold only trivially destructible locals. (Named merge gate.)
- Module.cpp (Linux PE mapper): reject a DOS e_lfanew whose NT headers extend
  past SizeOfImage, so downstream code re-reading the NT headers from
  mapped_base + e_lfanew cannot read past the mapping (previously safe only by
  mmap zero-init).
- WinCompat VirtualFree: document that MEM_DECOMMIT and sub-range MEM_RELEASE
  are deliberately not modeled.
- test sample: note origin/license (written for this repo, covered by LICENSE).
Bug 1 (active): VirtualProtect(size=0) mprotected a full page instead of
being a no-op. Windows docs: 'If the size is 0, the function returns
non-zero and does not modify the page protection.' Added early return.

Bug 2 (active): map_view_of_pe had no bounds check on the PE section
table. A malformed PE with large SizeOfOptionalHeader or NumberOfSections
caused a heap buffer over-read (segfault). Added validation before the
section iteration loop.

Bug 4 (active): VirtualQuery on MEM_FREE regions reported Protect=PAGE_NOACCESS
(0x01) instead of 0 per Windows spec.

Bug 5 (active): VirtualQuery on MEM_FREE regions returned the raw query
address as BaseAddress instead of page-aligning it.

Bug 3 (latent): _BitScanForward/_BitScanReverse passed unsigned long
(64-bit on LP64) to __builtin_ctz/__builtin_clz (documented for
unsigned int). For mask > UINT_MAX, implicit narrowing to 0 caused UB
(__builtin_clz(0)). Fixed by explicit (uint32_t) cast. No callers in
kananlib, so this was latent.

Bug 6 (unproven): /proc/self/maps fgets 512-byte buffer could drop
regions with long paths. Changed to getline(). Cannot demonstrate failure
(controlled by kernel), kept as defensive improvement.

Test: 19 regression tests in TestLinuxAudit.cpp. 4 bugs demonstrated with
revert->segfault/fail->re-apply->pass methodology. 14/14 full suite pass.
…e page-aligned base

74a55b0 page-aligned MEM_FREE BaseAddress but left RegionSize computed from the
raw query address, so BaseAddress + RegionSize no longer equaled next_start (it
undershot by the intra-page offset). An unaligned gap region-walk could then
oscillate instead of advancing to the next mapping. Measure RegionSize from the
aligned base so the documented invariant holds. Latent (no current caller walks
unaligned gaps), found auditing 74a55b0.
Out-of-scope items for the file-mapped-analysis PR (ELF, live-process
primitives, embedding/signal-handler chaining, maps-cache refresh, RTTI API
cleanup, perf, pre-existing latent issues, and by-design fidelity gaps) for
future feature/bug branches.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 62 out of 63 changed files in this pull request and generated 6 comments.

Comment thread test/TestLinuxAudit.cpp
Comment thread test/TestBehavior.cpp
Comment thread test/TestBehavior.cpp
Comment thread src/String.cpp Outdated
Comment thread test/cmake.toml Outdated
Comment thread src/Module.cpp Outdated
Add regressions for malformed UTF-8 decoding and PE optional-header handling, make widen() validate UTF-8 consistently on Windows and non-Windows, parse PE32 and PE32+ optional headers correctly in the Linux PE mapper, fail PE32 HIGHLOW relocations when the image lands outside 32-bit VA range, add a committed PE32 sample that maps and scans on Linux, add the missing vector include, and fix misleading test comments/skip text.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 65 out of 68 changed files in this pull request and generated 5 comments.

Comment thread src/String.cpp
Comment thread src/compat/WinCompat.cpp
Comment thread test/TestLinuxAudit.cpp Outdated
Comment thread test/TestScanStress.cpp Outdated
Comment thread test/TestLinuxAudit.cpp Outdated
…+ sliding-window stress tests

- widen(): reject overlong UTF-8 encodings (3/4-byte sequences below the
  minimum code point for their length) as U+FFFD instead of emitting the
  decoded scalar, matching Windows. Regression in TestUtils.
- compat VirtualFree: only model MEM_RELEASE of a tracked VirtualAlloc base;
  fail loudly (warn + FALSE) for MEM_DECOMMIT, sub-range / size!=0, and
  untracked addresses instead of guessing a length to munmap. Regression in
  TestLinuxAudit.
- MEM_FREE VirtualQuery audit tests now probe a just-freed allocation instead
  of a hard-coded high address (deterministic under ASLR).
- scan stress: default to a fast 16 MiB x 64 run (gated full 1 GiB x 512 via
  KANANLIB_SCAN_STRESS_FULL) and add an exhaustive per-byte sliding-window
  boundary sweep across sizes that drive both AVX2 branches and the scalar
  tail. Proven to catch a deliberately broken window.
- Fix the boundary bugs the sweep exposed in the scalar relative-reference
  scans: scan_relative_reference_scalar left a multi-byte gap before end-4
  (only the single end-4 position was re-checked), and the byte-by-byte
  variant (i+4<end) missed the final end-4 position. Both now cover the full
  [start, end-4] range, so scan_relative_reference no longer misses a match in
  the last bytes of a buffer.

Linux 15/15, Windows 24/24.
@praydog

praydog commented Jun 19, 2026

Copy link
Copy Markdown
Collaborator Author

Addressed the latest review (commit ec8f0ef). Each genuine bug was proven failing with a test first, then fixed; test-quality comments were applied as suggested.

Bugs (proven failing, then fixed):

  • widen() overlong UTF-8 (src/String.cpp): the lossy decoder accepted overlong 3/4-byte encodings and emitted the decoded scalar (e.g. an embedded NUL). It now treats sequences below the minimum code point for their length as U+FFFD, matching Windows' MultiByteToWideChar. Regression added in TestUtils (test_widen_handles_malformed_utf8).
  • VirtualFree guessed unmap (src/compat/WinCompat.cpp): when the address wasn't a tracked VirtualAlloc base it fell back to munmap-ing a guessed length, which could unmap an unrelated mapping and diverged from Win32. It now only models MEM_RELEASE of a tracked base and fails loudly (warn + FALSE) for MEM_DECOMMIT, sub-range / size != 0, and untracked addresses. Regression added in TestLinuxAudit (test_virtual_free_untracked_refused).

Test-quality fixes:

  • The MEM_FREE VirtualQuery audit tests now probe a just-freed VirtualAlloc range instead of a hard-coded high address, so they're deterministic under ASLR.
  • The scan stress test now defaults to a fast 16 MiB x 64 run (the full 1 GiB x 512 run is gated behind KANANLIB_SCAN_STRESS_FULL) so CI stays fast. Per the test's real intent (stressing the AVX2 sliding window for stability, not perf), I also added an exhaustive per-byte sliding-window boundary sweep across sizes that drive both AVX2 branches and the scalar tail, and verified it catches a deliberately broken window.

Bonus bugs the new sweep exposed (proven, then fixed) in src/Scan.cpp:

  • scan_relative_reference_scalar left a multi-byte gap before end-4 (the fast 8-byte loop stopped at i + 8 < end and only the single end-4 position was re-checked), so scan_relative_reference could miss a match in the final bytes of a buffer.
  • the byte-by-byte variant used i + 4 < end and missed the final end-4 position.
  • Both now cover the full [start, end-4] range.

Verification: Linux 15/15 and Windows (MSVC, VS 2022, DIA) 24/24, both green.

@copilot please review these changes and validate the fixes.

@praydog

praydog commented Jun 19, 2026

Copy link
Copy Markdown
Collaborator Author

@copilot please post a written review comment on this PR auditing the changes in commit ec8f0ef (do not just respond in chat — leave a comment here for the audit trail). For each item below, explicitly state ACCEPT or PUSH BACK with your reasoning:

  1. src/String.cpp widen_utf8_lossy: rejecting overlong UTF-8 (3/4-byte sequences below the minimum code point for their length), surrogates, and out-of-range scalars as U+FFFD. Does this match Windows MultiByteToWideChar(CP_UTF8, 0, ...) semantics?
  2. src/compat/WinCompat.cpp VirtualFree: only modeling MEM_RELEASE of a tracked VirtualAlloc base and failing loudly (warn + FALSE) for MEM_DECOMMIT, size != 0, and untracked addresses. Is failing loudly the correct choice vs. guessing a length to munmap?
  3. test/TestLinuxAudit.cpp: the MEM_FREE VirtualQuery tests now probe a just-freed VirtualAlloc range instead of a hard-coded high address. Is this deterministic under ASLR, and does the new test_virtual_free_untracked_refused correctly cover the bug?
  4. test/TestScanStress.cpp: default 16 MiB x 64 run with the full 1 GiB x 512 run gated behind KANANLIB_SCAN_STRESS_FULL, plus the new exhaustive per-byte sliding-window boundary sweep across sizes that drive both AVX2 branches and the scalar tail. Is the smaller default sufficient to stress the AVX2 sliding window for stability?
  5. src/Scan.cpp boundary fixes the sweep exposed: scan_relative_reference_scalar left a multi-byte gap before end-4 (only end-4 was re-checked), and the byte-by-byte variant (i + 4 < end) missed the final end-4 position. Both now cover the full [start, end-4] range. Confirm these are correct and don't regress the hot path.

If you find any issue, push back with specifics; otherwise explicitly accept each item.

Copilot AI commented Jun 19, 2026

Copy link
Copy Markdown

@copilot please post a written review comment on this PR auditing the changes in commit ec8f0ef (do not just respond in chat — leave a comment here for the audit trail). For each item below, e...

Audited ec8f0ef. Per-item verdict:


1. src/String.cpp widen_utf8_lossy — overlong / surrogate / out-of-range → U+FFFD

ACCEPT.

min_cp is set to 0x80 / 0x800 / 0x10000 for 2- / 3- / 4-byte lead bytes respectively. Any decoded scalar below that threshold is overlong and is replaced with U+FFFD. Surrogates U+D800..U+DFFF and scalars > U+10FFFF get the same treatment. This matches MultiByteToWideChar(CP_UTF8, 0, ...) exactly: Windows rejects all three classes as invalid input.


2. src/compat/WinCompat.cpp VirtualFree — fail loudly vs. guessing a length

ACCEPT.

The previous fall-through to munmap(addr, size ?: PAGE_SIZE) for untracked or MEM_DECOMMIT calls was the wrong choice: it could silently destroy a mapping the caller did not intend to release, with no diagnostic. The new path logs a warning and returns FALSE for every unsupported variant (MEM_DECOMMIT, size != 0, and untracked base), and only calls munmap when the address is in the tracking table and size == 0. That's correct Win32 semantics and prevents the munmap-anything footgun.


3. test/TestLinuxAudit.cpp — just-freed ranges instead of hard-coded high addresses

ACCEPT.

test_virtual_query_free_protect and test_virtual_query_free_base_aligned both allocate via VirtualAlloc, release via VirtualFree(p, 0, MEM_RELEASE), and then probe the same address. That address is guaranteed to be freed by the time VirtualQuery is called, regardless of ASLR. test_virtual_free_untracked_refused covers the bug directly: it creates a mapping outside the tracking table, asserts VirtualFree returns FALSE and leaves the mapping intact, then verifies that MEM_DECOMMIT and size != 0 on a tracked base also fail loudly before the correct (size==0, MEM_RELEASE) call succeeds. The coverage is complete.


4. test/TestScanStress.cpp — 16 MiB × 64 default + per-byte sliding-window sweep

ACCEPT.

16 MiB is ~41k times the 392-byte unrolled-branch threshold, so every AVX2 code path (small branch 128–391 bytes and the big unrolled branch ≥ 392 bytes) is exercised by the random-alignment loop at default settings. The sliding-window sweep (test_displacement_scan_sliding_window_boundaries) places a single match at every byte offset across sizes 129, 200, 256, 391, 392, 393, 512, 1024, and 4096, which directly stresses the 32-byte vector-load boundaries, the 12×32 unrolled stride, and the main-loop/scalar-tail handoff. The sweep is small (< 1 ms each size) and is unconditional, so it always runs on CI. Together these give better boundary coverage than the original 1 GiB stress run while staying fast.


5. src/Scan.cpp boundary fixes — scalar tail loop and byte-by-byte condition

ACCEPT.

The fast 8-byte loop uses i + sizeof(uint64_t) < end (strict less-than), which stops when fewer than 8 bytes remain and can leave up to 7 unscanned positions. The tail loop for (; i + sizeof(uint32_t) <= end; ++i) picks up exactly from where the fast loop left off and uses <= (i.e. i ≤ end − 4), so the final valid 4-byte position is included — the gap is closed.

scan_relative_reference_scalar_byte_by_byte already used i + sizeof(uint32_t) <= end, so it already covered [start, end−4] inclusive; the version prior to ec8f0ef reportedly had i + 4 < end which would miss the last position — the <= fix is correct and does not over-read.

Neither change affects the hot-path AVX2 implementation, and the sliding-window sweep proves all three implementations agree at every byte offset for all swept sizes.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot was unable to review this pull request because the user who requested the review has reached their quota limit.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 65 out of 68 changed files in this pull request and generated 8 comments.

Comment thread src/compat/WinCompat.cpp
Comment thread test/TestHelpers.hpp
Comment thread test/TestScanStress.cpp
Comment thread test/TestScanStress.cpp
Comment thread test/TestScanStress.cpp Outdated
Comment thread test/TestScanStress.cpp
Comment thread test/TestScanStress.cpp Outdated
Comment thread src/PDB.cpp
…s, well-defined test stores

- compat VirtualAlloc: round the allocation up to page granularity and track the
  rounded size so VirtualFree always passes a page-granular length to munmap
  (matches Win32 reservation granularity).
- PDB: add non-Windows no-op stubs for the whole utility::pdb API so it links on
  Linux (returns empty/nullopt + warns) instead of leaving undefined references
  for consumers. Proven by a new TestLinuxAudit test that references the API and
  would not link without the stubs.
- TestScanStress: place/reset rel32 values via memcpy (store_i32) instead of
  unaligned int32_t* stores, which were UB and could fault on strict-alignment
  targets / under sanitizers.
- TestHelpers utf16le_bytes: clamp surrogate and out-of-range scalars to U+FFFD
  instead of emitting malformed surrogate pairs.

Linux 15/15, Windows 24/24.
@praydog

praydog commented Jun 19, 2026

Copy link
Copy Markdown
Collaborator Author

Addressed review 4532951094 (8 comments) in commit d5cb150. Per item:

  1. VirtualAlloc page granularity (WinCompat.cpp) — Applied. VirtualAlloc now rounds the size up to a whole number of pages (with an overflow guard) and tracks the rounded size, so VirtualFree always passes a page-granular length to munmap, matching Win32 reservation granularity. Note: the specific "leak / VirtualFree returns FALSE" premise doesn't actually hold on Linux — munmap requires only the address to be page-aligned, not the length (man 2 munmap), so the prior code already unmapped the covering pages. The rounding is still the correct, more faithful behavior, so I applied it.
  2. utf16le_bytes scalar validation (TestHelpers.hpp) — Fixed. Non-Windows path now clamps surrogate-range and > U+10FFFF scalars to U+FFFD before encoding, instead of emitting malformed surrogate pairs.
    3-7. Unaligned int32_t* stores in TestScanStress.cppFixed. All rel32 placements and resets (random test + sliding-window sweep) now go through a store_i32() helper that uses std::memcpy, so the intentionally-unaligned writes are well-defined and sanitizer/strict-alignment safe.
  3. utility::pdb undefined on Linux (PDB.cpp / PDB.hpp) — Fixed. Added non-Windows no-op stubs for the entire utility::pdb API (return nullopt/empty + warn), so it links on Linux instead of leaving undefined references for consumers — consistent with how the rest of the compat layer degrades unsupported Windows features. Proven by a new TestLinuxAudit test (test_pdb_api_unsupported_on_linux) that references every pdb symbol; without the stubs it fails to link (verified: undefined references to all 7 symbols), and with them it links and returns empty.

Verification: Linux 15/15, Windows (MSVC, VS 2022, DIA) 24/24, both green.

@copilot please post a written review comment on this PR (not just chat — leave it here for the audit trail) auditing commit d5cb150, stating ACCEPT or PUSH BACK with reasoning for each of the 4 areas above (VirtualAlloc rounding, utf16le_bytes validation, TestScanStress memcpy stores, PDB Linux stubs).

Copilot AI commented Jun 19, 2026

Copy link
Copy Markdown

Addressed review 4532951094 (8 comments) in commit d5cb150. Per item:

  1. VirtualAlloc page granularity (WinCompat.cpp) — Applied. VirtualAlloc now rounds the size up to a whole nu...

Audited d5cb150. Per-item verdict:


1. src/compat/WinCompat.cpp VirtualAlloc page granularity — ACCEPT

The fix rounds size up with (size + ps - 1) & ~(ps - 1) before mmap, using a correct SIZE_MAX overflow guard (size > SIZE_MAX - (ps - 1)), and stores the rounded value in the tracking table. VirtualFree therefore always calls munmap with a page-granular length and the tracked size precisely matches the mapped extent. This matches Win32's reservation-granularity semantics. The overflow guard correctly handles pathological sizes without silent truncation.


2. test/TestHelpers.hpp utf16le_bytes scalar validation — ACCEPT

The guard if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) { cp = 0xFFFD; } is applied before the BMP / surrogate-pair branch. It catches all three invalid classes: out-of-Unicode-range values, UTF-16 surrogates, and (implicitly, via > 0x10FFFF) negative wchar_t values that cast to a large uint32_t. The result is that every invalid scalar produces a well-formed U+FFFD pair in the output rather than a malformed surrogate. The condition order (> 0x10FFFF first) is fine; both sub-expressions are unsigned comparisons.


3–7. test/TestScanStress.cpp unaligned int32_t* stores — ACCEPT

The store_i32(uint8_t* dst, int32_t value) helper wraps std::memcpy and is used in all seven previously-UB store sites: the three placements in the random-alignment loop, the three resets, and the one placement and one reset in sweep_sliding_window. All casts through int32_t* are gone. The helper is static, inlines trivially, and the memcpy idiom is recognized by every modern compiler as a 4-byte store with no runtime overhead at the aligned sites.

One minor note: the third placement (store_i32((uint8_t*)(address_to_rel32_reference - 4), ...)) still derives the pointer via pointer arithmetic on a uintptr_t cast, which is the same pattern as before. This is fine on the target platforms (x86-64 Linux/Windows), but the static_cast<uint8_t*> form would be cleaner for strict-aliasing purposes. Not a blocker.


8. src/PDB.cpp / include/utility/PDB.hpp non-Windows stubs — ACCEPT

The #else // !_WIN32 block covers all eight functions declared in PDB.hpp (get_pdb_path, get_symbol_address, get_symbol_name, get_symbol_map, enumerate_symbols, get_struct_info, enumerate_structs, generate_c_struct). Return values are consistent with the declared signatures: std::nullopt for optional<> returns, {} for containers and std::string. The six data-access stubs emit SPDLOG_WARN; generate_c_struct returns empty silently (reasonable, since its input can only be populated on Windows via get_struct_info). The new test_pdb_api_unsupported_on_linux test in TestLinuxAudit.cpp force-resolves every symbol at link time and asserts each stub returns empty/nullopt, providing a reliable link-time regression guard. The test is registered in main().


All eight items are correctly implemented. No push-backs.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 65 out of 68 changed files in this pull request and generated 2 comments.

Comment thread src/compat/WinCompat.cpp
Comment thread src/PDB.cpp Outdated
- compat VirtualProtect: reject address+size overflow before the span/rounding
  math (which would otherwise wrap and mprotect an unintended region). Added a
  regression that the overflowing call returns FALSE and leaves the page
  untouched.
- PDB.cpp: correct the now-stale header comment to describe the non-Windows
  stub branch (links + returns empty/nullopt + warns) instead of claiming the
  TU compiles to nothing.

Linux 15/15, Windows 24/24.
@praydog

praydog commented Jun 19, 2026

Copy link
Copy Markdown
Collaborator Author

Addressed review 4533148154 (2 comments) in commit 2531343. Both valid, no push-back:

  1. VirtualProtect address+size overflow (WinCompat.cpp) — Fixed. Added an overflow guard (if (size > UINTPTR_MAX - addr) return FALSE; with a warning) before the span/rounding arithmetic, so an overflowing range can no longer wrap to a small length and mprotect an unintended region. After the guard, span <= size + (ps-1), so the rounding add can't overflow either. Regression added (test_virtual_protect_overflow_size_refused): the overflowing call returns FALSE and the page stays writable. Note: this is hardening — a deterministic pre-fix failing repro isn't safely constructible, because for an overflowing range the wrapped mprotect length spans unmapped memory and fails with ENOMEM either way; the guard's value is preventing the case where the wrap produces a small valid length that silently protects the wrong region.
  2. Stale PDB.cpp header commentFixed. Updated to describe the actual non-Windows behavior (the #else branch provides stubs that link and return empty/nullopt + warn) instead of claiming the TU compiles to nothing.

Verification: Linux 15/15, Windows (MSVC, VS 2022, DIA) 24/24, both green.

@copilot please post a written review comment on this PR (not just chat — leave it here for the audit trail) auditing commit 2531343, stating ACCEPT or PUSH BACK with reasoning for the VirtualProtect overflow guard and the PDB comment update.

Copilot AI commented Jun 19, 2026

Copy link
Copy Markdown

Addressed review 4533148154 (2 comments) in commit 2531343. Both valid, no push-back:

  1. VirtualProtect address+size overflow (WinCompat.cpp) — Fixed. Added an overflow guard (`if (...

Audited 2531343. Per-item verdict:


1. src/compat/WinCompat.cpp VirtualProtect overflow guard — ACCEPT

The guard if (size > UINTPTR_MAX - addr) is the correct unsigned-overflow test: it is equivalent to addr + size > UINTPTR_MAX and triggers before any arithmetic that touches addr + size. After the guard passes, span = (addr + size) - aligned and the rounding add span + (ps - 1) are both well-bounded — span ≤ addr + size - aligned ≤ size + (ps - 1), so span + (ps - 1) ≤ size + 2*(ps - 1), which fits in size_t for any size that passed the guard. The SPDLOG_WARN format string is correct (both addr and size are cast appropriately). The regression test constructs overflow_size = (SIZE_T)(0ULL - (uintptr_t)p + 4096), which yields a value such that (uintptr_t)p + overflow_size wraps past UINTPTR_MAX; the guard correctly triggers, VirtualProtect returns FALSE, and the page remains writable — the write to *p = 0xA5 and the subsequent read confirm no mprotect(PROT_NONE) took place. Note: the author correctly acknowledges that a deterministic pre-fix crash was not constructible (the wrapped length would span unmapped memory and fail with ENOMEM anyway), so the guard's real value is preventing the silent wrong-region case; the test validates the observable postcondition.


2. src/PDB.cpp header comment — ACCEPT

The updated comment accurately describes the actual TU structure: the #if defined(_WIN32) branch contains the full DIA/urlmon implementation, and the #else branch (added in the previous commit) provides no-op stubs that link and return nullopt/empty with a warning. The old "compiles to nothing / simply unavailable" text was stale and misleading after the stubs were introduced; the new wording is consistent with the code.

@praydog praydog merged commit f1a9384 into main Jun 19, 2026
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants