
feat(allocator): add optional rpmalloc abstraction layer and adapter #816

Draft
Vansh-kap-98 wants to merge 2 commits into sourcemeta:main from Vansh-kap-98:feat/rpmalloc-integration

@Vansh-kap-98

Overview
Introduces an optional, opt-in rpmalloc wrapper for the Blaze allocator layer to handle high-concurrency memory tracking without modifying core validation logic. It remains completely disabled by default.

Implementation Details
Build System: Enable via -DBLAZE_ALLOCATOR_RPMALLOC=ON. Fetches and pins [email protected] automatically.

Codebase Changes: Added a clean abstraction layer under src/allocator with process and thread lifecycle hooks, alongside a header-only RpmallocAdapter for STL containers.

Safety: Fully guarded via preprocessor directives. Invoking the backend without compiling it first throws a clean std::runtime_error.

Phase 1 & Phase 2 Findings
Phase 1 (Measurement): Running baseline microbenchmarks on unmodified logic proved the concept, showing up to a 10x throughput improvement in isolated, highly targeted validation setups.

Phase 2 (Integration): The core plumbing is complete and reproducible. However, a broad process-wide proof-of-concept swap showed performance regressions across compile and validation metrics in quick Release runs.

Direct Benchmark Metrics (Baseline → rpmalloc PoC)
Compile Time: 1.45 ms → 2.85 ms (+96.3%)
Single-Threaded Validate: 287 ns → 480 ns (+67.3%) | Throughput dropped ~46.7%
Concurrent Validate (4 Threads): 87.4 ns → 160 ns (+83.2%)
Concurrent Validate (8 Threads): 58.8 ns → 78.2 ns (+33.0%)
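
For readers cross-checking the figures above, here is a minimal sketch of how a per-operation time change relates to a throughput change. The helper names are hypothetical and not part of the PR. Recomputed values can differ slightly from the reported ones, since the inputs are rounded and throughput may have been measured separately.

```cpp
namespace metrics_sketch {

// Relative change in per-operation time, in percent.
// A positive result means the candidate is slower than the baseline.
inline double latency_change_percent(double baseline, double candidate) {
  return (candidate / baseline - 1.0) * 100.0;
}

// Relative change in throughput (operations per unit of time), in percent.
// Throughput is the inverse of per-operation time, so the ratio is flipped.
// A negative result means the candidate processes fewer operations.
inline double throughput_change_percent(double baseline, double candidate) {
  return (baseline / candidate - 1.0) * 100.0;
}

} // namespace metrics_sketch
```

For example, `latency_change_percent(287.0, 480.0)` describes the single-threaded validate row, and `throughput_change_percent(287.0, 480.0)` gives the throughput change implied by those two timings alone.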

Suggestions
Next step: Targeted PMR (polymorphic memory resources) implementation on specific compiler hotspots to isolate performance gains.
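
As a rough illustration of what a targeted PMR change could look like, the sketch below routes the allocations of one short-lived container through a stack-backed arena, using only standard C++17 facilities. The `split` and `count_tokens_on_stack_arena` functions are hypothetical stand-ins for a hot path and do not come from the Blaze codebase.

```cpp
#include <array>
#include <cstddef>
#include <memory_resource>
#include <string_view>
#include <vector>

namespace pmr_sketch {

// Split `text` on `separator`, taking the vector's storage from `resource`.
// The returned views point into `text`, so `text` must outlive the result.
inline std::pmr::vector<std::string_view>
split(std::string_view text, char separator,
      std::pmr::memory_resource *resource) {
  std::pmr::vector<std::string_view> tokens(resource);
  std::size_t start = 0;
  while (true) {
    const std::size_t end = text.find(separator, start);
    if (end == std::string_view::npos) {
      tokens.push_back(text.substr(start));
      break;
    }
    tokens.push_back(text.substr(start, end - start));
    start = end + 1;
  }
  return tokens;
}

// Use a small stack buffer as an arena. Allocations that do not fit fall
// back to the default upstream resource (normally new/delete).
inline std::size_t count_tokens_on_stack_arena(std::string_view text) {
  std::array<std::byte, 1024> buffer;
  std::pmr::monotonic_buffer_resource arena{buffer.data(), buffer.size()};
  return split(text, '/', &arena).size();
}

} // namespace pmr_sketch
```

Only the container that receives the resource changes behavior, so any gain or regression can be attributed to that single call site instead of to every allocation in the process.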

First-Time Contributor Note
This is my first pull request on this repository, so I would love to get your constructive criticism, feedback on the styling, or any architectural suggestions you have. If there are specific compiler modules or high-churn data streams you think I should look at next for a targeted PMR implementation, let me know. I am really eager to explore the codebase further and see where else this allocator layer can add value.


@cubic-dev-ai cubic-dev-ai Bot left a comment


3 issues found across 11 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/allocator/allocator.cc">

<violation number="1" location="src/allocator/allocator.cc:19">
P2: rpmalloc_initialize() return value ignored, risking undefined behavior on init failure</violation>
</file>

<file name="benchmark/micro/allocator_profile.cc">

<violation number="1" location="benchmark/micro/allocator_profile.cc:154">
P2: Benchmark comment claims rpmalloc can be tested by setting an environment variable, but the code always uses a default Config with Backend::Standard and initialize() does not parse environment variables. This leads to invalid benchmark results.</violation>
</file>

<file name="src/allocator/include/sourcemeta/blaze/allocator_adapter.h">

<violation number="1" location="src/allocator/include/sourcemeta/blaze/allocator_adapter.h:40">
P2: RpmallocAdapter does not handle over-aligned types; `malloc`/`rpmalloc` only guarantee `max_align_t` alignment, violating the STL allocator contract for types with `alignof(T) > alignof(std::max_align_t)`</violation>
</file>
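
For the benchmark issue above (the comment mentions an environment variable that the code never reads), one possible direction is to parse the variable explicitly and fail loudly on unknown values. This is only a sketch: the `Backend` enum mirrors the names quoted in the review, while the variable name `BLAZE_ALLOCATOR_BACKEND` and both helper functions are assumptions made for illustration.

```cpp
#include <cstdlib>
#include <optional>
#include <stdexcept>
#include <string_view>

namespace env_sketch {

enum class Backend { Standard, RPMalloc };

// Map a textual name to a backend. Unknown names yield std::nullopt.
inline std::optional<Backend> parse_backend(std::string_view name) {
  if (name == "standard") {
    return Backend::Standard;
  }
  if (name == "rpmalloc") {
    return Backend::RPMalloc;
  }
  return std::nullopt;
}

// Read the backend from the environment. The default variable name is
// hypothetical. An unset variable selects the standard backend, and an
// unrecognised value throws so a benchmark never silently measures the
// wrong allocator.
inline Backend
backend_from_environment(const char *variable = "BLAZE_ALLOCATOR_BACKEND") {
  const char *value = std::getenv(variable);
  if (value == nullptr) {
    return Backend::Standard;
  }
  const std::optional<Backend> backend = parse_backend(value);
  if (!backend.has_value()) {
    throw std::invalid_argument("unknown allocator backend name");
  }
  return *backend;
}

} // namespace env_sketch
```

A benchmark could call `backend_from_environment()` once at startup and pass the result into its configuration, which keeps the comment and the measured behavior in agreement.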


break;
case Backend::RPMalloc:
#ifdef BLAZE_ALLOCATOR_RPMALLOC
rpmalloc_initialize();

P2: rpmalloc_initialize() return value ignored, risking undefined behavior on init failure

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/allocator/allocator.cc, line 19:

<comment>rpmalloc_initialize() return value ignored, risking undefined behavior on init failure</comment>

<file context>
@@ -0,0 +1,69 @@
+      break;
+    case Backend::RPMalloc:
+#ifdef BLAZE_ALLOCATOR_RPMALLOC
+      rpmalloc_initialize();
+#else
+      throw std::runtime_error(
</file context>
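
A minimal sketch of one way to address this, assuming rpmalloc 1.4.x, where `rpmalloc_initialize()` returns an `int` and 0 signals success. The `initialize` function and `Backend` enum below are simplified stand-ins for the PR's code, not the actual implementation.

```cpp
#include <stdexcept>

#ifdef BLAZE_ALLOCATOR_RPMALLOC
#include <rpmalloc.h>
#endif

namespace init_sketch {

enum class Backend { Standard, RPMalloc };

// Hypothetical stand-in for the initialization routine under review.
inline void initialize(Backend backend) {
  switch (backend) {
  case Backend::Standard:
    break;
  case Backend::RPMalloc:
#ifdef BLAZE_ALLOCATOR_RPMALLOC
    // Assumption: a non-zero return value means initialization failed.
    // Surfacing it here avoids using an uninitialized allocator later.
    if (rpmalloc_initialize() != 0) {
      throw std::runtime_error("rpmalloc_initialize() failed");
    }
#else
    throw std::runtime_error(
        "rpmalloc backend requested but BLAZE_ALLOCATOR_RPMALLOC is off");
#endif
    break;
  }
}

} // namespace init_sketch
```

Both failure modes end up as the same exception type, so callers only need one error path whether the backend is missing from the build or fails at runtime.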

Comment thread benchmark/micro/allocator_profile.cc
RpmallocAdapter(const RpmallocAdapter<U>&) {}

/// @brief Allocate memory for n elements
[[nodiscard]] pointer allocate(size_type n) {

P2: RpmallocAdapter does not handle over-aligned types; malloc/rpmalloc only guarantee max_align_t alignment, violating the STL allocator contract for types with alignof(T) > alignof(std::max_align_t)

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/allocator/include/sourcemeta/blaze/allocator_adapter.h, line 40:

<comment>RpmallocAdapter does not handle over-aligned types; `malloc`/`rpmalloc` only guarantee `max_align_t` alignment, violating the STL allocator contract for types with `alignof(T) > alignof(std::max_align_t)`</comment>

<file context>
@@ -0,0 +1,63 @@
+  RpmallocAdapter(const RpmallocAdapter<U>&) {}
+
+  /// @brief Allocate memory for n elements
+  [[nodiscard]] pointer allocate(size_type n) {
+#ifdef BLAZE_ALLOCATOR_RPMALLOC
+    return static_cast<pointer>(rpmalloc(n * sizeof(T)));
</file context>
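
One way to satisfy the allocator contract for over-aligned types is to branch on `alignof(T)` at compile time. The sketch below uses the standard aligned `operator new` as a stand-in backend so that it compiles without rpmalloc. In an rpmalloc build, the over-aligned branch would presumably call `rpaligned_alloc(alignof(T), bytes)` and free with `rpfree`, but that mapping is an assumption to verify against the pinned rpmalloc version. `AlignedAdapter`, `CacheLine`, and `is_aligned` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <new>
#include <vector>

namespace aligned_sketch {

template <typename T> struct AlignedAdapter {
  using value_type = T;

  AlignedAdapter() noexcept = default;
  template <typename U> AlignedAdapter(const AlignedAdapter<U> &) noexcept {}

  [[nodiscard]] T *allocate(std::size_t n) {
    // Guard against overflow in n * sizeof(T).
    if (n > std::numeric_limits<std::size_t>::max() / sizeof(T)) {
      throw std::bad_array_new_length();
    }
    const std::size_t bytes = n * sizeof(T);
    if constexpr (alignof(T) > __STDCPP_DEFAULT_NEW_ALIGNMENT__) {
      // Over-aligned type: request the alignment explicitly.
      return static_cast<T *>(
          ::operator new(bytes, static_cast<std::align_val_t>(alignof(T))));
    } else {
      return static_cast<T *>(::operator new(bytes));
    }
  }

  void deallocate(T *pointer, std::size_t) noexcept {
    // The deallocation path must mirror the allocation path.
    if constexpr (alignof(T) > __STDCPP_DEFAULT_NEW_ALIGNMENT__) {
      ::operator delete(pointer, static_cast<std::align_val_t>(alignof(T)));
    } else {
      ::operator delete(pointer);
    }
  }
};

// Stateless allocators always compare equal.
template <typename T, typename U>
bool operator==(const AlignedAdapter<T> &, const AlignedAdapter<U> &) noexcept {
  return true;
}
template <typename T, typename U>
bool operator!=(const AlignedAdapter<T> &, const AlignedAdapter<U> &) noexcept {
  return false;
}

// Example over-aligned type and a helper to check an address.
struct alignas(64) CacheLine {
  char bytes[64];
};

inline bool is_aligned(const void *pointer, std::size_t alignment) {
  return reinterpret_cast<std::uintptr_t>(pointer) % alignment == 0;
}

} // namespace aligned_sketch
```

With this shape, `std::vector<CacheLine, AlignedAdapter<CacheLine>>` receives 64-byte-aligned storage, while ordinary types keep using the cheaper default path.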

@Vansh-kap-98 Vansh-kap-98 marked this pull request as draft May 16, 2026 13:44
@Vansh-kap-98 Vansh-kap-98 force-pushed the feat/rpmalloc-integration branch 2 times, most recently from 0a45cf2 to 6e9f670 Compare May 16, 2026 22:12
@jviotti
Member

jviotti commented May 16, 2026

Hey @Vansh-kap-98, interesting! To clarify, on those benchmark metrics you shared, rpmalloc is slower? Am I reading that correctly? How specifically did you benchmark?

What might be interesting as a prior step would be to try out several different allocators. If one clearly wins, we could even incorporate it into the build as the default?

@Vansh-kap-98 Vansh-kap-98 force-pushed the feat/rpmalloc-integration branch from 6e9f670 to 15ada05 Compare May 16, 2026 22:26
@Vansh-kap-98
Author

Vansh-kap-98 commented May 16, 2026

Hey @jviotti, I'm sorry for the rough wording earlier. For the PoC I enabled rpmalloc globally to validate behavior, which changed every allocation site at once and caused regressions. I hadn't targeted specific hot containers.

I'll finish fixing the clang-format checks first, then run a short allocator matrix (including mimalloc, for example) and try targeted PMR adoption on compiler hot-path containers.

I'll post results and progress updates under this draft. I do realise I still have lots to figure out and learn.

@Vansh-kap-98 Vansh-kap-98 force-pushed the feat/rpmalloc-integration branch 2 times, most recently from 22daaba to a8d0b13 Compare May 16, 2026 22:48
@jviotti
Member

jviotti commented May 17, 2026

Very much appreciated and looking forward to the results. Overall:

  • Trying out different allocators has definitely been on my TODO list, so I'm excited about what your research reveals. There is a plethora of custom allocators out there optimised for different workloads. I wonder which one would work best for us, if any
  • Beyond Blaze, it might be worth trying this out in https://github.com/sourcemeta/core, which is the core module we use behind other things too, like https://one.sourcemeta.com
  • Maybe the answer is not to integrate a specific allocator but to understand which allocation patterns best suit our specific needs, and we could even design our own. It might even depend on the module. For example, maybe our sourcemeta::core::JSON JSON parser benefits from one way of doing it, while other modules need something else?
  • Also note that at least on Blaze, the most performance-sensitive part is src/evaluator, the actual evaluator. Though BECAUSE of performance, we do almost no allocation in there. Very small amounts, which might mean that even with the best custom allocator, you might not be able to move the needle there much. I guess that is more reason to also try this out on https://github.com/sourcemeta/core. It has its own GoogleBenchmark setup that definitely does a lot more allocations on hot paths

Let me know if I can help in any way. It is exciting research!
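
One low-cost way to start studying per-module allocation patterns, as discussed above, is a counting `std::pmr::memory_resource` that wraps another resource and records how often and how much it is asked to allocate. This is a generic C++17 sketch with a hypothetical class name. It is not thread-safe and is not tied to any Blaze or Core API.

```cpp
#include <cstddef>
#include <memory_resource>
#include <vector>

namespace profile_sketch {

// Forwards every request to an upstream resource while counting them.
class CountingResource final : public std::pmr::memory_resource {
public:
  explicit CountingResource(
      std::pmr::memory_resource *upstream = std::pmr::get_default_resource())
      : upstream_{upstream} {}

  std::size_t allocations() const noexcept { return allocations_; }
  std::size_t bytes() const noexcept { return bytes_; }

private:
  void *do_allocate(std::size_t bytes, std::size_t alignment) override {
    // Count only requests that the upstream resource actually satisfied.
    void *pointer = upstream_->allocate(bytes, alignment);
    ++allocations_;
    bytes_ += bytes;
    return pointer;
  }

  void do_deallocate(void *pointer, std::size_t bytes,
                     std::size_t alignment) override {
    upstream_->deallocate(pointer, bytes, alignment);
  }

  bool
  do_is_equal(const std::pmr::memory_resource &other) const noexcept override {
    return this == &other;
  }

  std::pmr::memory_resource *upstream_;
  std::size_t allocations_ = 0;
  std::size_t bytes_ = 0;
};

} // namespace profile_sketch
```

Passing a `CountingResource` to a `std::pmr` container on a suspected hot path gives a quick read on allocation count and volume before committing to any particular allocator.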

@Vansh-kap-98 Vansh-kap-98 force-pushed the feat/rpmalloc-integration branch 3 times, most recently from 5bfabe8 to 31fd9ca Compare May 17, 2026 06:32
Signed-off-by: Vansh <[email protected]>
@Vansh-kap-98 Vansh-kap-98 force-pushed the feat/rpmalloc-integration branch from 31fd9ca to 2ba8e04 Compare May 17, 2026 06:38
@Vansh-kap-98
Author

Thank you so much for the detailed insight, @jviotti! I really appreciate your guidance. I would love to use this opportunity to dive deeper and research various allocator behaviors and optimization patterns.

As I am still learning and getting comfortable with these custom allocator integrations, my progress might be a bit slower than if I asked for direct help, but I am incredibly eager to figure it out independently. That said, please let me know if there are any deadlines or milestones I should keep in mind!

I plan to work on both core and blaze simultaneously, as I definitely want to see this initial baseline draft through to a clean completion rather than leaving it unfinished. I will collect the benchmarking data and post the results individually across their respective repositories once they are ready.

@jviotti
Member

jviotti commented May 18, 2026

No hurry, and looking forward to anything you find!


2 participants