Skip to content

Fix typed proxy access to generic skeleton event storage#394

Open
rudresh-systream wants to merge 1 commit into
eclipse-score:mainfrom
rudresh-systream:bugfix/311-generic-skeleton-typed-proxy-storage
Open

Fix typed proxy access to generic skeleton event storage#394
rudresh-systream wants to merge 1 commit into
eclipse-score:mainfrom
rudresh-systream:bugfix/311-generic-skeleton-typed-proxy-storage

Conversation

@rudresh-systream
Copy link
Copy Markdown
Contributor

@rudresh-systream rudresh-systream commented May 8, 2026

Summary:

  • Fixed typed ProxyEvent sample access so it no longer depends on interpreting shared memory as EventDataStorage.
  • Added raw-slot based access using sample size/alignment, making typed proxies compatible with GenericSkeleton-created event storage.
  • Kept existing typed skeleton behavior compatible.
  • Updated ProxyEvent and event storage related tests/resources.

Implemented a compatibility fix for the GenericSkeleton ↔ typed ProxyEvent shared-memory interaction described in #311.

Root cause

The issue originates from the fact that GenericSkeleton and typed skeletons create/interprete EventDataStorage differently.

For typed skeletons, the storage is created through:

EventDataStorage<SampleType>

which internally creates a typed DynamicArray<SampleType>.

For GenericSkeleton/GenericSkeletonEvent, the storage is instead created using:

EventDataStorage<std::max_align_t>

with the storage size being calculated in units of std::max_align_t.

As a result, the underlying raw memory region may be large enough, but the metadata/layout interpretation of the DynamicArray<T> becomes incompatible between the producer and consumer sides.

The typed proxy path was still assuming a typed EventDataStorage<T> representation and therefore interpreted slot count/layout using the wrong type information. This creates an incompatibility whenever:

  • a GenericSkeleton publishes samples
  • and a typed proxy consumes them.

This issue is especially visible because the DynamicArray element count depends on the template type T, which differs between the generic and typed paths.


Investigated solution approaches

Several approaches were considered:

1. Extending DynamicArray<T>

One option was to add a constructor allowing externally managed/preallocated storage while preserving a typed DynamicArray<T> interface.

This was not chosen because:

  • it would introduce LoLa/event-storage specific semantics into a generic container abstraction,
  • it increases complexity in DynamicArray,
  • and it still keeps the shared-memory interpretation tightly coupled to template typing.

2. Replacing all event storage with DynamicArray<std::byte>

Another option was to fully migrate all event storage handling to raw byte storage.

While architecturally clean, this would require significantly broader refactoring across:

  • typed skeletons,
  • generic skeletons,
  • proxies,
  • allocation logic,
  • and existing tests.

Given the scope/risk, this approach was considered too invasive for the current issue.

3. Fixing typed proxy access to use raw slot storage (chosen solution)

The implemented solution changes typed proxy sample access so it no longer depends on interpreting the shared memory as EventDataStorage<T>.

Instead:

  • the proxy accesses the shared-memory event region through raw slot memory/meta information,

  • and calculates slot/sample access using:

    • sizeof(T)
    • alignof(T)

This keeps the existing shared-memory layout compatible while allowing typed proxies to correctly consume data produced by GenericSkeleton.

This approach was selected because it:

  • directly resolves the interoperability problem,
  • minimizes architectural disruption,
  • preserves existing typed skeleton behavior,
  • avoids changing generic container semantics,
  • and provides a more robust producer/consumer compatibility model going forward.

Files changed

score/mw/com/impl/bindings/lola/proxy_event.h

Main functional fix.

Updated typed ProxyEvent<T> sample access logic to avoid relying on interpreting the shared-memory region as EventDataStorage<T>.

Instead, access is now performed through raw event slot metadata and pointer arithmetic based on the actual sample type size/alignment.

This makes typed proxies compatible with GenericSkeleton-created storage.


score/mw/com/impl/bindings/lola/skeleton_memory_manager.h

score/mw/com/impl/bindings/lola/skeleton_memory_manager.cpp

Updated/supporting changes around event storage creation and raw slot metadata handling.

These changes ensure both generic and typed paths expose compatible storage information to consumers.


score/mw/com/impl/bindings/lola/skeleton.cpp

Adjusted skeleton-side integration to align with the updated event storage access model and metadata usage.


score/mw/com/impl/bindings/lola/proxy_event_test.cpp

score/mw/com/impl/bindings/lola/test/proxy_event_test_resources.cpp

score/mw/com/impl/bindings/lola/test/proxy_event_test_resources.h

Updated and extended test resources/coverage to validate:

  • typed proxy interaction with GenericSkeleton-created event storage,
  • slot access correctness,
  • and compatibility of the updated shared-memory interpretation model.

Validation performed

The following builds/tests were executed successfully after the changes:

bazel build //score/mw/com/impl/bindings/lola:proxy
bazel build //score/mw/com/impl/bindings/lola:skeleton

bazel test //score/mw/com/impl/bindings/lola:proxy_event_test
bazel test //score/mw/com/impl/bindings/lola:generic_proxy_event_test
bazel test //score/mw/com/impl/bindings/lola:skeleton_test
bazel test //score/mw/com/impl/bindings/lola:event_data_storage_test

All relevant LoLa event/proxy/skeleton tests passed successfully after the fix.

@rudresh-systream rudresh-systream force-pushed the bugfix/311-generic-skeleton-typed-proxy-storage branch from 37ebeb3 to dcb34ed Compare May 8, 2026 11:01
@crimson11
Copy link
Copy Markdown
Contributor

@rudresh-systream - thank you for your effort/fix proposal. @bemerybmw and me hat a quick 1st look. Looks promising as a short-term bugfix-solution! Will do a deeper review soon.

@ShoroukRamzy
Copy link
Copy Markdown
Contributor

Hi @rudresh-systream and @crimson11, I quickly reviewed this fix and also run our integration test #379 and all test cases passed for generic-typed/generic use cases.
Thanks for the PR!

Comment thread score/mw/com/impl/bindings/lola/skeleton_memory_manager.h Outdated
Comment thread score/mw/com/impl/bindings/lola/proxy_event.h Outdated
Comment thread score/mw/com/impl/bindings/lola/proxy_event.h Outdated
Comment thread score/mw/com/impl/bindings/lola/skeleton_memory_manager.h Outdated
@rudresh-systream rudresh-systream force-pushed the bugfix/311-generic-skeleton-typed-proxy-storage branch from f015563 to fcbca26 Compare May 21, 2026 09:58
@rudresh-systream rudresh-systream force-pushed the bugfix/311-generic-skeleton-typed-proxy-storage branch from c93dee0 to 29563be Compare May 22, 2026 08:59
@rudresh-systream
Copy link
Copy Markdown
Contributor Author

@crimson11
I removed the duplicate GenericSkeleton typed-proxy integration test/application from this PR and force-pushed the branch as a single squashed commit.

This PR is now focused on the #311 production fix and the remaining focused unit/regression coverage. I will check PR #379 for the GenericSkeleton integration-test coverage and add review feedback there if I find any missing Generic-Typed coverage relevant to this issue.

Comment thread score/mw/com/impl/bindings/lola/test/proxy_event_test_resources.h
Comment thread score/mw/com/impl/bindings/lola/test/proxy_event_test_resources.h Outdated
Comment thread score/mw/com/impl/bindings/lola/test/proxy_event_test_resources.cpp Outdated
Comment thread score/mw/com/impl/bindings/lola/test/proxy_event_test_resources.cpp Outdated
@rudresh-systream rudresh-systream force-pushed the bugfix/311-generic-skeleton-typed-proxy-storage branch from a94f24f to 642fdd7 Compare May 22, 2026 11:22
using LoLaTypedProxyEventTestFixture = LolaProxyEventFixture<ProxyEventStruct>;
TEST_F(LoLaTypedProxyEventTestFixture, ReadsSamplesFromEventMetaInfoRawStorage)
{
RecordProperty("Verifies", "SCR-6225206");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this RecordProperty added?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed. This was only added as regression metadata for the temporary test and was not needed here. The added regression test now avoids extra RecordProperty entries and focuses only on the behavior being verified.

RecordProperty("TestType", "Regression test");
RecordProperty("DerivationTechnique", "Analysis of bug 311");

const std::size_t max_sample_count_subscription{5U};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All our tests should follow the given, when, then structure (e.g. https://martinfowler.com/bliki/GivenWhenThen.html)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I reworked the added regression test to follow the Given / When / Then structure explicitly.

The test now separates:

  • Given: a typed LoLa ProxyEvent subscribed to provider storage containing one sample
  • When: GetNewSamples is called
  • Then: the typed proxy receives the expected sample value

}

using LoLaTypedProxyEventTestFixture = LolaProxyEventFixture<ProxyEventStruct>;
TEST_F(LoLaTypedProxyEventTestFixture, ReadsSamplesFromEventMetaInfoRawStorage)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test name should clearly state what is the expectation of the test. e.g. In the title you say it should read samples from EventMetaInfoRawStorage but I don't see that tested in the test. Most likely that's just an implementation detail and we simply need to check that GetNewSamples works?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I renamed the test from the implementation-detail based name to:

GetNewSamplesReturnsTypedSampleFromProviderStorage

The test now describes the expected behavior instead of mentioning EventMetaInfo raw storage directly.

ProxyEventAttorney<TestSampleType> proxy_event_attorney{*test_proxy_event_};
using SamplesMemberType = typename std::remove_reference<decltype(proxy_event_attorney.GetSamplesMember())>::type;
static_assert(std::is_const<SamplesMemberType>::value, "Proxy should hold const slot data.");
using MetaInfoMemberType = typename std::remove_reference<decltype(proxy_event_attorney.GetMetaInfoMember())>::type;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add in a new test instead of modifying the current one.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I restored the existing SampleConstness test to stay focused on sample constness and added a separate test for the new EventMetaInfo const-member check.

The new test is named:

HoldsEventMetaInfoAsConstReference

const auto total_event_slots_size = safe_math::Multiply(aligned_sample_size, element_properties.number_of_slots);
if (!total_event_slots_size.has_value())
{
score::mw::log::LogFatal("lola") << "Could not calculate the event slots raw array size. Terminating.";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are the others assertions and this is a terminate? You could also use the kAbortOnError return mode in safe_math

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I replaced the manual has_value() / LogFatal / terminate handling with safe_math::ReturnMode::kAbortOnError for the event-slot size calculation.

This keeps the behavior fail-fast while matching the surrounding assertion-style handling more closely.

auto& event_data_control_local = proxy_event_common_.GetConsumerEventDataControlLocal();

const auto event_slots_raw_array_size =
safe_math::Multiply(aligned_sample_size_, event_data_control_local.GetMaxSampleSlots());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could also use the kAbortOnError return mode in safe_math

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I updated the ProxyEvent raw-slot size calculation to use safe_math::ReturnMode::kAbortOnError.

This removes the explicit result handling and keeps the constructor-time validation path simpler.

aligned_sample_size_{memory::shared::CalculateAlignedSize(sizeof(SampleType), alignof(SampleType))},
event_slots_raw_array_{nullptr}
{
InitialiseEventSlotsRawArray();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could return const std::uint8_t* and be used to directly initialize event_slots_raw_array_.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. InitialiseEventSlotsRawArray() now returns const std::uint8_t* and event_slots_raw_array_ is initialized directly in the constructor initializer list.

This removes the initial nullptr assignment plus later mutation pattern.

};

private:
void InitialiseEventSlotsRawArray() noexcept;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should only mark destructors / move ops as noexcept

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I removed noexcept from InitialiseEventSlotsRawArray().

The destructor and move operations remain noexcept as before.

std::optional<ProviderEventDataControlLocalView<>> provider_event_data_control_local_{};
std::optional<ConsumerEventDataControlLocalView<>> consumer_event_data_control_local_{};
EventDataStorage<SampleType>* event_data_storage_{nullptr};
void* event_slots_raw_array_{nullptr};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't remove event_data_storage_. We should rather emulate the production code in which event_slots_raw_array_ is a type erased view into the event_data_storage_

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I restored event_data_storage_ in the proxy-event test fixture.

The fixture now emulates the production layout more closely: event_data_storage_ remains the typed backing storage, and event_slots_raw_array_ is only a type-erased view into event_data_storage_->data().

So the provider/test setup still behaves like typed skeleton storage, while the updated ProxyEvent consumes through the raw slot pointer exposed via EventMetaInfo.


template <typename SampleType>
inline std::tuple<EventControl*, EventDataStorage<SampleType>*> FakeMockedServiceData::AddEvent(
inline std::tuple<EventControl*, void*> FakeMockedServiceData::AddEvent(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this should be changed. Same point as: https://github.com/eclipse-score/communication/pull/394/changes#r3288235207

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I reverted this part.

FakeMockedServiceData::AddEvent() now keeps its original production-like signature and returns EventDataStorage<SampleType>* again instead of void*.

The raw slot pointer used by the proxy-event fixture is now derived from event_data_storage_->data(), so the type-erased view is created in the fixture without changing the fake service data API.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a descriptive but succinct summary of the changes to the commit message and remove "Signed-off-by"...

We also want to avoid merge commits in our prs. You can do the rebase locally or via github. e.g.

Image

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I rebased the branch locally onto latest origin/main, squashed the PR to a single commit, removed the merge commits, and removed the Signed-off-by trailer.

The final commit message is now descriptive and concise:

Fix typed proxy access to GenericSkeleton event storage

It includes a short summary explaining that typed ProxyEvent now accesses samples through EventMetaInfo raw slot storage instead of interpreting provider storage as EventDataStorage<SampleType>.

@rudresh-systream rudresh-systream force-pushed the bugfix/311-generic-skeleton-typed-proxy-storage branch from 1f6c681 to 95502db Compare May 25, 2026 06:23
@rudresh-systream
Copy link
Copy Markdown
Contributor Author

@bemerybmw

Summary of changes:

  • Reworked the added regression test to use a behavior-based name and Given / When / Then structure.
  • Removed unnecessary RecordProperty metadata from the added regression test.
  • Kept SampleConstness focused on sample constness and added a separate EventMetaInfo const-reference test.
  • Changed raw-slot size calculations to use safe_math::ReturnMode::kAbortOnError.
  • Changed InitialiseEventSlotsRawArray() to return const std::uint8_t* and initialize event_slots_raw_array_ directly.
  • Removed noexcept from InitialiseEventSlotsRawArray().
  • Added the requested TODO explaining that raw-slot access is a temporary compatibility fix until the LoLa binding layer is type-erased.
  • Restored FakeMockedServiceData::AddEvent() to return EventDataStorage*.
  • Restored event_data_storage_ in the proxy-event fixture and made event_slots_raw_array_ a type-erased view into event_data_storage_->data().
  • Rebased onto latest origin/main, squashed to one commit, removed merge commits, and removed the Signed-off-by trailer.

Validation:

  • bazel test //:format_test_C++_with_clang-format --test_output=all
  • bazel build //score/mw/com/impl/bindings/lola:proxy
  • bazel build //score/mw/com/impl/bindings/lola:skeleton
  • bazel test //score/mw/com/impl/bindings/lola:proxy_event_test
  • bazel test //score/mw/com/impl/bindings/lola:generic_proxy_event_test
  • bazel test //score/mw/com/impl/bindings/lola:skeleton_test
  • bazel test //score/mw/com/impl/bindings/lola:event_data_storage_test

Final branch state:

  • one commit on top of origin/main
  • no merge commits
  • no Signed-off-by trailer

@bemerybmw
Copy link
Copy Markdown
Contributor

@rudresh-systream Please check the failing tests. Once the CI is green, I'll re-review. Also, please don't resolve reviewer comments. You can write a comment saying that you've addressed the comment / add a thumbs up on the comment but it's up to the reviewer to resolve the comment once they've checked that you addressed it correctly.

Typed ProxyEvent now accesses event samples through EventMetaInfo raw slot storage instead of interpreting provider storage as EventDataStorage<SampleType>.

This keeps typed proxies compatible with GenericSkeleton-created storage while preserving typed skeleton behavior. Proxy event tests and fixtures were updated to emulate production typed storage with a type-erased EventMetaInfo view. Existing proxy event construction tests were aligned with the fixture sample type so EventMetaInfo validation matches the dummy provider storage.
@rudresh-systream rudresh-systream force-pushed the bugfix/311-generic-skeleton-typed-proxy-storage branch from 1a732be to 4bede76 Compare May 28, 2026 08:34
@rudresh-systream
Copy link
Copy Markdown
Contributor Author

@rudresh-systream Please check the failing tests. Once the CI is green, I'll re-review. Also, please don't resolve reviewer comments. You can write a comment saying that you've addressed the comment / add a thumbs up on the comment but it's up to the reviewer to resolve the comment once they've checked that you addressed it correctly.

@bemerybmw
Updated and force-pushed a CI fix.

The failing CI tests were constructing typed ProxyEvent instances with sample types that did not match the dummy skeleton storage created by ProxyMockedMemoryFixture. Since ProxyEvent now validates the EventMetaInfo sample size/alignment during construction, those tests aborted when the fixture-created provider metadata used std::uint32_t but the tests instantiated ProxyEvent<std::uint16_t> / ProxyEvent<std::uint8_t>.

I aligned the two affected construction tests with the fixture sample type:

  • proxy_event_common_test.cpp
  • proxy_event_field_binding_factory_test.cpp

Validation to run locally:

  • bazel test //score/mw/com/impl/bindings/lola:proxy_event_common_test --test_output=all
  • bazel test //score/mw/com/impl/plumbing:proxy_event_field_binding_factory_test --test_output=all
  • bazel test //... --build_tests_only

The branch is squashed back to one commit with no merge commits and no Signed-off-by trailer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants