[NFC][DO NOT MERGE] Add scratch design for sycl-non-rdc- mode in NewOffloadModel #21858

Draft
maksimsab wants to merge 1 commit into intel:sycl from maksimsab:sycl-non-rdc-design

Conversation

@maksimsab
Contributor

No description provided.

Comment thread NoRDCNewOffloadModel.md

The new SYCL offload model (`--offload-new-driver`) currently treats all device code as RDC (Relocatable Device Code): all TUs are merged in `clang-linker-wrapper` before post-link. Support for `-fno-sycl-rdc` is a high-priority gap (P0): the flag enables per-TU self-contained device images, which matter for reducing link time, shipping precompiled device libraries, and scenarios where `SYCL_EXTERNAL` is not used.

The old model implements this via `shouldDoPerObjectFileLinking()` + per-TU device link steps during the compile phase in the driver. This plan implements the **linker-wrapper-side approach** as it is simpler, consistent with the new model's architecture, and can be done incrementally.
Contributor

The driver changes are not covered by this proposal, but based on the provided information I conclude that you propose having clang-linker-wrapper perform offload code linking during the "link step". I might be wrong, but I think one of the requirements for -fno-gpu-rdc is to perform offload code linking at the compile step. Please check whether there is such a requirement.

Anyway, this is a big step away from the current SYCL workflow and from the upstream driver workflow for non-SYCL mode. -fno-gpu-rdc always links the offload executable at the compile step, not at the link step.

Comment thread NoRDCNewOffloadModel.md

**Implementation note:**
* Refactor the existing codegen loop (per-split SPIR-V/AOT) into a helper lambda/function so it can be reused for both RDC and non-RDC paths without duplication.
* clang-linker-wrapper already contains a suitable command-line flag from upstream, `--relocatable`, which should be reused. This means the Clang driver should start passing `--relocatable` for the existing RDC mode. Later, the absence of the `--relocatable` flag will imply non-RDC SYCL mode.
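
As a rough illustration of the first bullet, the sketch below shows one way a single codegen helper could serve both paths: the RDC path hands it one group containing every TU, while the non-RDC path hands it one group per TU. Every name here (`DeviceInput`, `DeviceImage`, `runCodegen`, `groupInputs`, `buildImages`) is hypothetical and not taken from clang-linker-wrapper, strings stand in for real device modules, and the mode is a plain `bool` rather than any particular command-line flag.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins; none of these names exist in clang-linker-wrapper.
using DeviceInput = std::string; // device code coming from one TU
using DeviceImage = std::string; // one finished device image

// Shared helper: turns one group of inputs into one image.
// In the real tool this is where the per-split SPIR-V/AOT loop would live.
DeviceImage runCodegen(const std::vector<DeviceInput> &Group) {
  std::string Image = "image(";
  for (std::size_t I = 0; I < Group.size(); ++I) {
    if (I != 0)
      Image += "+";
    Image += Group[I];
  }
  return Image + ")";
}

// RDC: all TUs form a single group. Non-RDC: each TU is its own group.
std::vector<std::vector<DeviceInput>>
groupInputs(const std::vector<DeviceInput> &TUs, bool RDC) {
  std::vector<std::vector<DeviceInput>> Groups;
  if (RDC) {
    if (!TUs.empty())
      Groups.push_back(TUs);
    return Groups;
  }
  for (const DeviceInput &TU : TUs)
    Groups.push_back({TU});
  return Groups;
}

// Both modes go through the same helper; only the grouping differs.
std::vector<DeviceImage> buildImages(const std::vector<DeviceInput> &TUs,
                                     bool RDC) {
  std::vector<DeviceImage> Images;
  for (const std::vector<DeviceInput> &Group : groupInputs(TUs, RDC))
    Images.push_back(runCodegen(Group));
  return Images;
}
```

With this shape, the mode decision is confined to `groupInputs`, so whichever signal ends up selecting non-RDC mode only has to feed that one boolean.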
Contributor

  1. --relocatable is not about RDC vs non-RDC compilation at all. It is about whether the output of the link needs its sections renamed for safe re-linking, which is an OpenMP static-library concern.
  2. When --relocatable is present, linkAndWrapDeviceFiles still runs (all TUs merged, the current RDC behavior for SYCL). The only change is the section-renaming post-step. It cannot be used as a signal to switch to per-TU processing inside the wrapper.
