[SYCL][Doc] Updates to the "root group" extension #21838
gmlueck wants to merge 4 commits into intel:sycl
* Change the `use_root_sync` property from a "kernel property" to a "kernel launch property". This is necessary because we want it to be possible to determine at runtime, on a per-launch basis, whether a kernel is launched in the special way that allows root-group synchronization. Kernels are allowed to statically contain a call to `group_barrier` with `root_group` even if they are not launched this way. However, the kernel must only dynamically call `group_barrier` with `root_group` if it is launched in the special way. This behavior is not possible if `use_root_sync` is a "kernel property" because kernel properties are immutable from launch to launch.
* No longer depend on "sycl_ext_oneapi_launch_queries" for the query that tells the maximum number of work-groups when using root-group synchronization. Instead, add a new kernel information descriptor `max_num_work_groups_sync` and new overloads of `kernel::get_info` that provide this information. We decided that the generality of "sycl_ext_oneapi_launch_queries" was overkill.
* Add shortcut functions that allow an application to query `max_num_work_groups_sync` without first getting a kernel bundle. This is similar to the existing shortcuts we already provide via "sycl_ext_oneapi_get_kernel_info". Add these shortcuts both for kernels defined with a type-name and for kernels defined as free-function kernels.
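To make the first bullet concrete, here is a minimal host-only C++ mock of the "statically present, dynamically conditional" barrier rule. It does not use the SYCL API: `MockLaunch`, `mock_root_barrier`, and `mock_kernel` are hypothetical names invented for illustration, and the exception is only a way to make misuse visible in the mock (the extension does not say a real runtime diagnoses it this way).

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a per-launch property set. Not the SYCL API.
struct MockLaunch {
    bool use_root_sync = false;  // chosen at runtime, separately for each launch
};

// Mock of a root-group barrier. The exception is an assumption of this
// sketch, used to flag a barrier reached by a launch that did not opt in.
inline void mock_root_barrier(const MockLaunch& launch) {
    if (!launch.use_root_sync)
        throw std::logic_error("root-group barrier reached without use_root_sync");
}

// The kernel body always *contains* the barrier call, but it only
// *executes* the call when want_sync is true.
inline int mock_kernel(const MockLaunch& launch, bool want_sync) {
    int phases = 1;
    if (want_sync) {
        mock_root_barrier(launch);
        ++phases;
    }
    return phases;
}

// Usage: the same kernel body is launched both ways.
inline void demo_per_launch_choice() {
    MockLaunch plain;   // use_root_sync stays false
    MockLaunch synced;
    synced.use_root_sync = true;
    assert(mock_kernel(plain, false) == 1);  // barrier never reached
    assert(mock_kernel(synced, true) == 2);  // barrier reached, launch opted in
}
```

If `use_root_sync` were fixed per kernel rather than per launch, the `plain` and `synced` launches above could not share one kernel body.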
According to the Level Zero team, launch properties like `cache_config` can also affect the maximum number of work-groups that are allowed when doing root-group synchronization. Therefore, add a `LaunchProperties` parameter to the query, and require the application to pass the list of kernel launch properties.
When an application uses "sycl_ext_oneapi_work_group_scratch_memory" to allocate its dynamic work-group local memory, it can pass the size of that memory more conveniently via `props`. Add wording to clarify that this is allowed. When applications do this, `bytes` is normally zero. In order to make application code less verbose in this case, switch the parameter order so that `bytes` is last. This way, applications can allow it to be defaulted, rather than passing an explicit `0`.
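The effect of moving `bytes` to the end of the parameter list can be sketched with a plain C++ mock. Everything here is an assumption made for illustration: `MockQueryProps`, `mock_max_num_work_groups_sync`, the parameter types, and the resource numbers are invented, and the real query's signature and result depend on the extension and the device.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical launch-property bundle. Not the SYCL API.
struct MockQueryProps {
    std::size_t scratch_bytes = 0;  // stands in for a scratch-memory size property
};

// Mock query with `bytes` last so callers can leave it defaulted.
// The limits below are invented numbers, not real device values.
inline std::size_t mock_max_num_work_groups_sync(std::size_t work_group_size,
                                                 const MockQueryProps& props,
                                                 std::size_t bytes = 0) {
    const std::size_t max_resident_items = 4096;    // invented
    const std::size_t local_mem_total = 64 * 1024;  // invented
    if (work_group_size == 0) return 0;
    const std::size_t by_items = max_resident_items / work_group_size;
    const std::size_t per_group = props.scratch_bytes + bytes;
    if (per_group == 0) return by_items;
    const std::size_t by_mem = local_mem_total / per_group;
    return by_items < by_mem ? by_items : by_mem;
}

// Usage: the local memory size travels in the properties, so `bytes`
// does not need an explicit 0.
inline void demo_defaulted_bytes() {
    MockQueryProps props;
    props.scratch_bytes = 8192;
    assert(mock_max_num_work_groups_sync(256, props) == 8);
    assert(mock_max_num_work_groups_sync(256, MockQueryProps{}) == 16);
}
```

With the old order (`bytes` before the properties), the first call in `demo_defaulted_bytes` would have needed an explicit `0` argument.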
How does the I did a quick repository scan but could not find any piece of logic connecting the two. The adapter API that is called for this query does not seem to take into account any launch parameter, except for the workspace dimensions and the work-group scratch size, if they happen to be passed through these launch properties.
According to @MichalMrozek:
Therefore, I think the SYCL runtime needs to pass more information down to the unified runtime when it calls