-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fall back to SYCL 2020 API for a generic SYCL implementation #1954
Conversation
c0fa332
to
91d07bc
Compare
include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_radix_sort.h
Outdated
Show resolved
Hide resolved
Turning into draft to make the approach more scalable. |
71e5673
to
a0fc677
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The approach looks good to me in principle, but I have not checked thoroughly that all the new definitions match the old ones.
f321c2c
to
865b88a
Compare
@MikeDvorskiy could you also take a look at the PR? The macros can be tricky and prone to subtle issues, so the second reviewer can help a lot. |
# define _ONEDPL_SYCL_REQD_SUB_GROUP_SIZE(SIZE) sycl::reqd_sub_group_size(SIZE) | ||
#else | ||
#elif _ONEDPL_LIBSYCL_VERSION_LESS_THAN(50300) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if both macros are evaluated to zero? (Here and in the others similar places below)
- If it is possible, we don't control the all cases and get an error - undefined
_ONEDPL_SYCL_REQD_SUB_GROUP_SIZE
... - If it is not possible, I don't understand why do we write the second if -
#elif
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course we do not control all the cases, because we cannot predict what an arbitrary SYCL implementation can or cannot do, and what workaround it would require if something is not supported.
The optimistic expectation is that fully conformant SYCL implementations go into the "present" branch with no effort from our side. However if for some implementation that's not true, the case (1) will happen and we will need to analyze the failure and provide a proper workaround, likely in another #elif branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Specifically for _ONEDPL_SYCL_REQD_SUB_GROUP_SIZE
, we can define it as no-op in the #else branch. But this might result in runtime errors for oneDPL kernels that really require a certain WG size, which would be even worse than the guaranteed compilation error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another approach is to switch from "present" macros to "workaround" macros. I think Dmitry tried something of that kind with "absent" macros in a previous revision of the patch. It would be something like this:
#if _ONEDPL_LIBSYCL_REQD_SUB_GROUP_SIZE_WA // == _ONEDPL_LIBSYCL_VERSION_LESS_THAN(50300)
# define _ONEDPL_SYCL_REQD_SUB_GROUP_SIZE(SIZE) intel::reqd_sub_group_size(SIZE)
#else // SYCL 2020 conformance expected
# define _ONEDPL_SYCL_REQD_SUB_GROUP_SIZE(SIZE) sycl::reqd_sub_group_size(SIZE)
#endif
For an arbitrary implementation of SYCL it is no better than the proposed way, because if sycl::reqd_sub_group_size
is not supported, there will also be a compilation error. And while the code appears to be slightly simpler, a downside is that the conformant code comes last, not first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me add more details on top of Alexey's answer.
What if both macros are evaluated to zero... If it is possible
It is possible.
Let's consider such a case: older versions of FOOSYCL, some SYCL library, do not support sycl::reqd_sub_group_size
. There is no workaround, or it was simply forgotten to be added.
#define _ONEDPL_SYCL2020_REQD_SUB_GROUP_SIZE_PRESENT \
(!_ONEDPL_LIBSYCL_VERSION_LESS_THAN(50300) && !_ONEDPL_FOOSYCL_VERSION_LESS_THAN(42))
// Current approach: older versions of FOOSYCL will have undefined macro, which may have unpredictable consequences
#if _ONEDPL_SYCL2020_REQD_SUB_GROUP_SIZE_PRESENT
# define _ONEDPL_SYCL_REQD_SUB_GROUP_SIZE(SIZE) sycl::reqd_sub_group_size(SIZE)
#elif _ONEDPL_LIBSYCL_VERSION_LESS_THAN(50300)
# define _ONEDPL_SYCL_REQD_SUB_GROUP_SIZE(SIZE) intel::reqd_sub_group_size(SIZE)
#endif
// Possible approach: error will be emitted with the older FOOSYCL versions, but it is more verbose
#if _ONEDPL_SYCL2020_REQD_SUB_GROUP_SIZE_PRESENT
# define _ONEDPL_SYCL_REQD_SUB_GROUP_SIZE(SIZE) sycl::reqd_sub_group_size(SIZE)
#elif _ONEDPL_LIBSYCL_VERSION_LESS_THAN(50300)
# define _ONEDPL_SYCL_REQD_SUB_GROUP_SIZE(SIZE) intel::reqd_sub_group_size(SIZE)
#else
# error "No implementation for reqd_sub_group_size"
#endif
Was it your concern, Mikhail? If so, would you prefer "Possible approach" above? It's an edge-case, I am reluctant to do that across the PR due to the verbosity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can define it as no-op in the #else branch
We can also define #pragma error
in the #else branch...
I think compile time error is better then runtime error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if
sycl::reqd_sub_group_size
is not supported, there will also be a compilation error
Ok, this behavior is also acceptable for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# error "No implementation for reqd_sub_group_size"
Yes, this approach is acceptable to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's an edge-case, I am reluctant to do that across the PR due to the verbosity.
Sorry, but it seems I didn't catch what you want to say...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can also define #pragma error in the #else branch...
I think compile time error is better then runtime error.
Done.
It's an edge-case, I am reluctant to do that across the PR due to the verbosity.
It was about proliferation of "#else... #error ..." parts, which I assumed as non-necessary. It is not relevant, I changed my mind: it is better to have these parts to avoid UB in some cases.
Signed-off-by: Dmitriy Sobolev <[email protected]>
Signed-off-by: Dmitriy Sobolev <[email protected]>
…N_LESS_THAN(version)
This reverts commit 4990728.
ae90d1e
to
5a9f7fd
Compare
inline constexpr all_host_view_fn | ||
#else | ||
#elif _ONEDPL_LIBSYCL_VERSION_LESS_THAN(60200) | ||
inline constexpr all_view_fn<sycl::access::mode::read_write, __dpl_sycl::__host_target, | ||
sycl::access::placeholder::false_t> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#else # error ...
?
sycl::property::no_init; | ||
#else | ||
#elif _ONEDPL_LIBSYCL_VERSION_LESS_THAN(50300) | ||
sycl::property::noinit; | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#else #error .....
branch ?
using __sub_group = sycl::sub_group; | ||
#else | ||
#elif _ONEDPL_LIBSYCL_VERSION_LESS_THAN(50700) | ||
using __sub_group = sycl::ONEAPI::sub_group; | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#else #error .....
branch ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
return __buffer.size(); | ||
#else | ||
#elif _ONEDPL_LIBSYCL_VERSION_LESS_THAN(50300) | ||
return __buffer.get_count(); | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#else #error .....
branch ?
Here and in the rest same places below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Let's assume that a generic SYCL library is compliant to SYCL 2020 and fallback to that API.
Below is a specific example of the issue with
sycl::property::no_init
. Other places fixed in the PR are essentially the same.Before:
After:
Meanwhile, the conditions relying on DPC++ extensions remained the same, e.g. FPGA support or ESIMD.
Additional changes:
Some macros were renamed to have SYCL2020 or LIBSYCL in their names to denote their purpose (either SYCL 2020 feature or dependency on DPC++ SYCL implementation/extension).
_ONEDPL_SYCL2020_UNIFIED_USM_BUFFER_PRESENT
was renamed to_ONEDPL_SYCL2020_DEFAULT_ACCESSOR_CONSTRUCTOR_PRESENT
to show the exact SYCL 2020 feature needed to enable code section._ONEDPL_SYCL_L0_EXT_PRESENT
was introduced for getting L0 backend with different SYCL implementations.