Enable building to target AMD GPUs #1731
70b1e1c
to
45f3f28
Compare
Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞
Array API standard conformance tests for dpctl=0.18.0dev0=py310h15de555_104 ran successfully.
Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_180 ran successfully.
7b04098
to
c53fb26
Compare
Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_311 ran successfully.
@ndgrigorian Regrettably use of … Perhaps a preprocessor variable can be used to enable building for SPV/NVPTX or SPV/AMDGCN targets, but not for all three until the bug is fixed. It may also be possible to write our own implementation of log1p to enable building for all three.
Yes, I only added the commit to make it convenient for the build failure to be reproduced. Writing our own implementation is possible, too. I think that would be preferable, but on the other hand, it's a corner case to build for both CUDA and AMD.
c53fb26
to
8449aa8
Compare
Array API standard conformance tests for dpctl=0.19.0dev0=py310hdf72452_164 ran successfully.
8449aa8
to
729f0cf
Compare
Array API standard conformance tests for dpctl=0.19.0dev0=py310hdf72452_202 ran successfully.
729f0cf
to
2e66a2c
Compare
Array API standard conformance tests for dpctl=0.19.0dev0=py310hdf72452_254 ran successfully.
2e66a2c
to
449670e
Compare
Array API standard conformance tests for dpctl=0.19.0dev0=py310hdf72452_261 ran successfully.
I have successfully built dpctl for both CUDA and HIP simultaneously using this branch on a machine with CUDA and ROCm installed (no AMD devices, however; only Intel and NVIDIA). Errors came up while building, but I believe these were caused by running out of memory: errors showed up twice in …
I have marked this PR as ready for review; the CUDA segfault is resolved.
Worth noting that I tried building with both …
Is this the command to use?
Yes, and …
Array API standard conformance tests for dpctl=0.19.0dev0=py310hdf72452_264 ran successfully.
99769d2
to
db487e8
Compare
Array API standard conformance tests for dpctl=0.19.0dev0=py310hdf72452_264 ran successfully.
Add DPCTL_TARGET_AMD cmake variable which is a string referring to the architecture to build for
Build segfault fixed in 2025.0
Selecting AMD devices uses the string "HIP" so this change maintains consistency
db487e8
to
c316380
Compare
Array API standard conformance tests for dpctl=0.19.0dev0=py310hdf72452_278 ran successfully.
Looks good to me! Thank you @ndgrigorian.
It makes sense to similarly allow DPCTL_TARGET_CUDA to take non-boolean values to specify the architecture in the future.
This PR modifies cmake scripts throughout dpctl to enable building for AMD. This is done by either setting the DPCTL_TARGET_HIP environment variable to the intended build architecture, or using -DDPCTL_TARGET_HIP.
The _dpctl_sycl_target_compile_options and _dpctl_sycl_target_link_options cmake lists are used to prevent branching logic in later scripts.
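As a rough illustration of the approach described above, the following is a minimal sketch, not the PR's actual script. It assumes the target triples and backend flags accepted by the DPC++ compiler (-fsycl-targets, -Xsycl-target-backend, --offload-arch). The intermediate variable names _targets and _targets_csv and the target name my_extension are hypothetical. The two list names and the DPCTL_TARGET_HIP and DPCTL_TARGET_CUDA variables come from this PR.

```cmake
# Sketch only: collect SYCL target flags once, so later scripts need no branching.

# Take the architecture from -DDPCTL_TARGET_HIP=<arch>; otherwise fall back to
# the environment variable of the same name.
if(NOT DEFINED DPCTL_TARGET_HIP AND NOT "$ENV{DPCTL_TARGET_HIP}" STREQUAL "")
    set(DPCTL_TARGET_HIP "$ENV{DPCTL_TARGET_HIP}")
endif()

# Build the list of offload targets that were requested (hypothetical helper list).
set(_targets "")
if(DPCTL_TARGET_CUDA)
    list(APPEND _targets "nvptx64-nvidia-cuda")
endif()
if(NOT "${DPCTL_TARGET_HIP}" STREQUAL "")
    list(APPEND _targets "amdgcn-amd-amdhsa")
endif()

# Both lists stay empty for a default (SPIR-V only) build.
set(_dpctl_sycl_target_compile_options "")
set(_dpctl_sycl_target_link_options "")

if(_targets)
    # Keep the default SPIR-V target alongside the extra back ends.
    list(APPEND _targets "spir64")
    list(JOIN _targets "," _targets_csv)
    list(APPEND _dpctl_sycl_target_compile_options "-fsycl-targets=${_targets_csv}")
    list(APPEND _dpctl_sycl_target_link_options "-fsycl-targets=${_targets_csv}")

    if(NOT "${DPCTL_TARGET_HIP}" STREQUAL "")
        # The AMD back end needs an explicit architecture, e.g. gfx90a.
        list(APPEND _dpctl_sycl_target_compile_options
            "-Xsycl-target-backend=amdgcn-amd-amdhsa" "--offload-arch=${DPCTL_TARGET_HIP}")
        list(APPEND _dpctl_sycl_target_link_options
            "-Xsycl-target-backend=amdgcn-amd-amdhsa" "--offload-arch=${DPCTL_TARGET_HIP}")
    endif()
endif()

message(STATUS "compile options: ${_dpctl_sycl_target_compile_options}")
message(STATUS "link options:    ${_dpctl_sycl_target_link_options}")

# Later scripts can then apply the lists unconditionally, for example:
#   target_compile_options(my_extension PRIVATE ${_dpctl_sycl_target_compile_options})
#   target_link_options(my_extension PRIVATE ${_dpctl_sycl_target_link_options})
```

Because the lists are simply empty when no extra target is requested, downstream scripts can expand them on every target without checking which back ends were enabled.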