Enable building to target AMD GPUs #1731

ndgrigorian · 2024-07-13T22:54:51Z

This PR modifies CMake scripts throughout dpctl to enable building for AMD GPUs. To request such a build, either set the DPCTL_TARGET_HIP environment variable to the intended build architecture (for example, gfx1030) or pass -DDPCTL_TARGET_HIP=<arch> to CMake.

The CMake lists _dpctl_sycl_target_compile_options and _dpctl_sycl_target_link_options collect the target-specific flags in one place, so later scripts do not need branching logic.
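To illustrate why collecting options into lists removes the need for branching later, here is a minimal Python model of the idea. It is a sketch, not a copy of dpctl's CMake scripts: the function name sycl_target_options, the target triples, and the exact flags are assumptions based on how DPC++ is commonly invoked for AMD targets.

```python
# Illustrative model only: the function name, target triples, and flags are
# assumptions and are not copied from dpctl's CMake scripts.
def sycl_target_options(hip_arch=None, cuda=False):
    """Return (compile_options, link_options) for the requested targets."""
    targets = ["spir64-unknown-unknown"]  # assumed default SPIR-V target
    backend_flags = []

    if cuda:
        targets.insert(0, "nvptx64-nvidia-cuda")
    if hip_arch:
        targets.insert(0, "amdgcn-amd-amdhsa")
        # AMD targets need an explicit architecture such as gfx1030.
        backend_flags += [
            "-Xsycl-target-backend=amdgcn-amd-amdhsa",
            f"--offload-arch={hip_arch}",
        ]

    compile_options = ["-fsycl-targets=" + ",".join(targets)] + backend_flags
    link_options = list(compile_options)  # assume linking reuses the same flags
    return compile_options, link_options


# All branching happens once, above; consumers only splice the lists in.
compile_opts, link_opts = sycl_target_options(hip_arch="gfx1030")
command = ["icpx", "-fsycl", *compile_opts, "-c", "kernel.cpp"]
print(" ".join(command))
```

Every consumer of the lists treats them as opaque, which is the property the two CMake lists are meant to provide.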

Have you provided a meaningful PR description?
Have you added a test, reproducer or referred to an issue with a reproducer?
Have you tested your changes locally for CPU and GPU devices?
Have you made sure that new changes do not introduce compiler warnings?
Have you checked performance impact of proposed changes?
If this PR is a work in progress, are you opening the PR as a draft?

github-actions · 2024-07-13T23:28:47Z

Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞

coveralls · 2024-07-13T23:35:09Z

coverage: 87.668% (-0.06%) from 87.725% when pulling db487e8 on feature/enable-amd-builds into 691c225 on master.

github-actions · 2024-07-14T19:24:58Z

Array API standard conformance tests for dpctl=0.18.0dev0=py310h15de555_104 ran successfully.
Passed: 894
Failed: 15
Skipped: 105

github-actions · 2024-07-29T15:03:19Z

Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_180 ran successfully.
Passed: 893
Failed: 2
Skipped: 119

github-actions · 2024-08-15T18:58:38Z

Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_311 ran successfully.
Passed: 894
Failed: 1
Skipped: 119

oleksandr-pavlyk · 2024-08-17T18:42:51Z

@ndgrigorian Regrettably, the sycl::log1p change needed to enable compiling for AMD breaks compiling for CUDA.

Perhaps a preprocessor variable can be used to enable building for SPV/NVPTX or SPV/AMDGCN targets, but not for all three until the bug gets fixed. It may also be possible to write an implementation of log1p that enables building for all three.

ndgrigorian · 2024-08-17T21:43:36Z

> @ndgrigorian Regrettably, the sycl::log1p change needed to enable compiling for AMD breaks compiling for CUDA.
>
> Perhaps a preprocessor variable can be used to enable building for SPV/NVPTX or SPV/AMDGCN targets, but not for all three until the bug gets fixed. It may also be possible to write an implementation of log1p that enables building for all three.

Yes, I only added the commit to make it convenient for the build failure to be reproduced.

Writing our own implementation is possible, too. I think that would be preferable, but on the other hand, it's a corner case to build for both CUDA and AMD.
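For reference, one well-known way to write log1p without relying on a library log1p is the rounding-compensation trick sketched below. This is only an illustration of the arithmetic: it is written in Python rather than C++/SYCL, the name log1p_fallback is hypothetical, and nothing here is taken from dpctl's math_utils.hpp.

```python
import math


def log1p_fallback(x: float) -> float:
    """Approximate log(1 + x) for finite x > -1 without a library log1p."""
    u = 1.0 + x
    if u == 1.0:
        # x is so small that 1 + x rounds to 1, where log(1 + x) ~= x.
        return x
    # math.log(u) is accurate for the rounded value u; the factor
    # x / (u - 1) compensates for the rounding error made when forming u.
    return math.log(u) * (x / (u - 1.0))


# Compare against the naive formula, which loses digits for small x.
x = 1e-10
print(log1p_fallback(x), math.log(1.0 + x), math.log1p(x))
```

A device-side version would need the same two branches, plus whatever handling of infinities and NaNs the callers expect.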

github-actions · 2024-10-24T20:46:49Z

Array API standard conformance tests for dpctl=0.19.0dev0=py310hdf72452_164 ran successfully.
Passed: 894
Failed: 1
Skipped: 119

github-actions · 2024-11-11T22:21:27Z

Array API standard conformance tests for dpctl=0.19.0dev0=py310hdf72452_202 ran successfully.
Passed: 894
Failed: 1
Skipped: 119

github-actions · 2024-11-20T23:12:56Z

Array API standard conformance tests for dpctl=0.19.0dev0=py310hdf72452_254 ran successfully.
Passed: 895
Failed: 0
Skipped: 119

github-actions · 2024-11-21T03:35:26Z

Array API standard conformance tests for dpctl=0.19.0dev0=py310hdf72452_261 ran successfully.
Passed: 894
Failed: 1
Skipped: 119

ndgrigorian · 2024-11-21T04:20:14Z

@oleksandr-pavlyk @antonwolfy

I have successfully built dpctl for both CUDA and HIP simultaneously using this branch on a machine with CUDA and ROCm installed (no AMD devices, however; only Intel and NVIDIA).

Errors came up while building, but I believe that these were caused by running out of memory: errors showed up twice in prod.cpp, once in sum.cpp, and once in copy_and_cast_usm_to_usm.cpp. prod, sum, etc. seemed to work on both level-zero and CUDA without a problem, and tests passed. Even in verbose mode, the error message wasn't especially helpful.

I have marked this PR as ready for review; the CUDA segfault is resolved.

ndgrigorian · 2024-11-21T04:28:31Z

> @oleksandr-pavlyk @antonwolfy
>
> I have successfully built dpctl for both CUDA and HIP simultaneously using this branch on a machine with CUDA and ROCm installed (no AMD devices, however; only Intel and NVIDIA).
>
> Errors came up while building, but I believe that these were caused by running out of memory: errors showed up twice in prod.cpp, once in sum.cpp, and once in copy_and_cast_usm_to_usm.cpp. prod, sum, etc. seemed to work on both level-zero and CUDA without a problem, and tests passed. Even in verbose mode, the error message wasn't especially helpful.
>
> I have marked this PR as ready for review; the CUDA segfault is resolved.

Worth noting that I tried building with both DPCTL_TARGET_HIP=gfx1100 and DPCTL_TARGET_HIP=gfx1030.

oleksandr-pavlyk · 2024-11-21T16:35:11Z

Is this the command to use?

python scripts/build_locally.py --verbose --cmake-opts="-DDPCTL_TARGET_HIP=gfx1030"

ndgrigorian · 2024-11-21T17:33:37Z

> Is this the command to use?
>
> python scripts/build_locally.py --verbose --cmake-opts="-DDPCTL_TARGET_HIP=gfx1030"

Yes, and DPCTL_TARGET_HIP=gfx1030 python scripts/build_locally.py --verbose should work too.
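The two invocations differ only in where the architecture is supplied. The sketch below spells that out as argument lists and environments that could be handed to subprocess.run; the helper name hip_build_invocations is hypothetical, and the script path and options are the ones quoted in this thread.

```python
import os


def hip_build_invocations(arch):
    """Return ((argv, env), (argv, env)) for the two ways to pick a HIP target."""
    base = ["python", "scripts/build_locally.py", "--verbose"]
    # Option 1: pass the architecture to CMake through --cmake-opts.
    via_cmake = (base + [f"--cmake-opts=-DDPCTL_TARGET_HIP={arch}"], dict(os.environ))
    # Option 2: leave the command line alone and set the environment variable.
    via_env = (list(base), {**os.environ, "DPCTL_TARGET_HIP": arch})
    return via_cmake, via_env


(cmake_argv, cmake_env), (env_argv, env_env) = hip_build_invocations("gfx1030")
print(" ".join(cmake_argv))
print("DPCTL_TARGET_HIP=" + env_env["DPCTL_TARGET_HIP"], " ".join(env_argv))
# To actually build, from the dpctl source tree:
#   import subprocess; subprocess.run(cmake_argv, env=cmake_env, check=True)
```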

github-actions · 2024-11-21T23:59:11Z

Array API standard conformance tests for dpctl=0.19.0dev0=py310hdf72452_264 ran successfully.
Passed: 894
Failed: 1
Skipped: 119

github-actions · 2024-11-22T00:48:20Z

Array API standard conformance tests for dpctl=0.19.0dev0=py310hdf72452_264 ran successfully.
Passed: 894
Failed: 1
Skipped: 119

github-actions · 2024-11-22T19:57:04Z

Array API standard conformance tests for dpctl=0.19.0dev0=py310hdf72452_278 ran successfully.
Passed: 894
Failed: 1
Skipped: 119

oleksandr-pavlyk

Looks good to me! Thank you @ndgrigorian .

It makes sense to similarly allow DPCTL_TARGET_CUDA to take non-boolean values to specify the architecture in the future.

ndgrigorian requested a review from oleksandr-pavlyk July 13, 2024 22:54

ndgrigorian force-pushed the feature/enable-amd-builds branch from 70b1e1c to 45f3f28 Compare July 13, 2024 22:59

oleksandr-pavlyk changed the title ~~Enable building for AMD~~ Enable building to target AMD GPUs Aug 7, 2024

ndgrigorian force-pushed the feature/enable-amd-builds branch from 7b04098 to c53fb26 Compare August 15, 2024 18:16

ndgrigorian force-pushed the feature/enable-amd-builds branch from c53fb26 to 8449aa8 Compare October 24, 2024 20:05

ndgrigorian force-pushed the feature/enable-amd-builds branch from 8449aa8 to 729f0cf Compare November 11, 2024 21:40

ndgrigorian force-pushed the feature/enable-amd-builds branch from 729f0cf to 2e66a2c Compare November 20, 2024 22:29

ndgrigorian marked this pull request as ready for review November 21, 2024 01:23

ndgrigorian force-pushed the feature/enable-amd-builds branch from 2e66a2c to 449670e Compare November 21, 2024 02:51

ndgrigorian requested a review from antonwolfy November 21, 2024 04:20

ndgrigorian force-pushed the feature/enable-amd-builds branch from 99769d2 to db487e8 Compare November 22, 2024 00:04

ndgrigorian added 2 commits November 22, 2024 11:11

Enable CMake options for building for AMD

ab7cf09

Add DPCTL_TARGET_AMD cmake variable which is a string referring to the architecture to build for

Refactor cmake for setting AMD targets and fix incorrect logic

9fa6054

ndgrigorian added 4 commits November 22, 2024 11:11

Remove use of std::log1p in math_utils.hpp

42f8ace

Build segfault fixed in 2025.0

Change DPCTL_TARGET_AMD to DPCTL_TARGET_HIP

a559d10

Selecting AMD devices uses the string "HIP" so this change maintains consistency

Implement HIP backend

ac6cc7b

Add documentation on building for AMD devices to dpctl installation guide

c316380

ndgrigorian force-pushed the feature/enable-amd-builds branch from db487e8 to c316380 Compare November 22, 2024 19:12

oleksandr-pavlyk approved these changes Nov 24, 2024

ndgrigorian merged commit eefc82b into master Nov 24, 2024
52 of 54 checks passed

ndgrigorian deleted the feature/enable-amd-builds branch November 24, 2024 20:01
