enable AMD GPU #406

Merged
merged 5 commits into from
Apr 6, 2023

vickytsang
Contributor

No description provided.

Signed-off-by: Vicky Tsang <vtsang@amd.com>
@vickytsang vickytsang changed the title from "[WIP] enable AMD GPU" to "enable AMD GPU" Mar 15, 2023
@MMelQin MMelQin self-requested a review March 17, 2023 01:07
@MMelQin
Collaborator

MMelQin commented Mar 17, 2023

Thank you @vickytsang for the pull request.

Do all AMD GPU device names contain the word "AMD"? I have no access to an AMD GPU, so it would be great if you could provide a reference. For the same reason, I cannot test a built package targeting an AMD GPU.

I have also left comments in the code, mostly on error handling.

@vickytsang
Contributor Author

> Thank you @vickytsang for the pull request.
>
> Do all AMD GPU device names contain the word "AMD"? I have no access to an AMD GPU, so it would be great if you could provide a reference. For the same reason, I cannot test a built package targeting an AMD GPU.
>
> I have also left comments in the code, mostly on error handling.

I've modified the implementation to use the rocminfo tool to identify the AMD target device. Below is an example of its output on an AMD GPU/ROCm-enabled system. rocminfo returns an error if the AMD GPU device driver is not loaded, and the shell reports "command not found" if the tool is missing.
Please advise on the preferred way to modify the associated unit tests.

=====================
/opt/rocm/bin/rocminfo
ROCk module is loaded
Able to open /dev/kfd read-write

HSA System Attributes

Runtime Version: 1.1
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE

==========
HSA Agents


Agent 1


Name: AMD Ryzen 9 5950X 16-Core Processor
Uuid: CPU-XX
Marketing Name: AMD Ryzen 9 5950X 16-Core Processor
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
L1: 32768(0x8000) KB
Chip ID: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 3400
BDFID: 0
Internal Node ID: 0
Compute Unit: 32
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
WatchPts on Addr. Ranges:1
Features: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
Size: 131896996(0x7dc96a4) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 2
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 131896996(0x7dc96a4) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
ISA Info:
N/A


Agent 2


Name: gfx1030
Uuid: GPU-XX
Marketing Name: Device 73bf
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 4096(0x1000)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
L1: 16(0x10) KB
Chip ID: 29631(0x73bf)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 2660
BDFID: 12544
Internal Node ID: 1
Compute Unit: 80
SIMDs per CU: 4
Shader Engines: 8
Shader Arrs. per Eng.: 2
WatchPts on Addr. Ranges:4
Features: KERNEL_DISPATCH
Fast F16 Operation: FALSE
Wavefront Size: 32(0x20)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 64(0x40)
Max Work-item Per CU: 2048(0x800)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 16760832(0xffc000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx1030
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
*** Done ***
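
As an illustration of the approach described above, here is a minimal sketch of how output like this could be parsed to find GPU agents. It is not the code from this PR: the function names `parse_gpu_agents` and `detect_amd_gpus`, the injectable `run` parameter, and the shortened `SAMPLE` text are all hypothetical, and the sketch assumes that rocminfo exits with a non-zero status when the driver is not loaded.

```python
import subprocess
from typing import Callable, List, Optional


def parse_gpu_agents(rocminfo_output: str) -> List[str]:
    """Return the first `Name:` value of every agent whose `Device Type:` is GPU."""
    gpu_names: List[str] = []
    current_name: Optional[str] = None
    for raw_line in rocminfo_output.splitlines():
        line = raw_line.strip()
        if line.startswith("Agent "):
            # A new agent block starts; forget the previous agent's name.
            current_name = None
        elif line.startswith("Name:") and current_name is None:
            # Only the first `Name:` in a block is the agent name; later ones
            # (for example under `ISA Info:`) are ignored.
            current_name = line.split(":", 1)[1].strip()
        elif line.startswith("Device Type:"):
            if line.split(":", 1)[1].strip() == "GPU" and current_name:
                gpu_names.append(current_name)
    return gpu_names


def detect_amd_gpus(
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
    executable: str = "rocminfo",
) -> List[str]:
    """Run rocminfo and return GPU agent names, or [] if it is missing or fails."""
    try:
        result = run([executable], capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        # OSError covers a missing executable (FileNotFoundError);
        # CalledProcessError covers a non-zero exit status (assumed to be
        # what happens when the driver is not loaded).
        return []
    return parse_gpu_agents(result.stdout)


# Hypothetical, shortened sample modeled on the output above.
SAMPLE = """\
Agent 1
  Name: AMD Ryzen 9 5950X 16-Core Processor
  Device Type: CPU
Agent 2
  Name: gfx1030
  Marketing Name: Device 73bf
  Device Type: GPU
  ISA Info:
    ISA 1
      Name: amdgcn-amd-amdhsa--gfx1030
"""

print(parse_gpu_agents(SAMPLE))  # ['gfx1030']
```

On the unit-test question, one possible design is to keep the parsing separate from the subprocess call, as sketched here, so that tests can feed canned text to `parse_gpu_agents` and pass a stub for `run` to cover the "tool missing" and "driver not loaded" paths without needing an AMD GPU.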

Signed-off-by: Vicky Tsang <vtsang@amd.com>
@sonarqubecloud

Kudos, SonarCloud Quality Gate passed!

  • 0 Bugs (rating A)
  • 0 Vulnerabilities (rating A)
  • 0 Security Hotspots (rating A)
  • 0 Code Smells (rating A)
  • No coverage information
  • No duplication information

Collaborator

@MMelQin MMelQin left a comment

Comments have been addressed.

@MMelQin
Collaborator

MMelQin commented Apr 6, 2023

We'll merge this PR as experimental support, since:

  • it only affects which additional base images the packager can support;
  • the App SDK itself has no explicit dependency on CUDA or GPU devices; rather, it is the (custom) operators and applications that may, so the user must ensure the app works with ROCm and an AMD GPU before packaging;
  • the App SDK Packager has separately been going through major changes and will come from an underlying dependency package in the next release of the App SDK.

@MMelQin MMelQin merged commit f62199b into Project-MONAI:main Apr 6, 2023