Add tests to cover scalar handling in `launch()` + Fix fp16 bug #669

leofang · 2025-06-01T04:44:41Z

Description

closes #260.

The new test coverage revealed a bug in the fp16 scalar handling. The existing handling is obviously wrong and I'm the one to blame, but I honestly can't recall why I believed the code was correct 😞

Checklist

New or existing tests cover these changes.
The documentation is up to date with these changes.

copy-pr-bot · 2025-06-01T04:44:45Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

leofang · 2025-06-01T04:48:01Z

cuda_core/cuda/core/experimental/_kernel_arg_handler.pyx

+    elif supported_type is __half_raw:
+        (<supported_type*>ptr).x = <int16_t>(arg.view(numpy_int16))
    else:
        (<supported_type*>ptr)[0] = <supported_type>(arg)


Here is the bug: when `arg` is an `np.float16` scalar, the old code treated it as `int16_t` (there is no standard C++ identifier for fp16 before C++23), so the scalar was `static_cast` to `int16_t`. That triggered a non-trivial conversion operator, which converts the value instead of preserving the bits. The new code ensures that no conversion happens and the bytes are reinterpreted as-is (the equivalent of a `reinterpret_cast`). To hit this new path we need a unique type identifier, which is `__half_raw`.
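To illustrate the difference between a value conversion and a bit reinterpretation, here is a minimal sketch in plain Python. It is not the code from this PR: it uses the standard-library `struct` module (format code `e` is IEEE 754 half precision) instead of NumPy or Cython, and the helper names `fp16_value_cast_to_int16` and `fp16_bits_as_int16` are hypothetical, chosen only for this example.

```python
import struct


def fp16_value_cast_to_int16(value: float) -> int:
    """Value conversion, analogous to a static_cast: the fraction is truncated."""
    return int(value)


def fp16_bits_as_int16(value: float) -> int:
    """Bit reinterpretation: pack as fp16, then read the same 2 bytes as int16."""
    return struct.unpack("<h", struct.pack("<e", value))[0]


x = 1.5

# Old behavior (conceptually): the value is converted, so the payload is lost.
value_cast = fp16_value_cast_to_int16(x)

# New behavior (conceptually): the 16 bits are carried over unchanged.
raw_bits = fp16_bits_as_int16(x)

# Reading the raw bits back as fp16 recovers the original value.
round_trip = struct.unpack("<e", struct.pack("<h", raw_bits))[0]

print(value_cast, hex(raw_bits), round_trip)
```

Under these assumptions, the value cast yields `1`, while the reinterpreted bits are `0x3e00` and round-trip back to `1.5`. The sketch suggests why a kernel would have received the wrong half-precision argument under the old path.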

leofang · 2025-06-01T04:50:01Z

/ok to test f4852bd

leofang · 2025-06-01T06:23:44Z

/ok to test 63a68de

github-actions · 2025-06-02T15:21:48Z

Doc Preview CI
Preview removed because the pull request was closed or merged.

leofang added 2 commits June 1, 2025 04:40

fix fp16 scalar handling

5f2a88a

make linter happy

f4852bd

leofang self-assigned this Jun 1, 2025

leofang added labels Jun 1, 2025: bug (Something isn't working), P0 (High priority - Must do!), test (Improvements or additions to tests), cuda.core (Everything related to the cuda.core module)

leofang added this to the cuda.core beta 4 milestone Jun 1, 2025


leofang requested a review from rwgk June 1, 2025 04:53


I always forget that the dlpack support was buggy before NumPy 2.1.0...

63a68de

kkraus14 approved these changes Jun 2, 2025


kkraus14 merged commit 064b9ea into NVIDIA:main Jun 2, 2025
102 of 103 checks passed
