Implement kernels for in-place pow, remainder, and bitwise operators (#1447)
* Implements dedicated __ipow__ kernel
* Implements in-place remainder
* Implements in-place bitwise_and and bitwise_or
* Implements in-place bitwise_xor
* Implements in-place bitwise_left_shift and bitwise_right_shift
* Adds tests for in-place bitwise elementwise funcs
* Added tests for in-place remainder and pow
Fixed in-place remainder for devices that do not support 64-bit floating point data types
* Test commit splitting up elementwise functions
* Added missing includes of common_inplace
* Split elementwise functions into two more files and added them to the build
* Fix more missing includes
* Splits elementwise functions into separate source files
* Corrected numbers of elementwise functions
* Added missing vector include to elementwise function source files
Removed utility include
* Remove variable name in function declaration
* No need to import init functions into namespace, since they are defined in it
Removed `using dpctl::tensor::py_internal::init_abs`, since this imports `init_abs`
into the current namespace from `dpctl::tensor::py_internal`, but that namespace is
already the current namespace, so the import is a no-op.
Also added brief docstring for the common init module.
* Changed use of "static inline" for utility functions
Instead, moved common functions into an anonymous namespace and declared them inline,
which is the C++ way of expressing that multiple definitions of the same
function may exist in different C++ translation units, which the linker
unifies.
* Moved inline functions into separate translation units
Instead of using the inline keyword to allow multiple definitions of the same function
in different translation units, introduced elementwise_functions_type_utils.cpp
that defines these functions and a header file to use in other translation units.
This should reduce the binary size of the produced object files and simplify the
linker's job, reducing link time.
* Added license header for 2 new files
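The move from inline helpers to a dedicated translation unit, described above, can be sketched roughly as follows. This is only an illustration, not the code from this change: the namespace `sketch`, the function `lookup_output_type_id`, and its parameters are hypothetical, and the header name in the comments is an assumption (the message names only elementwise_functions_type_utils.cpp). Both parts are shown in one listing so that it compiles on its own.

```cpp
#include <cstddef>

// ---- Part 1: would live in a header (file name is an assumption) ----
// Only the declaration is visible to the translation units that include
// the header, so none of them emits its own copy of the function body.
namespace sketch
{
int lookup_output_type_id(int arg_type_id,
                          const int *output_ids,
                          std::size_t num_types);
} // namespace sketch

// ---- Part 2: would live in elementwise_functions_type_utils.cpp ----
// The single out-of-line definition. It is compiled once, and every
// caller links against this one symbol.
namespace sketch
{
int lookup_output_type_id(int arg_type_id,
                          const int *output_ids,
                          std::size_t num_types)
{
    // Hypothetical behavior: report -1 for an id outside the table.
    if (arg_type_id < 0 ||
        static_cast<std::size_t>(arg_type_id) >= num_types)
    {
        return -1;
    }
    return output_ids[arg_type_id];
}
} // namespace sketch
```

With this layout, the source file of each elementwise function includes only the declaration, so the function body is compiled once rather than once per including file.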
---------
Co-authored-by: Oleksandr Pavlyk <oleksandr.pavlyk@intel.com>