-
Notifications
You must be signed in to change notification settings - Fork 22
MULTIPLY enable broadcasting #655
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
_DataType_output* result = reinterpret_cast<_DataType_output*>(result_out); \ | ||
\ | ||
std::vector<size_t> result_shape = get_common_shape(input1_shape, input1_shape_ndim, \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe "get_result_shape" is better for this function name? not sure...
input2_shape, input2_shape_ndim); \ | ||
\ | ||
DPNPC_id<_DataType_input1> input1(input1_data, input1_shape, input1_shape_ndim); \ | ||
input1.broadcast(result_shape); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"broadcast_to_shape" func name?
gtests needs to be implemented to test "broadcast" feature in iterator (in separate file. not at the same place as reduce iterator). |
|
||
#include "dpnp_iterator.hpp" | ||
|
||
#define DPNP_LOCAL_QUEUE 1 // TODO need to fix build procedure and remove this workaround. Issue #551 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@samir-nasibli it looks like it requires your attention
using dpnpc_index_t = dpnpc_it_t::size_type; | ||
|
||
template <typename _DataType> | ||
vector<_DataType> get_input_data(const vector<dpnpc_index_t>& shape) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code duplication. It looks like it is better to move this function into some common place for test suite.
} | ||
} | ||
|
||
TEST(TestBroadcastIterator, take_value_broadcast_loop_3D) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not clear why these tests are not with IteratorParameters
variadic tests
struct IteratorParameters | ||
{ | ||
vector<dpnpc_it_t::size_type> input_shape; | ||
vector<dpnpc_it_t::size_type> output_shape; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it defined or standardized somewhere? I mean, output_shape
as an input.
I see, at least, two approaches:
- we keep input internally:
input_shape
andoutput_shape
. Also we internally calculate step distances and other things to do iteration. - we keep input internally:
input_shape
andbroatcast_axis
. Also we internally calculateoutput_shape
, step distances and other things to do iteration.
"N 2" will match existed interface code (to function set_axes
) and, perhaps, required some extra parameter in ctor() like:
DPNPC_id<dpnpc_value_t, BROADCAST> result_obj(input_data.data(), {3, 4});
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know what complexity you are ready to implement and NumPy is required.
as far as I understand, we have two input shapes ("several" in general). Let's take:
shape1={2, 3, 4}
shape2={3, 4}
it looks like we can broadcast shape2
into shape1
by two steps
1. [3, 4] => [1, 3, 4]
2. [1, 3, 4] => [2, 3, 4]
Also, as I understand, shapes [3, 4] and [3, 1, 4, 1] have the same layout in memory.
I think everything depends on input parameters. If we have shapes as input - it is ok. I think having axis
to broadcast is more flexible and scalable solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From another perspective, it is NOT good idea.
Maybe better leave it as is but make function names more clear.
INSTANTIATE_TEST_SUITE_P( | ||
TestBroadcastIterator, | ||
IteratorBroadcasting, | ||
testing::Values( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Do you think it is good to add some combinations like "{1, 0}. {0}"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to add something like this IteratorParameters{{2, 0, 1}, {2, 0, 4}, {}}));
as test parameter, but in this case we have empty result. So SYCL kernel doesn't make sense here, because we iterate by result size that is equal to 0 in this case. In real case we should return from the function before submitting such kernel to the queue. Moreover we can see below issue on GPU when submitting such kernel to the queue
C++ exception with description "Native API failed. Native API returns: -30 (CL_INVALID_VALUE) -30 (CL_INVALID_VALUE)" thrown in the test body.
That is why 2 tests on reduction fail #661 (comment).
input1_it->~DPNPC_id(); \ | ||
input2_it->~DPNPC_id(); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No needed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needed, we are responsible for destructing placed objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are obvious things that you are saying, but a little bit not about that.
The comment was not about freeing resources in general, it is about that we should avoid such explicitly call the destructor.
I see that current iterator implementation has flaws, updates for to the iterator interface are required for this.
In anyway this is not the current PR problem.
} | ||
|
||
input_it->~DPNPC_id(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need to explicitly call the destructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needed for placement new
.
This PR shows current status of broadcasting implementation through USM iterator.
Currently issue below is under investigation: