Description
For historical reasons, `pt.min(x)` returns `pt.neg(pt.max(pt.neg(x)))`. This seems to have been done mostly to avoid having to define a `Min` Op and its gradient. There is a later "uncanonicalize" phase that converts those expressions to `min`, suggesting we prefer them but don't introduce them up front because of the missing gradient.
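
For reference, the identity behind the current rewrite can be checked numerically. This is a minimal NumPy sketch, not PyTensor code; `min_via_max` is a hypothetical helper name, and the sketch assumes floating-point inputs (negation can overflow for unsigned or extreme integer values).

```python
import numpy as np


def min_via_max(x, axis=None):
    """Compute min(x) as -max(-x), mirroring the neg/max/neg pattern."""
    return -np.max(-x, axis=axis)


x = np.array([[3.0, -1.0, 2.0], [0.5, 4.0, -2.5]])

# Both formulations should agree along an axis and over the whole array.
row_min = min_via_max(x, axis=1)
full_min = min_via_max(x)
```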

We should reassess this, as it adds some unwelcome complexity. The `L_op` implementation (cleaned up in #901) works directly for min. `R_op`, on the other hand, uses `Argmax` (not sure why this is needed in the forward but not the backward pass, CC @aseyboldt), so a similar `Min.R_op` may need to use `Argmin`. Like `Min`, `Argmin` is currently implemented as `Argmax` of the negative of x, which is probably fine? We could also consider a direct `Argmin`, but that is not as ubiquitous and hence less annoying.
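
To make the forward/backward distinction concrete, here is a rough NumPy sketch of one possible formulation, not the actual PyTensor `L_op`/`R_op` code. All function names are hypothetical. It assumes floating-point inputs, a mask-based backward pass that sends the gradient to every position equal to the minimum, and a forward pass that picks the tangent at the first minimizing index.

```python
import numpy as np


def argmin_via_argmax(x, axis):
    # Argmin expressed as argmax of the negated input.
    return np.argmax(-x, axis=axis)


def min_vjp(x, g, axis):
    # Backward pass (L_op-like): mask positions equal to the minimum
    # and broadcast the output gradient onto them. No argmin needed.
    m = np.min(x, axis=axis, keepdims=True)
    return (x == m) * np.expand_dims(g, axis)


def min_jvp(x, dx, axis):
    # Forward pass (R_op-like): select the tangent at the argmin position.
    idx = np.expand_dims(argmin_via_argmax(x, axis), axis)
    return np.take_along_axis(dx, idx, axis=axis).squeeze(axis)


x = np.array([[3.0, 1.0, 2.0], [0.0, 5.0, -1.0]])
g = np.array([1.0, 2.0])
dx = np.arange(6.0).reshape(2, 3)

idx = argmin_via_argmax(x, axis=1)
grad = min_vjp(x, g, axis=1)
tangent = min_jvp(x, dx, axis=1)
```

Note that the two sketches treat ties differently: the mask-based backward pass touches all tied positions, while the index-based forward pass uses only the first one.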