
Refactor gradient related methods #182

Open
@ricardoV94

Description


Right now we have grad, L_op, R_op.

Deprecate grad in favor of L_op:

grad is exactly the same as L_op except it doesn't have access to the outputs of the node that is being differentiated.

```python
def L_op(
    self,
    inputs: Sequence[Variable],
    outputs: Sequence[Variable],
    output_grads: Sequence[Variable],
) -> List[Variable]:
    r"""Construct a graph for the L-operator.

    The L-operator computes a row vector times the Jacobian.

    This method dispatches to :meth:`Op.grad` by default. In one sense,
    this method provides the original outputs when they're needed to
    compute the return value, whereas `Op.grad` doesn't.

    See `Op.grad` for a mathematical explanation of the inputs and outputs
    of this method.

    Parameters
    ----------
    inputs
        The inputs of the `Apply` node using this `Op`.
    outputs
        The outputs of the `Apply` node using this `Op`.
    output_grads
        The gradients with respect to each `Variable` in `outputs`.

    """
    return self.grad(inputs, output_grads)
```

L_op allows one to reuse the same output when it's needed in the gradient, which means there is one less node to be merged during compilation. This is mostly relevant for nodes that are costly to merge, such as Scan (see 0f5a06d).
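
As a small illustration (a sketch using the public PyTensor API, assuming the `pytensor.grad` / `pytensor.dprint` frontend; the exact graphs depend on the registered rewrites):

```python
import pytensor
import pytensor.tensor as pt

x = pt.vector("x")
y = pt.exp(x)
g = pytensor.grad(y.sum(), x)

# Before compilation the gradient graph contains its own Exp node
# (because Exp's gradient rebuilds exp(x)), which the merge rewrite
# later has to unify with the forward node.
pytensor.dprint([y, g])

f = pytensor.function([x], [y, g])
pytensor.dprint(f)  # after rewrites, exp(x) should appear only once
```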

It also saves time spent on make_node (e.g., inferring static type shapes). In the Scalar Ops it's used everywhere to quickly check whether the output types are discrete (see fd628c5). Some opportunities are still being missed, though; for example, the gradient of Exp:

```python
def L_op(self, inputs, outputs, gout):
    (x,) = inputs
    (gz,) = gout
    if x.type in complex_types:
        raise NotImplementedError()
    if outputs[0].type in discrete_types:
        if x.type in discrete_types:
            return [x.zeros_like(dtype=config.floatX)]
        else:
            return [x.zeros_like()]
    return (gz * exp(x),)
```

This could instead return `(gz * outputs[0],)`.
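
A sketch of what that change could look like (same method as above, relying on the same module-level names; only the return expression changes to reuse the node's output):

```python
def L_op(self, inputs, outputs, gout):
    (x,) = inputs
    (gz,) = gout
    if x.type in complex_types:
        raise NotImplementedError()
    if outputs[0].type in discrete_types:
        if x.type in discrete_types:
            return [x.zeros_like(dtype=config.floatX)]
        else:
            return [x.zeros_like()]
    # Reuse the already-computed exp(x) from the forward node instead
    # of building a second Exp
    return (gz * outputs[0],)
```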

More importantly for this issue, I think we should deprecate grad completely, since everything can be done equally well with L_op.

Rename L_op and R_op?

The names are pretty unintuitive, and I don't think they are used in any other autodiff libraries. The equivalents in JAX are vjp and jvp (you can find a direct translation in https://www.pymc-labs.io/blog-posts/jax-functions-in-pymc-3-quick-examples/).
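
For reference, a minimal JAX sketch of the correspondence (L_op computes a row vector times the Jacobian, i.e. a vjp; R_op computes the Jacobian times a column vector, i.e. a jvp):

```python
import jax
import jax.numpy as jnp


def f(x):
    return jnp.exp(x)


x = jnp.array([0.0, 1.0, 2.0])
v = jnp.ones_like(x)

# vjp ~ L_op: given a cotangent for the outputs, return v^T J
y, f_vjp = jax.vjp(f, x)
(l_op_result,) = f_vjp(v)

# jvp ~ R_op: given a tangent for the inputs, return J v
y, r_op_result = jax.jvp(f, (x,), (v,))
```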

Other suggestions were discussed some time ago by Theano devs here: https://groups.google.com/g/theano-dev/c/8-z2C59rmQk/m/gm432ifVAg0J?pli=1

Remove R_op in favor of double application of L_op (or make it a default fallback)

There was some fanfare some time ago about R_op being completely redundant in a framework with dead code elimination: Theano/Theano#6035

That thread also suggests that the double L_op may generate more efficient graphs in some cases (because most of our rewrites target the type of graphs generated by L_op?).
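
A sketch of that construction in the same JAX notation (`jvp_via_double_vjp` is a hypothetical helper; in PyTensor the equivalent would be two applications of L_op followed by dead code elimination):

```python
import jax
import jax.numpy as jnp


def f(x):
    return jnp.exp(x)


def jvp_via_double_vjp(f, x, v):
    # First L_op: build the linear map u -> J^T u as a closure over x
    y, f_vjp = jax.vjp(f, x)
    # Second L_op: take the vjp of that linear map; applying it to v
    # yields (J^T)^T v = J v, i.e. R_op applied to the tangent v
    _, g_vjp = jax.vjp(lambda u: f_vjp(u)[0], jnp.zeros_like(y))
    (jv,) = g_vjp(v)
    return jv


x = jnp.array([0.0, 1.0, 2.0])
v = jnp.ones_like(x)
print(jvp_via_double_vjp(f, x, v))
print(jax.jvp(f, (x,), (v,))[1])  # matches
```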

It probably makes sense to retain R_op for cases where we/users know it's the best approach, but perhaps default/fall back to the double L_op otherwise. Stale PRs that never quite made it into Theano:

Theano/Theano#6400
Theano/Theano#6037
