[quant] Add IR related recommendations for Quantizer tutorial #2608
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/tutorials/2608
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 970227a with merge base 4e25f97.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Summary: att
Test Plan: CI
Some editorial nits.
Can you please fix the pyspelling job?
Can someone please take a look? @ezyang @andrewor14 @SherlockNoMad @kimishpatel
Looks good. Just a few questions about the ordering and how this relates to our existing XNNPACKQuantizer.
prototype_source/pt2e_quantizer.rst
Outdated
conv = torch.nn.functional.conv2d(x, weight, bias)
relu = torch.nn.functional.relu(conv)
# returns an additional dict that includes a map from name to node that we want to annotate
return relu, {"conv": conv, "relu": relu, "x": x, "weight": weight, "bias": bias}
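For context, here is a rough, illustrative sketch of how a pattern function like the one above could be consumed by SubgraphMatcherWithNameNodeMap (option 3 in the tutorial). The capture API (capture_pre_autograd_graph), the wrapper module, the example shapes, and the name_node_map field on the match result are assumptions based on this discussion, not a definitive interface:

# Illustrative sketch only; exact capture API and match-result fields may differ.
import torch
from torch._export import capture_pre_autograd_graph
from torch.fx.passes.utils.matcher_with_name_node_map_utils import (
    SubgraphMatcherWithNameNodeMap,
)

class ConvReluPattern(torch.nn.Module):
    def forward(self, x, weight, bias):
        conv = torch.nn.functional.conv2d(x, weight, bias)
        relu = torch.nn.functional.relu(conv)
        # extra dict mapping names to the nodes we want to annotate later
        return relu, {"conv": conv, "relu": relu, "x": x,
                      "weight": weight, "bias": bias}

example_inputs = (torch.randn(1, 3, 8, 8),   # x
                  torch.randn(4, 3, 3, 3),   # weight
                  torch.randn(4))            # bias
# capture the pattern with the same program capture used for the model
pattern_gm = capture_pre_autograd_graph(ConvReluPattern(), example_inputs)

matcher = SubgraphMatcherWithNameNodeMap(pattern_gm)
for match in matcher.match(model_gm.graph):  # model_gm: the captured model (assumed)
    conv_node = match.name_node_map["conv"]
    relu_node = match.name_node_map["relu"]
    # attach quantization annotations to conv_node / relu_node here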
Did we resolve the weight issue? Can we return an nn.Parameter here?
Not yet, nn.Parameter is not supported. Also, I haven't actually tested whether this works for linear modules yet.
prototype_source/pt2e_quantizer.rst
Outdated
# annotate the nodes
inputs, output = _find_input_and_output(match)
inputs[0].users[0].meta["quantization_annotation"] = ...
inputs[1].users[0].meta["quantization_annotation"] = ...
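For illustration, a hedged sketch of what the elided annotations above might contain, assuming the QuantizationSpec and QuantizationAnnotation classes from torch.ao.quantization.quantizer; the helper _find_input_and_output comes from the snippet above, and the exact spec fields a backend needs will differ:

# Hedged sketch; observer choice and quant ranges are placeholders.
import torch
from torch.ao.quantization.observer import HistogramObserver
from torch.ao.quantization.quantizer import (
    QuantizationAnnotation,
    QuantizationSpec,
)

act_qspec = QuantizationSpec(
    dtype=torch.int8,
    quant_min=-128,
    quant_max=127,
    qscheme=torch.per_tensor_affine,
    observer_or_fake_quant_ctr=HistogramObserver,
)

conv_node = inputs[0].users[0]  # the node selected in the snippet above
conv_node.meta["quantization_annotation"] = QuantizationAnnotation(
    input_qspec_map={inputs[0]: act_qspec},  # quantize the activation input
    output_qspec=act_qspec,                  # quantize the output
    _annotated=True,
)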
Actually, from reading the example, (2) doesn't seem super user friendly. Should we put (3) SubgraphMatcherWithNameNodeMap at the top as the recommended way? I feel it'll handle 90% of the patterns we want to match. For something like LSTM or MHA, maybe we need something more advanced like (1) and (2).
Yeah, I was thinking of having (3) as the recommended one, but there is actually a caveat: we are assuming some ops are not traced into multiple ops (e.g. F.linear, F.conv2d, etc.).
@svekars I'm not sure how to fix these actually: https://github.com/pytorch/tutorials/actions/runs/6539478971/job/17757612269?pr=2608
prototype_source/pt2e_quantizer.rst
Outdated
2. Using ``SubgraphMatcher``
--------------------------------
Because of this, we recommend people to recognize the pattern through ``SubgraphMatcher``, through capturing a ``torch`` IR pattern (with the same program capture used for capturing the floating point model), instead of using the ``aten`` IR pattern directly.
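As a rough illustration of the quoted approach (not the tutorial's exact code), a minimal sketch using SubgraphMatcher from torch.fx.passes.utils.matcher_utils; capture_pre_autograd_graph as the program capture, the example shapes, and a previously captured model model_gm are assumptions:

# Illustrative sketch of option (2); constructor arguments and match-result
# fields may differ from the shipped SubgraphMatcher interface.
import torch
from torch._export import capture_pre_autograd_graph
from torch.fx.passes.utils.matcher_utils import SubgraphMatcher

class ConvRelu(torch.nn.Module):
    def forward(self, x, weight, bias):
        return torch.nn.functional.relu(
            torch.nn.functional.conv2d(x, weight, bias))

example_inputs = (torch.randn(1, 3, 8, 8), torch.randn(4, 3, 3, 3), torch.randn(4))
# capture the pattern with the same program capture used for the model, so the
# pattern is expressed in the same IR as the captured floating point model
pattern_gm = capture_pre_autograd_graph(ConvRelu(), example_inputs)

matcher = SubgraphMatcher(pattern_gm.graph)
for internal_match in matcher.match(model_gm.graph):
    # internal_match.nodes_map maps pattern nodes to model nodes;
    # pick the matched nodes out of it and annotate them
    ...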
Who are the "people" here?
We should really just suggest alternatives, rather than a recommendation. When we say that quantization targets pre-dispatch aten IR, that is our contract. So the fact that, in some cases, pre-dispatch aten IR is not the same as torch IR is not really a quantization workflow problem but part of the export. That is the reason I feel we should really provide "examples of pattern matching". A recommendation feels too strong, given all the drawbacks listed below.
"people" means backend developers.
The motivation for giving a recommendation is introduced in L338, basically trying to make sure the pattern matching is robust against pytorch code changes. If we don't give the recommendation I think modeling users will still expect the same python model + same quantization code + same quantizer will give the same quantized model when they update their pytorch version, which may not be true if quantizer writers are not cautious about this
That's fair. My main concern is with "recommendation". I would have presented alternatives with pros and cons.
prototype_source/pt2e_quantizer.rst
Outdated
2. Using ``SubgraphMatcher``
--------------------------------
Because of this, we recommend people to recognize the pattern through ``SubgraphMatcher``, through capturing a ``torch`` IR pattern (with the same program capture used for capturing the floating point model), instead of using the ``aten`` IR pattern directly.
Also, is torch IR defined anywhere?
Probably not officially defined; L307 gives some examples.
prototype_source/pt2e_quantizer.rst
Outdated
If people are uncertain if some functional operator is going to be traced into multiple aten operators then they can pick option 2.

We would not recommend option 1. since that's the least stable in all three options.
Again, this is a very, very strong statement that does not hold true in practice. If anything, in practice 1 is a better option because torch IR for the most part matches pre-dispatch aten IR.
I don't understand why option 1 would be better than torch IR; what is the criteria here? Given the motivation in L338, I think option 1 is the least stable of all 3.
I understand the stability argument. My point is that in practice torch IR is very close to aten IR. And if you don't recommend it, I would just remove option 1 and have this doc be "how to pattern match for quantizers".
OK, I'll present this in a different way; this is also a motivation for why we want to have other ways of matching.
Left some comments.
Modulo disagreements with @kimishpatel, the doc seems fine, but you should workshop it with one of the potential users and check if the APIs actually are sufficient for them in practice. User feedback matters.
lol. Please do tell me where you disagree though
I think Ed meant the disagreement between you and me on what options to recommend. I plan to restructure this and use option 1 as motivation and remove option 2; please let me know if there are other concerns.
With this, the ``Quantizer`` will still be valid even when the implementation for nn modules and functionals changes, the ``aten`` IR for floating point model will change, but since we capture the pattern again instead of hardcoding the ``aten`` IR for the pattern, we'll get the updated ``aten`` IR as well and will still be able to match the pattern.

One caveat is that if inputs of the pattern has multiple users, we don't have a good way to identify which user node we want to annotate except for checking the aten op target.
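A small sketch of the workaround mentioned in the quoted caveat: when a matched input has multiple users, fall back to checking the aten op target to decide which user to annotate. The input_node variable and the specific conv target are illustrative assumptions, not the tutorial's code:

# Hedged sketch: pick the user of a matched input by its aten op target when
# the input has more than one user.
import torch

conv_users = [
    n for n in input_node.users
    if n.target == torch.ops.aten.conv2d.default
]
# only annotate when the target-based lookup is unambiguous
if len(conv_users) == 1:
    conv_node = conv_users[0]
    conv_node.meta["quantization_annotation"] = ...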
Not sure if I understand this issue clearly:
- One caveat is that if inputs of the pattern has multiple users,
Are we talking about the case where an input has been used by multiple different patterns, or the case where an input has been used by multiple nodes inside one pattern?
Yes, that's correct.