
[quant] Add IR related recommendations for Quantizer tutorial #2608


Merged (5 commits, Oct 23, 2023)

Conversation

jerryzh168
Contributor

Summary:
att

Test Plan:
CI

Reviewers:

Subscribers:

Tasks:

Tags:

Fixes #ISSUE_NUMBER

Description

Checklist

  • The issue that is being fixed is referenced in the description (see "Fixes #ISSUE_NUMBER" above)
  • Only one issue is addressed in this pull request
  • Labels from the issue that this PR is fixing are added to this pull request
  • No unnecessary issues are included in this pull request

@pytorch-bot

pytorch-bot bot commented Oct 16, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/tutorials/2608

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 970227a with merge base 4e25f97:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Contributor

@svekars svekars left a comment


Some editorial nits.

@svekars
Contributor

svekars commented Oct 18, 2023

Can you please fix the pyspelling job?

@svekars
Contributor

svekars commented Oct 18, 2023

Can someone please take a look? @ezyang @andrewor14 @SherlockNoMad @kimishpatel

Contributor

@andrewor14 andrewor14 left a comment

Looks good. Just a few questions about the ordering / how this relates to our existing XNNPACKQuantizer.

conv = torch.nn.functional.conv2d(x, weight, bias)
relu = torch.nn.functional.relu(conv)
# returns an additional dict that includes a map from name to node that we want to annotate
return relu, {"conv": conv, "relu": relu, "x": x, "weight": weight, "bias": bias}
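To make the name-to-node map idea concrete, here is a minimal framework-free sketch; the `Node` class and `annotate` helper are hypothetical stand-ins for torch.fx nodes, not the actual API:

```python
# Hypothetical stand-in for a torch.fx node: just a name plus a meta dict,
# mirroring how annotations are stored on node.meta in the snippet above.
class Node:
    def __init__(self, name):
        self.name = name
        self.meta = {}

# Suppose the pattern function above produced this name -> node map.
name_node_map = {"conv": Node("conv2d"), "relu": Node("relu"), "x": Node("x")}

def annotate(name_node_map, annotations):
    """Attach a quantization annotation to each node selected by name."""
    for name, annotation in annotations.items():
        name_node_map[name].meta["quantization_annotation"] = annotation

# The quantizer can now pick nodes by name instead of by graph position.
annotate(name_node_map, {"conv": "per_channel_int8", "relu": "per_tensor_int8"})
```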
Contributor

Did we resolve the weight issue? Can we return an nn.Parameter here?

Contributor Author

Not yet, nn.Parameter is not supported. Also, I haven't actually tested whether this would work for linear modules yet.

# annotate the nodes
inputs, output = _find_input_and_output(match)
inputs[0].users[0].meta["quantization_annotation"] = ...
inputs[1].users[0].meta["quantization_annotation"] = ...
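The `users[0]` indexing above is only unambiguous when each input has a single user. A framework-free sketch of the failure mode (toy `Node` class, hypothetical op names), plus the op-target check that disambiguates it:

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.users = []  # nodes that consume this node's output
        self.meta = {}

# x feeds two consumers: an unrelated add, and the conv we want to annotate.
x = Node("x")
add, conv = Node("add"), Node("conv2d")
x.users = [add, conv]  # user ordering is an implementation detail

# Blindly annotating users[0] hits 'add' rather than 'conv' here.
x.users[0].meta["quantization_annotation"] = "int8"

# Checking the op target first picks the intended user instead.
target = next(u for u in x.users if u.name == "conv2d")
target.meta["quantization_annotation"] = "int8"
```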
Contributor

Actually, from reading the example, option (2) doesn't seem very user friendly. Should we put option (3), SubgraphMatcherWithNameNodeMap, at the top as the recommended way? I feel it'll handle 90% of the patterns we want to match. For something like LSTM or MHA, maybe we need something more advanced like options (1) and (2).

Contributor Author

Yeah, I was thinking of having (3) as the recommended one, but there is actually a caveat: we are assuming certain ops (e.g. F.linear, F.conv2d) are not traced into multiple ops.
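The caveat can be illustrated with a toy model of program capture. The op names, and the idea that a `linear` op lowers to a transpose plus `addmm`, are illustrative assumptions here, not a description of the actual decompositions:

```python
def capture(torch_ir_ops, decompositions):
    """Toy 'program capture': lower torch-IR op names into aten-IR op names."""
    aten_ops = []
    for op in torch_ir_ops:
        aten_ops.extend(decompositions.get(op, [op]))
    return aten_ops

def contains(seq, pattern):
    """True if `pattern` appears as a contiguous run inside `seq`."""
    return any(seq[i:i + len(pattern)] == pattern
               for i in range(len(seq) - len(pattern) + 1))

DECOMP = {"linear": ["t", "addmm"]}  # hypothetical lowering of one op into two

model = capture(["linear", "relu"], DECOMP)  # becomes ['t', 'addmm', 'relu']

# A pattern hardcoded at the torch-IR level never matches the captured model...
assert not contains(model, ["linear", "relu"])
# ...but re-capturing the pattern with the same tracer stays in sync with it.
pattern = capture(["linear", "relu"], DECOMP)
assert contains(model, pattern)
```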

@jerryzh168
Contributor Author

jerryzh168 commented Oct 18, 2023

Can you please fix the pyspelling job?

@svekars I'm not sure how to fix these actually: https://github.com/pytorch/tutorials/actions/runs/6539478971/job/17757612269?pr=2608


2. Using ``SubgraphMatcher``
--------------------------------
Because of this, we recommend people to recognize the pattern through ``SubgraphMatcher``, through capturing a ``torch`` IR pattern (with the same program capture used for capturing the floating point model), instead of using the ``aten`` IR pattern directly.


who are the "people" here?

We should really just suggest alternatives, rather than a recommendation. When we say that quantization targets pre-dispatch aten IR, that is our contract. So the fact that, in some cases, pre-dispatch aten IR is not the same as torch IR is not really a quantization workflow problem but part of export. That is the reason I feel we should really just provide "examples of pattern matching". A recommendation feels too strong, given all the drawbacks listed below.

Contributor Author

@jerryzh168 jerryzh168 Oct 19, 2023

"people" means backend developers.

The motivation for giving a recommendation is introduced in L338: basically, we are trying to make sure the pattern matching is robust against PyTorch code changes. If we don't give the recommendation, I think modeling users will still expect that the same Python model + the same quantization code + the same quantizer will give the same quantized model when they update their PyTorch version, which may not be true if quantizer writers are not cautious about this.
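The robustness concern can be sketched the same toy way: if a later PyTorch version changes how an op is lowered, a hardcoded aten pattern silently stops matching, while a pattern re-captured from torch IR tracks the change. All op names and "decomposition tables" below are made up for illustration:

```python
def capture(ops, decomp):
    """Toy program capture: lower each torch-IR op via a version's decomp table."""
    return [a for op in ops for a in decomp.get(op, [op])]

V1 = {"conv2d": ["convolution"]}              # hypothetical old lowering
V2 = {"conv2d": ["conv_pad", "convolution"]}  # hypothetical new lowering

# A quantizer that froze the v1 aten pattern breaks after the upgrade...
hardcoded_pattern = capture(["conv2d", "relu"], V1)
model_v2 = capture(["conv2d", "relu"], V2)
assert hardcoded_pattern != model_v2

# ...while re-capturing the same torch-IR pattern matches the new model again.
recaptured = capture(["conv2d", "relu"], V2)
assert recaptured == model_v2
```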


That's fair. My main concern is with "recommendation". I would have presented alternatives with pros and cons.


2. Using ``SubgraphMatcher``
--------------------------------
Because of this, we recommend people to recognize the pattern through ``SubgraphMatcher``, through capturing a ``torch`` IR pattern (with the same program capture used for capturing the floating point model), instead of using the ``aten`` IR pattern directly.


also is torch IR defined anywhere?

Contributor Author

Probably not officially defined; L307 gives some examples.


If people are uncertain if some functional operator is going to be traced into multiple aten operators then they can pick option 2.

We would not recommend option 1. since that's the least stable in all three options.


This is a very, very strong statement which does not hold true in practice. If anything, in practice option 1 is a better option because torch IR for the most part matches pre-dispatch aten IR.

Contributor Author

@jerryzh168 jerryzh168 Oct 19, 2023

I don't understand why option 1 would be better than torch IR. What are the criteria here? Given the motivation in L338, I think option 1 is the least stable of all 3.

@kimishpatel kimishpatel Oct 20, 2023

I understand the stability argument. My point is that in practice torch IR is very close to aten IR. And if you don't recommend it, I would just remove option 1 and have this doc be "how-to-pattern-match for quantizers".

Contributor Author

OK, I'll present this in a different way; this is also a motivation for why we want to have other ways of matching.

@kimishpatel kimishpatel left a comment

left some comments

@ezyang
Contributor

ezyang commented Oct 20, 2023

Modulo disagreements with @kimishpatel, the doc seems fine, but you should workshop it with one of the potential users and check if the APIs actually are sufficient for them in practice. User feedback matters.

@kimishpatel

Modulo disagreements with @kimishpatel, the doc seems fine, but you should workshop it with one of the potential users and check if the APIs actually are sufficient for them in practice. User feedback matters.

lol. Please do tell me where you disagree though

@jerryzh168
Contributor Author

jerryzh168 commented Oct 20, 2023

Modulo disagreements with @kimishpatel, the doc seems fine, but you should workshop it with one of the potential users and check if the APIs actually are sufficient for them in practice. User feedback matters.

lol. Please do tell me where you disagree though

I think Ed meant the disagreement between you and me about which options to recommend. I plan to restructure this, use option 1 as motivation, and remove option 2; please let me know if there are other concerns.


With this, the ``Quantizer`` will still be valid even when the implementation for nn modules and functionals changes, the ``aten`` IR for floating point model will change, but since we capture the pattern again instead of hardcoding the ``aten`` IR for the pattern, we'll get the updated ``aten`` IR as well and will still be able to match the pattern.

One caveat is that if inputs of the pattern has multiple users, we don't have a good way to identify which user node we want to annotate except for checking the aten op target.
Contributor

I'm not sure I understand this issue clearly:

  • One caveat is that if inputs of the pattern has multiple users,
    Are we talking about the case where an input is used by multiple different patterns, or the case where an input is used by multiple nodes inside one pattern?

Contributor Author

yes that's correct

7 participants