add ITT recipe #2072
Conversation
✅ Deploy Preview for pytorch-tutorials-preview ready!
Some editorial suggestions.
recipes_source/profile_with_itt.rst
Outdated
Requirements
------------

* PyTorch 1.13+
Suggested change:
- * PyTorch 1.13+
+ * PyTorch v1.13 or later
recipes_source/profile_with_itt.rst
Outdated
In this recipe, you will learn:

* An overview of Intel® VTune™ Profiler
Suggested change:
- * An overview of Intel® VTune™ Profiler
+ * What is Intel® VTune™ Profiler
recipes_source/profile_with_itt.rst
Outdated
In this recipe, you will learn:

* An overview of Intel® VTune™ Profiler
* An overview of the Instrumentation and Tracing Technology (ITT) API
Suggested change:
- * An overview of the Instrumentation and Tracing Technology (ITT) API
+ * What is the Instrumentation and Tracing Technology (ITT) API
recipes_source/profile_with_itt.rst
Outdated
* PyTorch 1.13+
* Intel® VTune™ Profiler

The instructions for installing PyTorch are available at `pytorch.org <https://pytorch.org/>`_.
Suggested change:
- The instructions for installing PyTorch are available at `pytorch.org <https://pytorch.org/>`_.
+ The instructions for installing PyTorch are available at `pytorch.org <https://pytorch.org/get-started/locally/>`__.
recipes_source/profile_with_itt.rst
Outdated
For those who are familiar with Intel Architecture, Intel® VTune™ Profiler provides a rich set of metrics to help users understand how the application executed on Intel platforms, and thus have an idea where the performance bottleneck is.

More detailed information, including getting started guide, are available `here <https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html>`_.
Suggested change:
- More detailed information, including getting started guide, are available `here <https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html>`_.
+ More detailed information, including a Getting Started guide, are available `on the Intel website <https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html>`__.
recipes_source/profile_with_itt.rst
Outdated
Right side of the windows is split into 3 parts: `WHERE` (top left), `WHAT` (bottom left), and `HOW` (right). With `WHERE`, you can assign a machine where you want to run the profiling on. With `WHAT`, you can set path of the application that you want to profile. To profile a PyTorch script, it is recommended to wrap all manual steps, including activate a conda environment and setting required environment variable, into a bash script, then profile this bash script. In the screenshot above, we wrapped all steps into the `launch.sh` bash script and profile `bash` with parameter to be `<path_of_launch.sh>`. In the right side `HOW`, you can choose whatever type that you would like to profile. Details can be found at `Intel® VTune™ Profiler user guide <https://www.intel.com/content/www/us/en/develop/documentation/vtune-help/top/analyze-performance.html>`_.

With a successful profiling with ITT, you can open `Platform` tab of the profiling result to see labels in Intel® VTune™ Profiler timeline. All operators starting with `aten::` are operators labeled implicitly by the ITT feature in PyTorch. Labels `iteration_N` are explicitly labeled with specific APIs `torch.profiler.itt.range_push()`, `torch.profiler.itt.range_pop()` or `torch.profiler.itt.range()` scope. Please check the sample code in next section for details.
Suggested change:
- With a successful profiling with ITT, you can open `Platform` tab of the profiling result to see labels in Intel® VTune™ Profiler timeline. All operators starting with `aten::` are operators labeled implicitly by the ITT feature in PyTorch. Labels `iteration_N` are explicitly labeled with specific APIs `torch.profiler.itt.range_push()`, `torch.profiler.itt.range_pop()` or `torch.profiler.itt.range()` scope. Please check the sample code in next section for details.
+ With a successful profiling with ITT, you can open `Platform` tab of the profiling result to see labels in the Intel® VTune™ Profiler timeline. All operators starting with `aten::` are operators labeled implicitly by the ITT feature in PyTorch. Labels `iteration_N` are explicitly labeled with specific APIs `torch.profiler.itt.range_push()`, `torch.profiler.itt.range_pop()` or `torch.profiler.itt.range()` scope. Please check the sample code in the next section for details.
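For readers following this thread, a minimal sketch of the explicit labeling described in the quoted paragraph might look like the following; the model and tensor shapes here are illustrative placeholders, not the recipe's actual sample:

.. code:: python3

   import torch
   import torch.profiler

   model = torch.nn.Linear(16, 4)   # placeholder model, not the recipe's sample
   data = torch.rand(8, 16)

   with torch.autograd.profiler.emit_itt():
       for i in range(3):
           # Each pushed range shows up as an `iteration_N` label in the VTune timeline.
           torch.profiler.itt.range_push(f"iteration_{i}")
           model(data)
           torch.profiler.itt.range_pop()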
recipes_source/profile_with_itt.rst
Outdated
A short sample code showcasing how to use PyTorch ITT APIs
----------------------------------------------------------

Sample code below is the script that was used for profiling in the screenshots above.
Suggested change:
- Sample code below is the script that was used for profiling in the screenshots above.
+ The sample code below is the script that was used for profiling in the screenshots above.
recipes_source/profile_with_itt.rst
Outdated
Sample code below is the script that was used for profiling in the screenshots above.

The topology is formed by 2 operators, `Conv2d` and `Linear`. Three iterations of inference were performed. Each iteration was labled by PyTorch ITT APIs as text string `iteration_N`. Either pair of `torch.profile.itt.range_push` and `torch.profile.itt.range_pop` or `torch.profile.itt.range` scope does the customized labeling feature.
Suggested change:
- The topology is formed by 2 operators, `Conv2d` and `Linear`. Three iterations of inference were performed. Each iteration was labled by PyTorch ITT APIs as text string `iteration_N`. Either pair of `torch.profile.itt.range_push` and `torch.profile.itt.range_pop` or `torch.profile.itt.range` scope does the customized labeling feature.
+ The topology is formed by two operators, `Conv2d` and `Linear`. Three iterations of inference were performed. Each iteration was labeled by PyTorch ITT APIs as text string `iteration_N`. Either pair of `torch.profile.itt.range_push` and `torch.profile.itt.range_pop` or `torch.profile.itt.range` scope does the customized labeling feature.
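As a rough illustration of the topology described above, a sketch along these lines would exercise both labeling styles; the layer sizes and input shapes below are invented, and the recipe's actual `sample.py` is the authoritative version:

.. code:: python3

   import torch
   import torch.nn as nn
   import torch.profiler

   # Illustrative only -- layer sizes are made up; see the recipe's sample.py for the real script.
   class SampleModel(nn.Module):
       def __init__(self):
           super().__init__()
           self.conv = nn.Conv2d(3, 4, kernel_size=3)   # 1x3x8x8 input -> 1x4x6x6 output
           self.linear = nn.Linear(4 * 6 * 6, 10)

       def forward(self, x):
           x = self.conv(x)
           x = x.flatten(1)
           return self.linear(x)

   model = SampleModel().eval()
   data = torch.rand(1, 3, 8, 8)

   with torch.no_grad(), torch.autograd.profiler.emit_itt():
       for i in range(3):
           # torch.profiler.itt.range() labels the whole iteration as iteration_i
           with torch.profiler.itt.range(f"iteration_{i}"):
               model(data)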
recipes_source/profile_with_itt.rst
Outdated
#!/bin/bash

# Retrive the directory path where contains both the sample.py and launch.sh so that this script can be invoked from any directory
Suggested change:
- # Retrive the directory path where contains both the sample.py and launch.sh so that this script can be invoked from any directory
+ # Retrive the directory path where the path contains both the sample.py and launch.sh so that this script can be invoked from any directory
Done amendments
@chauhang can you please assign someone to review this PR? Thanks!
recipes_source/profile_with_itt.rst
Outdated
To enable this feature, codes which are expected to be labeled should be invoked under a `torch.autograd.profiler.emit_itt()` scope. For example:

.. code:: python3

   with torch.autograd.profiler.emit_itt():
       <codes...>
English is not my native, but `codes` does not sound like a correct plural for `code`. Perhaps it can be replaced with something like `code to be profiled`.
Suggested change:
- To enable this feature, codes which are expected to be labeled should be invoked under a `torch.autograd.profiler.emit_itt()` scope. For example:
- .. code:: python3
-    with torch.autograd.profiler.emit_itt():
-        <codes...>
+ To enable this feature, code block, which is expected to be labeled should be invoked within `torch.autograd.profiler.emit_itt()` scope. For example:
+ .. code:: python3
+    with torch.autograd.profiler.emit_itt():
+        <code-to-be-profiled...>
recipes_source/profile_with_itt.rst
Outdated
The `launch.sh` bash script to wrap all manual steps is shown below.

.. code:: bash

   # launch.sh

   #!/bin/bash

   # Retrive the directory path where the path contains both the sample.py and launch.sh so that this script can be invoked from any directory
   BASEFOLDER=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )
   source ~/miniconda3/bin/activate
   conda activate ipex_py38
   cd ${BASEFOLDER}
   python sample.py
Not sure how this section is specific to ITT? User does not need to install (nor use) conda to benefit from ITT feature. Nor should it be bound to python-3.8 , as environment name suggests. And since example does not have any data dependencies, changing working directory to base folder is superfluous as well, isn't it?
But paragraph about ITT filename and folder where it will be generated are probably relevant, isn't it?
To use VTune to profile a Python script, it is always recommended to wrap the execution of the Python script in a bash script, because executing a Python script requires the configuration of a Python environment, which is not always there by default. Also, launching the script from VTune could be done in a directory that doesn't contain the Python script, so it is better to switch the active directory to the desired folder inside the bash script execution.
Detailed explanation is there in the section above:
To profile a PyTorch script, it is recommended to wrap all manual steps, including activating a Python environment and setting required environment variables, into a bash script, then profile this bash script. In the screenshot above, we wrapped all steps into the launch.sh bash script and profile bash with the parameter to be <path_of_launch.sh>.
I restated the description to make it more descriptive and related to vtune. Also I changed the specific python version and conda distribution into a generic description. Does the updated content below look good to you?
The `launch.sh` bash script, mentioned in the Intel® VTune™ Profiler GUI screenshot, to wrap all manual steps is shown below.
.. code:: bash

   # launch.sh

   #!/bin/bash

   # Retrive the directory path where the path contains both the sample.py and launch.sh so that this bash script can be invoked from any directory
   BASEFOLDER=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )
   <Activate a Python environment>
   cd ${BASEFOLDER}
   python sample.py
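If it helps, the `<Activate a Python environment>` placeholder could be any environment setup the reader already uses; the paths and environment name below are purely illustrative:

.. code:: bash

   # e.g. a virtualenv (illustrative path):
   source /path/to/venv/bin/activate

   # ...or a conda environment (illustrative path and name):
   # source /path/to/miniconda3/bin/activate && conda activate my_env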
LGTM, with two minor suggestions:
- Get rid of `<codes>`
- Explain better/get rid of starter script
@jingxu10 we are trying to get a review from the partners team. Stay tuned.
With a successful profiling with ITT, you can open `Platform` tab of the profiling result to see labels in the Intel® VTune™ Profiler timeline. All operators starting with `aten::` are operators labeled implicitly by the ITT feature in PyTorch. Labels `iteration_N` are explicitly labeled with specific APIs `torch.profiler.itt.range_push()`, `torch.profiler.itt.range_pop()` or `torch.profiler.itt.range()` scope. Please check the sample code in the next section for details.

.. figure:: /_static/img/itt_tutorial/vtune_timeline.png
Could you please elaborate a bit more on how to read this picture? What do the big brown boxes mean? Or what does someone learn once they look at this picture?
A big convolution gets broken into multiple smaller ones on different threads?
Thanks @jingxu10, this is a great addition to the recipes.
recipes_source/profile_with_itt.rst
Outdated
Requirements
------------

* PyTorch v1.13 or later
nit: PyTorch 1.13 or later
recipes_source/profile_with_itt.rst
Outdated
   :width: 100%
   :align: center

Three sample results are available in the left side navigation bar under `sample (matrix)` project. If you do not want profiling results appear in this default sample project, you can create a new project via the button `New Project...` under the blue `Configure Analysis...` button. To start a new profiling, click the blue `Configure Analysis...` button to initiate configuration of the profiling.
typos: on the left side
Are the three dots after "New Project..." and "Configure Analysis..." intended?
Yes, the three dots are shown in the VTune GUI buttons, so I copied them.
recipes_source/profile_with_itt.rst
Outdated
2. Explicit invocation: If customized labeling is needed, users can use APIs mentioned at `PyTorch Docs <https://pytorch.org/docs/stable/profiler.html#intel-instrumentation-and-tracing-technology-apis>`__ explicitly to label a desired range.

To enable this feature, codes which are expected to be labeled should be invoked under a `torch.autograd.profiler.emit_itt()` scope. For example:
does "this" here refers to the explicit invocation?
Yes, this refers to the explicit invocation. I'll make it clear in the tutorial.
recipes_source/profile_with_itt.rst
Outdated
   :width: 100%
   :align: center

The right side of the windows is split into 3 parts: `WHERE` (top left), `WHAT` (bottom left), and `HOW` (right). With `WHERE`, you can assign a machine where you want to run the profiling on. With `WHAT`, you can set the path of the application that you want to profile. To profile a PyTorch script, it is recommended to wrap all manual steps, including activating a Python environment and setting required environment variables, into a bash script, then profile this bash script. In the screenshot above, we wrapped all steps into the `launch.sh` bash script and profile `bash` with the parameter to be `<path_of_launch.sh>`. In the right side `HOW`, you can choose whatever type that you would like to profile. Details can be found at `Intel® VTune™ Profiler User Guide <https://www.intel.com/content/www/us/en/develop/documentation/vtune-help/top/analyze-performance.html>`__.
typos: In the right side `HOW` --> on the right side 'How'
Can you please clarify "you can choose whatever type that you would like to profile"?
Intel VTune Profiler provides a bunch of profiling types (https://www.intel.com/content/www/us/en/develop/documentation/vtune-help/top/analyze-performance.html) that users can choose from.
recipes_source/profile_with_itt.rst
Outdated
   :width: 100%
   :align: center

The right side of the windows is split into 3 parts: `WHERE` (top left), `WHAT` (bottom left), and `HOW` (right). With `WHERE`, you can assign a machine where you want to run the profiling on. With `WHAT`, you can set the path of the application that you want to profile. To profile a PyTorch script, it is recommended to wrap all manual steps, including activating a Python environment and setting required environment variables, into a bash script, then profile this bash script. In the screenshot above, we wrapped all steps into the `launch.sh` bash script and profile `bash` with the parameter to be `<path_of_launch.sh>`. In the right side `HOW`, you can choose whatever type that you would like to profile. Details can be found at `Intel® VTune™ Profiler User Guide <https://www.intel.com/content/www/us/en/develop/documentation/vtune-help/top/analyze-performance.html>`__.
I wonder how the workflow works, is this correct?
- Add the context manager `with torch.autograd.profiler.emit_itt():`
- Run the code --> it saves the traces somewhere
- Load the traces in the profiler?
Or:
Run the profiler, then with `WHAT`, `HOW`, and `WHERE` you can set the path of the application that you want to profile, and it automatically captures the traces and shows them to you. I assume adding the context manager is additional, for labeling a part, and not the only way to capture the profiles.
The statement "you can set the path of the application that you want to profile" suggests the second path/workflow.
Can you please clarify the workflow?
It's the second way. Users launch VTune first, then launch the application from VTune. `WHAT` in VTune is the place to let VTune know which application to launch and profile.
The second image, with descriptions below it, introduces how to use Intel VTune Profiler to profile a PyTorch script. I'll shorten the starting 2
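For readers who prefer the command line, the same launch-the-app-from-VTune flow can be sketched roughly as follows; the analysis type and result directory below are assumptions, and the recipe itself drives this through the GUI:

.. code:: bash

   # Assumed command-line equivalent of the GUI flow described above:
   # VTune launches bash, which runs launch.sh, which runs sample.py.
   vtune -collect hotspots -result-dir ./itt_result -- bash ./launch.sh

   # Open the result afterwards to inspect the Platform timeline:
   # vtune-gui ./itt_result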
Hi @msaroufim @HamidShojanazeri, would you please review the updated content? Does it align with your expectations?
recipes_source/profile_with_itt.rst
Outdated
2. Explicit invocation: If customized labeling is needed, users can use APIs mentioned at `PyTorch Docs <https://pytorch.org/docs/stable/profiler.html#intel-instrumentation-and-tracing-technology-apis>`__ explicitly to label a desired range.

To enable explicit invocation, codes which are expected to be labeled should be invoked under a `torch.autograd.profiler.emit_itt()` scope. For example:
You still have `codes` here.
Again, English is not my native, but please compare results of https://www.bing.com/search?q=codes vs https://www.bing.com/search?q=code
Oh, your concern is on the plural. I'm not a native speaker either. Changed it to `code` anyway. Done.
recipes_source/profile_with_itt.rst
Outdated
#!/bin/bash

# Retrive the directory path where the path contains both the sample.py and launch.sh so that this bash script can be invoked from any directory
@jingxu10 is it `Retrieve the directory path` here? Might be a typo.
Good catch. Corrected.
If the updates look good to you, please help to merge it.
recipes_source/profile_with_itt.rst
Outdated
As illustrated on the right side navigation bar, brown portions in the timeline rows show CPU usage of individual threads. The percerntage of height of a thread row that the brown portion occupies at a timestamp aligns with that of the CPU usage in that thread at that timestamp. Thus, it is intuitive from this timeline to understand the followings:

1. How well CPU cores are utlized on each thread.
@jingxu10, utlized -> utilized?
done
Hi, may I know the status? Are we waiting for more approvals, or is it OK to merge? Please feel free to let me know if there are further comments.
@jingxu10 I don't think there are any more comments left and the PR looks good. However, we typically merge tutorials for the new release a couple of days before the release. There are no action items from you as of now and I will keep you posted.
Got it.
Add tutorial for ITT feature at pytorch/pytorch#63289