Commit 928308e

[GR-58201] Document performance investigation (and some other docs updates)
PullRequest: graalpython/3484
2 parents 9026ef0 + f33058a commit 928308e

8 files changed

+118
-71
lines changed

docs/contributor/CONTRIBUTING.md

Lines changed: 27 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -16,18 +16,34 @@ git clone https://github.com/graalvm/mx.git
1616
```
1717
Make sure to add the `mx` directory to your `PATH`.
1818

19-
You can always use the latest stable JDK for development.
20-
You can also download a suitable JDK using mx:
19+
Use `mx` to get additional projects at the right versions.
20+
From within your `graalpython` checkout, run:
21+
```
22+
mx sforceimport
23+
```
24+
25+
You can then download a suitable JDK:
2126
```bash
22-
mx fetch-jdk
27+
mx -p ../graal/vm --env ce-python fetch-jdk -A --jdk-id labsjdk-ce-latest
28+
```
29+
30+
Make sure that the `JAVA_HOME` environment variable is set:
31+
```bash
32+
export JAVA_HOME="${HOME}/.mx/jdks/labsjdk-ce-latest"
33+
```
34+
35+
(Or on Windows)
36+
```
37+
$env:JAVA_HOME="$HOME\.mx\jdks\labsjdk-ce-latest"
2338
```
24-
Make sure that the `JAVA_HOME` environment variable is set.
2539
2640
For building GraalPy, you will also need some native build tools and libraries. On a Debian based system, install:
2741
```bash
2842
sudo apt install build-essential libc++-12-dev zlib1g-dev cmake
2943
```
3044
45+
(On Windows, make sure you are running in a Visual Studio Developer PowerShell, which should have everything you need.)
46+
3147
Lastly, download maven, extract it and include it on your `PATH`.
3248
3349
Once you have all the necessary tools, you can run `mx python-jvm` in this repository.
@@ -36,7 +52,8 @@ If it succeeds without errors, you should already be able to run `mx python` and
3652
3753
For development, we recommend running `mx ideinit` next.
3854
This will generate configurations for Eclipse, IntelliJ, and NetBeans so that you can open the projects in these IDEs.
39-
If you use another editor with support for the [Eclipse language server](https://github.com/eclipse/eclipse.jdt.ls) we have also had reports of useable development setups with that, but it's not something we support.
55+
See also the documentation in mx for [setting up your IDE](https://github.com/graalvm/mx/blob/master/docs/IDE.md).
56+
If you use another editor (such as VSCode, Emacs, or Neovim) with support for the [Eclipse language server](https://github.com/eclipse/eclipse.jdt.ls) or [Apache NetBeans language server](https://marketplace.visualstudio.com/items?itemName=ASF.apache-netbeans-java), you can also get a usable development setup that way, but it's not something we explicitly support.
4057
4158
## Development Layout
4259
@@ -110,6 +127,11 @@ If the IDE was initialized properly by using the command mentioned above, the ex
110127
111128
Both of these commands also work when you have a `graalpy` executable, e.g. inside a `venv`.
112129
130+
For debugging the C API and native extensions, first make sure you rebuild GraalPy (`mx clean` first!) with the environment variable `CFLAGS=-g` set.
131+
This will keep debug symbols in our C API implementation which should allow you to use `gdb` or [`rr`](https://rr-project.org/) to debug.
132+
When you build an SVM image, debugging the entire application is possible, and there are [docs](https://www.graalvm.org/reference-manual/native-image/guides/debug-native-image-process/) that explain how to see Java code when inside the native debugger.
133+
Make sure you find the `libpythonvm.so.debug` file and keep it next to your GraalPy build; you can find it somewhere under `graal/sdk/mxbuild`.
134+
113135
## Advanced Commands to Develop and Debug
114136
115137
Here are some advanced commands to debug test failures and fix issues.
@@ -299,22 +321,3 @@ mx --env ../../graal/vm/mx.vm/ce \
299321
--jvm-config=native \
300322
--python-vm-config=default --
301323
```
302-
303-
## Finding Memory Leaks
304-
305-
For best performance we keep references to long-lived user objects (mostly functions, classes, and modules) directly in the AST nodes when using the default configuration of a single Python context (as is used when running the launcher).
306-
For better sharing of warm-up and where absolutely best peak performance is not needed, contexts can be configured with a shared engine and the ASTs will be shared across contexts.
307-
However, that implies we *must* not store any user objects strongly in the ASTs.
308-
We test that we have no PythonObjects alive after a Context is closed that are run as part of our JUnit tests.
309-
These can be run by themselves, for example, like so:
310-
311-
```bash
312-
mx python-leak-test --lang python \
313-
--shared-engine \
314-
--code 'import site, json' \
315-
--forbidden-class com.oracle.graal.python.builtins.objects.object.PythonObject \
316-
--keep-dump
317-
```
318-
319-
The `--keep-dump` option will print the heapdump location and leave the file there rather than deleting it.
320-
It can then be opened for example with VisualVM to check for the paths of any leaked object, if there are any.
Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
1+
# Investigating GraalPy Performance
2+
3+
First, make sure to build GraalPy with debug symbols.
4+
Running `export CFLAGS=-g` before a fresh `mx build` adds debug symbol flags to all our C extension libraries.
5+
When you build a native image, use `find` to locate the `.debug` file somewhere in the `mxbuild` directory tree. It is called something like `libpythonvm.so.debug`.
6+
Make sure to get that one and put it next to the `libpythonvm.so` in the Python standalone so that tools can pick it up.
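The lookup and copy described above could be scripted roughly as follows. This is only a sketch: the helper names `find_debug_files` and `copy_next_to_library` are hypothetical, and the build root and library path you pass in depend on your own checkout and standalone layout.

```python
import shutil
from pathlib import Path


def find_debug_files(build_root, name="libpythonvm.so.debug"):
    """Return every file called `name` below build_root, newest first.

    Hypothetical helper: build_root is whatever mxbuild directory tree
    your build wrote to; nothing here is specific to mx itself.
    """
    matches = [p for p in Path(build_root).rglob(name) if p.is_file()]
    return sorted(matches, key=lambda p: p.stat().st_mtime, reverse=True)


def copy_next_to_library(debug_file, library):
    """Copy debug_file into the directory that contains `library`."""
    target = Path(library).parent / Path(debug_file).name
    shutil.copy2(debug_file, target)
    return target
```

If several matches come back, the first one is the most recently modified, which is usually the one belonging to the latest build, but it is worth checking by hand.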
7+
8+
## Peak Performance
9+
10+
The [Truffle docs](https://www.graalvm.org/graalvm-as-a-platform/language-implementation-framework/Optimizing/), found under `graal/truffle/docs/Optimizing.md`, are a good starting point.
11+
They describe how to start with the profiler. The [flamegraph](https://www.graalvm.org/graalvm-as-a-platform/language-implementation-framework/Profiling/#creating-a-flame-graph-from-cpu-sampler) is especially useful.
12+
This gives you a high-level idea of where time is spent.
13+
Note that currently (GR-58204) executions with native extensions may be less accurate.
14+
15+
In GraalPy's case the flamegraph is also useful to compare performance to CPython.
16+
[Py-spy](https://pypi.org/project/py-spy/) is pretty good for that, since it generates a flamegraph that is sufficiently comparable.
17+
Note that `py-spy` is a sampling profiler that accesses CPython internals, so it often does not work on the latest CPython. Use a slightly older release instead.
18+
19+
```
20+
py-spy record -n -r 100 -o pyspy.svg -- foo.py
21+
```
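For a first comparison it can help to profile a small, deterministic workload whose time is spread over a few named Python functions, so the frames are easy to match between the two flamegraphs. The script below is a made-up stand-in for `foo.py`, not part of the GraalPy repository. The function names and the `rounds` and `limit` values are arbitrary choices that you will likely need to scale up to collect enough samples.

```python
import math


def is_prime(n):
    """Trial division; deliberately plain Python so time shows up in this frame."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True


def count_primes(limit):
    """Count the primes in range(limit)."""
    return sum(1 for n in range(limit) if is_prime(n))


def run(rounds, limit):
    """Repeat the same work so the profiler sees a steady state."""
    total = 0
    for _ in range(rounds):
        total += count_primes(limit)
    return total


if __name__ == "__main__":
    # Arbitrary sizes; increase them if the run is too short to sample well.
    print(run(rounds=50, limit=20_000))
```

Running the same file under both interpreters keeps the Python-level frames (`run`, `count_primes`, `is_prime`) identical, so differences in the graphs point at the runtime rather than at the workload.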
22+
23+
Once you have identified something that takes way too long on GraalPy as compared to CPython, follow the Truffle guide.
24+
25+
When you debug deoptimizations with [IGV](https://www.graalvm.org/tools/igv/) and trace deopts as per the Truffle document linked above, search the output for "JVMCI: installed code name=".
26+
If the name ends with "#2" it's a second tier compilation.
27+
You might notice the presence of a `debugId` or `debug_id` in the output of these options.
28+
That id can be searched via `id=NUMBER`, `idx=NUMBER` or `debugId=NUMBER` in IGV's `Search in Nodes` search box, then selecting `Open Search for node NUMBER in Node Searches window`, and then clicking the `Search in following phases` button.
29+
Another useful thing to know is that the `compile_id` matches the `compilationId` in IGV's "properties" view of the dumped graph.
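When the trace output is long, a small filter can pull out the installed code names and flag the second tier compilations. The sketch below relies only on the two facts stated above (the "JVMCI: installed code name=" marker and the "#2" suffix). It additionally assumes that the name is the whitespace-delimited token directly after the marker, which may not hold for every name, and `installed_code_names` is a hypothetical helper.

```python
MARKER = "JVMCI: installed code name="


def installed_code_names(lines):
    """Yield (name, is_second_tier) for each line that contains MARKER.

    Assumption: the name ends at the first whitespace after the marker.
    """
    for line in lines:
        _, found, rest = line.partition(MARKER)
        if not found:
            continue
        tokens = rest.split()
        if not tokens:
            continue
        name = tokens[0]
        yield name, name.endswith("#2")
```

Any iterable of lines works as input, for example an open file holding the captured output.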
30+
31+
[Proftool](https://github.com/graalvm/mx/blob/master/README-proftool.md) can also be helpful.
32+
Note that this is not really prepared for language launchers. If it doesn't work, just get the command line and build the arguments manually.
33+
34+
## Interpreter Performance
35+
36+
For interpreter performance, async-profiler is good and also allows for some visualizations.
37+
Backtrace view and flat views are good.
38+
It is only for JVM executions (not native images).
39+
Download async-profiler and make sure you also have debug symbols in your C extensions.
40+
Use these options:
41+
42+
```
43+
--vm.agentpath:/path/to/async-profiler/lib/libasyncProfiler.so=start,event=cpu,file=profile.html --vm.XX:+UnlockDiagnosticVMOptions --vm.XX:+DebugNonSafepoints
44+
```
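Because the agent option packs several comma-separated settings into one argument, it is easy to mistype. A small helper like the following can assemble the three launcher options shown above. `async_profiler_options` and its parameters are hypothetical names, and only the option strings themselves come from this document.

```python
def async_profiler_options(lib_path, output="profile.html", event="cpu"):
    """Build the launcher options for a JVM run under async-profiler.

    lib_path is the path to libasyncProfiler.so on your machine.
    """
    agent = f"--vm.agentpath:{lib_path}=start,event={event},file={output}"
    return [
        agent,
        "--vm.XX:+UnlockDiagnosticVMOptions",
        "--vm.XX:+DebugNonSafepoints",
    ]
```

The returned list can be placed in front of the script name when launching through `subprocess`, which avoids shell quoting problems altogether.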
45+
46+
Another very useful tool is [gprofng](https://blogs.oracle.com/linux/post/gprofng-the-next-generation-gnu-profiling-tool), which is part of binutils these days.
47+
If you have debug symbols, it works quite well with JVM launchers since it understands Hotspot frames, but also works fine with native images.
48+
You might run into [a bug](https://sourceware.org/bugzilla/show_bug.cgi?id=32110) with our language launchers. The patch in that bug report from me (Tim), while not entirely correct and not passing their testsuite, lets you review recorded profiles (the bug only manifests when viewing a recorded profile).
49+
What's nice about gprofng is that it can attribute time spent to Java bytecodes, so you can even profile huge methods like bytecode loops that, for example, the DSL has generated.
50+
51+
For SVM builds it is very useful to look at Truffle's [HostInlining](https://www.graalvm.org/graalvm-as-a-platform/language-implementation-framework/HostOptimization/) docs and check the debugging section there.
52+
This helps ensure that expected code is inlined (or not).
53+
When I identify something that takes long using gprofng, for example, I find it useful to check if that stuff is inlined as expected on SVM during the HostInliningPhase.
54+
55+
Supposedly Intel VTune and Oracle Developer Studio work well, but I haven't tried them.
56+
57+
## Memory Usage
58+
59+
Memory usage is best tracked with VisualVM for the Java heap.
60+
For best performance we keep references to long-lived user objects (mostly functions, classes, and modules) directly in the AST nodes when using the default configuration of a single Python context (as is used when running the launcher).
61+
For better sharing of warm-up and where absolutely best peak performance is not needed, contexts can be configured with a shared engine and the ASTs will be shared across contexts.
62+
However, that implies we *must* not store any user objects strongly in the ASTs.
63+
As part of our JUnit tests, we check that no PythonObjects are still alive after a Context is closed.
64+
These can be run by themselves, for example, like so:
65+
66+
```bash
67+
mx python-leak-test --lang python \
68+
--shared-engine \
69+
--code 'import site, json' \
70+
--forbidden-class com.oracle.graal.python.builtins.objects.object.PythonObject \
71+
--keep-dump
72+
```
73+
74+
The `--keep-dump` option will print the heapdump location and leave the file there rather than deleting it.
75+
It can then be opened for example with VisualVM to check for the paths of any leaked object, if there are any.
76+
77+
For native code, use native memory profiling tools.
78+
I have used [`massif`](https://valgrind.org/docs/manual/ms-manual.html) in the past to find allocations and memory issues in native extensions, but be aware of the large overhead.
79+
However, once you do find something interesting using `massif`, [`rr`](https://rr-project.org/) is a good option to dive further into it, because then you can break around places massif found allocations, and use memory breakpoints and reverse and forward execution to find where the memory is allocated and released.
80+
This can be useful to identify memory leaks in our C API emulation.

docs/contributor/MISSING.md

Lines changed: 0 additions & 35 deletions
This file was deleted.

graalpython/com.oracle.graal.python/src/com/oracle/graal/python/builtins/modules/SysModuleBuiltins.java

Lines changed: 0 additions & 7 deletions
@@ -152,7 +152,6 @@
152152
import com.oracle.graal.python.builtins.PythonOS;
153153
import com.oracle.graal.python.builtins.modules.SysModuleBuiltinsClinicProviders.GetFrameNodeClinicProviderGen;
154154
import com.oracle.graal.python.builtins.modules.SysModuleBuiltinsClinicProviders.SetDlopenFlagsClinicProviderGen;
155-
import com.oracle.graal.python.builtins.modules.SysModuleBuiltinsFactory.ExcInfoNodeFactory;
156155
import com.oracle.graal.python.builtins.modules.io.BufferedReaderBuiltins;
157156
import com.oracle.graal.python.builtins.modules.io.BufferedWriterBuiltins;
158157
import com.oracle.graal.python.builtins.modules.io.FileIOBuiltins;
@@ -856,12 +855,6 @@ static PTuple run(VirtualFrame frame,
856855
return factory.createTuple(new Object[]{getClassNode.execute(inliningTarget, exceptionObject), exceptionObject, traceback});
857856
}
858857
}
859-
860-
@NeverDefault
861-
public static ExcInfoNode create() {
862-
return ExcInfoNodeFactory.create(null);
863-
}
864-
865858
}
866859

867860
// ATTENTION: this is intentionally a PythonBuiltinNode and not PythonUnaryBuiltinNode,

graalpython/com.oracle.graal.python/src/com/oracle/graal/python/builtins/modules/cext/PythonCextErrBuiltins.java

Lines changed: 6 additions & 1 deletion
@@ -73,6 +73,7 @@
7373
import com.oracle.graal.python.builtins.modules.PosixModuleBuiltins.ExitNode;
7474
import com.oracle.graal.python.builtins.modules.SysModuleBuiltins;
7575
import com.oracle.graal.python.builtins.modules.SysModuleBuiltins.ExcInfoNode;
76+
import com.oracle.graal.python.builtins.modules.SysModuleBuiltinsFactory.ExcInfoNodeFactory;
7677
import com.oracle.graal.python.builtins.modules.cext.PythonCextBuiltins.CApiBinaryBuiltinNode;
7778
import com.oracle.graal.python.builtins.modules.cext.PythonCextBuiltins.CApiBuiltin;
7879
import com.oracle.graal.python.builtins.modules.cext.PythonCextBuiltins.CApiNullaryBuiltinNode;
@@ -444,11 +445,15 @@ static Object write(Object msg, Object obj,
444445

445446
@CApiBuiltin(ret = Void, args = {Int}, call = Direct)
446447
abstract static class PyErr_PrintEx extends CApiUnaryBuiltinNode {
448+
static ExcInfoNode createExcInfoNode() {
449+
return ExcInfoNodeFactory.create(null);
450+
}
451+
447452
@TruffleBoundary
448453
@Specialization
449454
static Object raise(int set_sys_last_vars,
450455
@Cached IsInstanceNode isInstanceNode,
451-
@Cached ExcInfoNode excInfoNode,
456+
@Cached(neverDefault = true, value = "createExcInfoNode()") ExcInfoNode excInfoNode,
452457
@Cached PyErr_Restore restoreNode,
453458
@Cached PyFile_WriteObject writeFileNode,
454459
@Cached ExitNode exitNode,

graalpython/com.oracle.graal.python/src/com/oracle/graal/python/builtins/objects/cext/capi/PythonNativeWrapper.java

Lines changed: 3 additions & 3 deletions
@@ -119,11 +119,11 @@ public final boolean isNative() {
119119
* transition code will consider that and eagerly return the pointer object. If {@code true} is
120120
* returned, the wrapper must also implement {@link #getReplacement(InteropLibrary)} which
121121
* returns the pointer object. Furthermore, wrappers must use
122-
* {@link #registerReplacement(Object, InteropLibrary)} to register the allocated native memory
123-
* in order that the native pointer can be resolved to the managed wrapper in the
122+
* {@link #registerReplacement(Object, boolean, InteropLibrary)} to register the allocated
123+
* native memory in order that the native pointer can be resolved to the managed wrapper in the
124124
* <it>native-to-Python</it> transition.
125125
* </p>
126-
*
126+
*
127127
* @return {@code true} if the wrapper should be materialized eagerly, {@code false} otherwise.
128128
*/
129129
public final boolean isReplacingWrapper() {

graalpython/com.oracle.graal.python/src/com/oracle/graal/python/nodes/argument/CreateArgumentsNode.java

Lines changed: 1 addition & 1 deletion
@@ -235,7 +235,7 @@ public abstract static class CreateAndCheckArgumentsNode extends PNodeWithContex
235235
*
236236
* @param inliningTarget The inlining target.
237237
* @param callableOrName This object can either be the function/method object or just a name
238-
* ({@link TruffleString)}. It is primarily used to create error messages. It is
238+
* ({@link TruffleString}). It is primarily used to create error messages. It is
239239
* also used to check if the function
240240
* @param userArguments The positional arguments as provided by the caller (must not be
241241
* {@code null} but may be empty).
Lines changed: 1 addition & 0 deletions
@@ -1 +1,2 @@
11
org.eclipse.jdt.core.compiler.problem.unusedParameter=ignore
2+
org.eclipse.jdt.core.compiler.problem.missingOverrideAnnotation=warning
