-
Notifications
You must be signed in to change notification settings - Fork 7.9k
Fix the BOLT problem with GCC10. #8495
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
BOLT requires adding GCC option "-fno-reorder-blocks-and-partition". In some codes, it will be omitted because "GCC push_options". This patch is to handle this case correctly.
Thank you for this pull request. This would need a benchmark comparing the speed before, after-without-bold, and after-with-bolt so that this could be merged. https://github.com/kocsismate/php-version-benchmarks could be used for that. |
@@ -2934,6 +2934,7 @@ enum XXH_VECTOR_TYPE /* fake enum */ { | |||
&& defined(__OPTIMIZE__) && !defined(__OPTIMIZE_SIZE__) /* respect -O0 and -Os */ | |||
# pragma GCC push_options | |||
# pragma GCC optimize("-O2") | |||
# pragma GCC optimize("no-reorder-blocks-and-partition") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are all of these changes actually needed, or only this one (because it overrides with -O2
)? If it's only this one that's really necessary, I think we can just drop the GCC optimize usage here entirely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. Seems -O2 is not necessary here, we should remove this logic.
Thanks for the comment. I tried to run the php-version-benchmarks with PHP8.1, but got 2 problems:
Could you tell me how to run php-version-benchmarks with more loadings? From the README I only find run command : |
I'll update the benchmark to use GCC 10 both in Docker and on AWS. Then I'll benchmark this branch on AWS against the latest master commit your branch is based on (pingzhaozz@829f297). |
I was not able to observe any change when The results are available at https://gist.github.com/kocsismate/58853e9ab0faaebb6112287d880170ea
Why load is important here? The aim of my benchmark suite is to reveal performance changes even if they are subtle, but this is not suitable for load testing. |
Add The detail BOLT step is as below: Load test is to find out the hot codes which to do the BOLT later. |
Is there any comments from review? |
@pingzhaozz as per #8495 (comment), does it work if you remove all the |
@arnaud-lb |
But does it work if you don't add |
It's not. BOLT requires GCC option |
@@ -4,6 +4,7 @@ | |||
# pragma GCC push_options | |||
# pragma GCC optimize("no-gcse") | |||
# pragma GCC optimize("no-ivopts") | |||
# pragma GCC optimize("no-reorder-blocks-and-partition") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The flag -fno-reorder-blocks-and-partition
should be passed to CC through CFLAGS
in Makefile.
- Using this line here won't disable this optimization for other files.
- This disables useful optimization for usual PHP builds
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's possible to change PHP build system to optionally add -fno-reorder-blocks-and-partition
flag. It's also possible to pass it through CC
or CFLAGS
environment variables without any changes.
Because of this line:
|
This line shouldn't enable Anyway, we can't disable optimizations and make PHP slower to make BOLT work. In worst case, this new |
Let me go back to the beginning what happened when using BOLT. BOLT requires When I tried to use BOLT in PHP, I met the problem that some functions(such as those changed in this PR) doesn't work because of
which would overwrite the I agree that |
This doesn't talk anything about pragmas. I still think that something is wrong... It should be necessary only in ext/hash/xxhash/xxhash.h |
The link is the discussion about BOLT requires Since BOLT detects the hot code and move them, it depends on your workloads which functions will be hot. So it could be only xxhash.h in your case and like crc32.c in other cases. |
In ext/hash/xxhash/xxhash.h you can see |
…e it's for BOLT only. You can use EXTRA_CFLAGS="-DHAVE_BOLT ..." make -j to enable this. Signed-off-by: Ping Zhao <ping.zhao@intel.com>
Other files will use default GCC option settings after I updated code with |
Did you test it? There is a merge conflict... I've tried to test BOLT...
Segmentation fault (core dumped)
What did I do wrong? |
I think this workload is too small for perf to record enough data. I record perf data at least for 30s with
I checked both PHP 7.4.29 which I found this issue and PHP 8.1.4 which I submitted this PR, both works after change. I didn't find any tag for 8.2, could you tell me how to pull a 8.2 taged version? |
A quick check on PHP |
I understood my mistake. I run Now I followed your build instructions and it seems work out of the box (without this patch). I'm not sure if your patch is necessary now. What kind of error(s) do you see without the patch? |
The BOLTed binary will crash. |
BOLT method can bring PHP 3% and even more performance improvement. While BOLT requires CFLAGS ‘-fno-reorder-blocks-and-partition’ from GCC9. More information can be found at following link:
https://github.com/facebookincubator/BOLT
facebookarchive/BOLT#288
In GCC7, ‘reorder-blocks-and-partition’ is disable by default. So there’s no problem by default with GCC7.
In GCC10, ‘reorder-blocks-and-partition’ is enabled by default. We need set CFLAGS ‘
-fno-reorder-blocks-and-partition
’ when using BOLT. But there’s still a bug in code which ignore the CFLAGS with ‘GCC push_options
’ like below example:This patch is to fix this issue.