
Small change SIMD codes #18626


Open: wants to merge 3 commits into base: master


SakiTakamachi
Member

@SakiTakamachi SakiTakamachi commented May 23, 2025

This PR maps the SSE2 API to NEON using zend_simd.h, so the existing SIMD code can also be used in NEON environments.
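To illustrate the general idea of expressing one SIMD operation for both instruction sets, here is a minimal sketch. It is not taken from zend_simd.h or from this PR; the helper names `block16_is_ascii` and `sketch_is_ascii` are hypothetical, and the sketch assumes a compiler that defines `__SSE2__` or `__aarch64__` on the respective targets. It checks 16 bytes at a time for non-ASCII bytes, a typical first step in fast UTF-8 validation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
# include <emmintrin.h>
/* SSE2: movemask collects the top bit of each of the 16 bytes. */
static bool block16_is_ascii(const uint8_t *p)
{
    __m128i v = _mm_loadu_si128((const __m128i *) p);
    return _mm_movemask_epi8(v) == 0;
}
#elif defined(__aarch64__)
# include <arm_neon.h>
/* NEON (AArch64): the block is ASCII if its largest byte is below 0x80. */
static bool block16_is_ascii(const uint8_t *p)
{
    uint8x16_t v = vld1q_u8(p);
    return vmaxvq_u8(v) < 0x80;
}
#else
/* Scalar fallback: OR all bytes together and test the top bit. */
static bool block16_is_ascii(const uint8_t *p)
{
    uint8_t acc = 0;
    for (int i = 0; i < 16; i++) {
        acc |= p[i];
    }
    return acc < 0x80;
}
#endif

/* Hypothetical driver: whole 16-byte blocks first, then a scalar tail. */
static bool sketch_is_ascii(const uint8_t *s, size_t len)
{
    size_t i = 0;
    for (; i + 16 <= len; i += 16) {
        if (!block16_is_ascii(s + i)) {
            return false;
        }
    }
    for (; i < len; i++) {
        if (s[i] >= 0x80) {
            return false;
        }
    }
    return true;
}
```

A translation header removes the need for per-call-site `#if` branches like the ones above: the calling code is written once against a single API, and the header selects the backend.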

@SakiTakamachi SakiTakamachi changed the title Use zend_simd.h 2 Optimize SIMD codes May 23, 2025
@SakiTakamachi SakiTakamachi changed the title Optimize SIMD codes Small change SIMD codes May 23, 2025
@SakiTakamachi SakiTakamachi marked this pull request as ready for review May 23, 2025 10:25
@youkidearitai
Contributor

@SakiTakamachi Thank you very much for updating the SIMD code. We need to confirm that it improves performance on aarch64 devices. I will test it on aarch64. Just a moment, please.

cc @alexdowad

@youkidearitai
Contributor

Please see #11076, where I tried using NEON.

Hmm... UTF-8 validation seems to slow down on a Raspberry Pi 4B.
Using my code: https://gist.github.com/youkidearitai/7cd8771f6f6e40e21708129707b40204

configure

#! /bin/sh
#
# Created by configure

CFLAGS='-fsanitize=undefined,address -DZEND_TRACK_ARENA_ALLOC' \
CC='clang-14' \
CXX='clang++-14' \
'./configure' \
'--enable-mbstring' \
'--prefix=/home/tekimen/tekimen-php' \
'--enable-debug' \
'--enable-intl' \
'--enable-werror' \
"$@"

CPU

$ sapi/cli/php long-utf-8-bench.php
bool(true)
time: 29.112022218
speed: 48.289328356269 MB/s

Using XSSE (this PR)

$ sapi/cli/php long-utf-8-bench.php
bool(true)
time: 104.213115168
speed: 13.489664882714 MB/s

@youkidearitai
Contributor

Maybe my environment is slow because I added --enable-debug, so I am trying again without --enable-debug.

@SakiTakamachi
Member Author

@youkidearitai

I used your benchmark code to run measurements on my M2 environment. It appears that the code utilizing NEON performs approximately 1.7 times better.

Please note that the M2 has very strong SIMD capabilities, so the results might be better than what you would typically see on a standard ARM machine.

./configure --disable-debug --disable-all --enable-mbstring
Benchmark 1: /php-dev2/sapi/cli/php /mount/mb/long-utf-8-bench.php
  Time (mean ± σ):     574.2 ms ±   3.8 ms    [User: 349.2 ms, System: 222.1 ms]
  Range (min … max):   570.4 ms … 582.9 ms    10 runs
 
Benchmark 2: /master/sapi/cli/php /mount/mb/long-utf-8-bench.php
  Time (mean ± σ):     982.7 ms ±   5.2 ms    [User: 751.6 ms, System: 228.4 ms]
  Range (min … max):   974.8 ms … 989.3 ms    10 runs
 
Summary
  '/php-dev2/sapi/cli/php /mount/mb/long-utf-8-bench.php' ran
    1.71 ± 0.01 times faster than '/master/sapi/cli/php /mount/mb/long-utf-8-bench.php'

@alexdowad
Contributor

Thanks very much!
I hope to look at zend_simd.h before leaving comments on this PR. In the meantime, thanks very much for your work.

@youkidearitai
Contributor

I confirmed the performance improvement with a release build.

XSSE

$ sapi/cli/php long-utf-8-bench.php
bool(true)
time: 2.231191191
speed: 630.06702682881 MB/s

non-SIMD (master dfff6ac)

$ sapi/cli/php long-utf-8-bench.php
bool(true)
time: 4.667661884
speed: 301.1786275306 MB/s

This seems to be about 2x faster than non-SIMD on a Raspberry Pi 4B.
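For reference, the ratio can be computed directly from the reported wall times. The helper below is a hypothetical sketch, not part of the benchmark script. With the release-build times above (4.667661884 s for non-SIMD, 2.231191191 s for XSSE) it gives roughly 2.09. With the earlier debug-build times (29.112022218 s against 104.213115168 s) it gives roughly 0.28, which is a slowdown.

```c
/* Hypothetical helper: ratio of baseline wall time to candidate wall time.
 * A result above 1.0 means the candidate is faster than the baseline. */
static double speedup(double baseline_seconds, double candidate_seconds)
{
    return baseline_seconds / candidate_seconds;
}
```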


3 participants