Skip to content

Optimize levenshtein a bit for memory usage #13830

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 2, 2024

Conversation

nielsdos
Copy link
Member

@nielsdos nielsdos commented Mar 28, 2024

When all costs are equal, levenshtein fulfills the requirements of being a metric. A metric is symmetric, so we can swap the strings in that case. Since we use rows of a partial matrix of length |string2| we can make the choice of using string1 instead if |string1| < |string2|, which will optimize memory usage and CPU time.

Sample bench script:

<?php
for ($i = 0;$i<2000;$i++)
    levenshtein(str_repeat('a', 100), str_repeat('b', 1000), 1, 1, 1);

results:

Benchmark 1: ./sapi/cli/php x.php 
  Time (mean ± σ):     254.5 ms ±   1.7 ms    [User: 251.5 ms, System: 2.2 ms]
  Range (min … max):   252.5 ms … 257.4 ms    11 runs
 
Benchmark 2: ./sapi/cli/php_old x.php
  Time (mean ± σ):     292.0 ms ±   2.7 ms    [User: 289.8 ms, System: 1.8 ms]
  Range (min … max):   287.4 ms … 295.6 ms    10 runs
 
Summary
  ./sapi/cli/php x.php  ran
    1.15 ± 0.01 times faster than ./sapi/cli/php_old x.php

When all costs are equal, levenshtein fulfills the requirements
of being a metric. A metric is symmetric, so we can swap the strings in
that case. Since we use rows of a partial matrix of length |string2| we
can make the choice of using string1 instead if |string1| < |string2|,
which will optimize memory usage and CPU time.
@nielsdos nielsdos force-pushed the optimize-levenshtein branch from fc04ae6 to 6e190f0 Compare March 28, 2024 22:22
Copy link
Member

@bukka bukka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't thought too much about this but it makes sense to me.

@nielsdos nielsdos merged commit 04e0d80 into php:master Apr 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants