Change realpath cache hash algorithm to the regular string hash algorithm
Right now the FNV-1 algorithm is used to determine the realpath cache
key. For applications that are light-weight, but have lots of files
(e.g. WordPress), the realpath cache key computation shows up in the
Callgrind profile. The reason is that we do a simple byte-by-byte loop.
Furthermore, we always use the 32-bit prime and offset values, even in a
64-bit environment, which reduces the diffusion property of the hash.
This hinders the distribution of keys a bit (although probably not a lot
since we have only limited entries in the cache).
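
For illustration, a minimal sketch of a byte-by-byte FNV-1 loop with the
standard 32-bit offset basis and prime. The function name is hypothetical
and this is not the code being replaced, only the general shape of the
technique described above:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: plain 32-bit FNV-1, one byte per iteration. */
static uint32_t fnv1_32_sketch(const char *path, size_t path_len)
{
	uint32_t h = 2166136261u;            /* standard 32-bit FNV offset basis */
	const char *e = path + path_len;

	while (path < e) {
		h *= 16777619u;                  /* standard 32-bit FNV prime */
		h ^= (unsigned char) *path++;    /* FNV-1: multiply first, then XOR */
	}
	return h;
}
```

Each iteration depends on the result of the previous multiply, so the
cost grows linearly with the path length.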
I propose to switch to our regular string hashing algorithm, which is
better optimised than a byte-by-byte loop, and has better diffusion on
64-bit systems.
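
As a rough reference, the core recurrence of a "times 33 with addition"
hash (commonly written DJBX33A) looks like the sketch below. The function
name is hypothetical, and the real string hash is more optimised than
this (for example by unrolling the loop), so treat it only as an
illustration of the recurrence:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: un-optimised times-33-with-addition recurrence. */
static uint64_t djbx33a_sketch(const char *str, size_t len)
{
	uint64_t h = 5381;                   /* conventional DJB start value */
	size_t i;

	for (i = 0; i < len; i++) {
		/* h * 33 + c, written as a shift and add */
		h = ((h << 5) + h) + (unsigned char) str[i];
	}
	return h;
}
```

Because the accumulator is as wide as the machine word, the result uses
the full 64 bits on a 64-bit system.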
I don't know why FNV-1 was chosen over the DJBX33A algorithm we use for
normal string hashing. Also, I don't know why FNV-1a wasn't chosen
instead of FNV-1, which would be a simple modification and would
distribute the hashes better than FNV-1.
The only thing I can think of is that FNV-1a typically has a better
distribution than DJBX33A-style algorithms like the one we use for string
hashing [1]. But I doubt that makes a difference here, and if it does then
we should perhaps look into changing the string hash algorithm from
DJBX33A to FNV-1a.
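
To show how small the FNV-1 to FNV-1a modification mentioned above is,
here is a sketch of the FNV-1a variant with the standard 32-bit
constants. The function name is hypothetical; the only difference from
FNV-1 is that the XOR happens before the multiply:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 32-bit FNV-1a (XOR first, then multiply). */
static uint32_t fnv1a_32_sketch(const char *path, size_t path_len)
{
	uint32_t h = 2166136261u;            /* standard 32-bit FNV offset basis */
	const char *e = path + path_len;

	while (path < e) {
		h ^= (unsigned char) *path++;    /* FNV-1a: XOR the byte in first */
		h *= 16777619u;                  /* then multiply by the FNV prime */
	}
	return h;
}
```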
[1] https://softwareengineering.stackexchange.com/questions/49550/which-hashing-algorithm-is-best-for-uniqueness-and-speed