-
Notifications
You must be signed in to change notification settings - Fork 7.9k
hash: Support custom algo parameters #6400
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Poke on this one. Thanks. |
@m6w6 ping :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While all those FooInitArgs
shenanigans provide API compatibility they do not so for the ABI, but I think this is how we always did for X.y releases?
I wonder, though, we could have cut down the number of "provided" hash algos into half, it there was a "output bits" parameter for all those Foo<NBITS>
algos..., couldn't we? Maybe something to deprecate and provision in the next major...
LGTM, all in all.
Thanks for the reviews, i'll be working to address this. Regarding reducing the number of hashes - yep, it would've allowed to have say SHA, and pass the 1/256/512/etc. Other arbitrary nice things can be done with the custom args mechanism. Like, fe, the byteorder of the final hash could be reversed, etc. And of course, the necessary parts like passing a starting seed, will be made now possible to do. Thanks. |
I think this one is ready to be revaluated now. Thansk. |
7f4f324
to
579fd5e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This review is one-track :) It doesn't seem necessary to store the args
in the hashcontext object; instead they should be used only in hash context initialization. I scanned through and see no reason to store args
, even for cloning a hashcontext. There are benefits to concentrating all hash-algo-specific information in the hashcontext, and to keeping hashcontexts simpler.
Great point! I agree that leaving storage of arguments up to the hash implementation is better. I expect that in most cases no explicit storage will be needed, as is the case for murmur, where the extra args just affect how the state is initialized. |
🤦 of course, this makes a lot of sense! |
OK, lets get it in by that. Easier for me as i don't have to touch that test for now. I'm still sceptic as it's inconsistent and once it's out in stable, that inconsistency is engraved on stone. Still it's master, so it'll be clarified better when more hashes with custom args flow in. Thanks. |
579fd5e
to
e62e9f6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me. Only thing I would request is to rename $args
-> $options
in the public API, as that matches how we call such arguments elsewhere.
The concrete need on this change is to support passing an initial seed to the murmur hash. Passing a custom seed is important in terms of randomizing the hash function. The suggested implementation adds a HashTable parameter to all the init callbacks. Further on, an array with custom arguments is accepted from `hash` or `hash_init` from the user land. Currently several things like `hash_hkdf` are not touched, as they don't need passing custom args. Some convenience macros have been added to the SHA/MD families of functions, so the consuming code doesn't have to be changed widely. Special note on what has happened to `ext/hash/tests/hash_serialize_003.phpt` - the test gathers serialization strings produced on different platforms. As the implementation requires adding a new property to the hash context object, it changes the serialization string. However, it doesn't seem practical to repeat all the serialization strings again. The test is changed so it tests that the string can be unserialized on the same platform it was produced on at the runtime. Another way to implement this is to add another type of the init that would accept a HT with arguments. However, that would still require touching all the context structs in all the algos. That would also increase the size of those structs. As an init function is called just once, the way of modifying the existing init callback has been seen preferrable. Signed-off-by: Anatol Belski <ab@php.net>
Signed-off-by: Anatol Belski <ab@php.net>
Signed-off-by: Anatol Belski <ab@php.net>
- Deref passed seed arg - Don't dup args array Signed-off-by: Anatol Belski <ab@php.net>
Signed-off-by: Anatol Belski <ab@php.net>
Signed-off-by: Anatol Belski <ab@php.net>
Signed-off-by: Anatol Belski <ab@php.net>
de3c688
to
1271bff
Compare
As discussed, that's legacy and should not be touched. Signed-off-by: Anatol Belski <ab@php.net>
Signed-off-by: Anatol Belski <ab@php.net>
Signed-off-by: Anatol Belski <ab@php.net>
Co-authored-by: Nikita Popov <nikita.ppv@googlemail.com>
Co-authored-by: Nikita Popov <nikita.ppv@googlemail.com>
…`hash*()` functions > Hash > > The following functions `hash()`, `hash_file()`, and `hash_init()` now support an additional optional `options` argument, which can be used to pass algorithm specific data. Includes unit tests. Refs: * https://www.php.net/manual/en/migration81.new-features.php#migration81.new-features.hash * php/php-src#6400 * php/php-src@110b4e9
The concrete need on this change is to support passing an initial seed
to the murmur hash. Passing a custom seed is important in terms of
randomizing the hash function.
The suggested implementation adds a HashTable parameter to all the
init callbacks. Further on, an array with custom arguments is accepted
from
hash
orhash_init
from the user land. Currently several thingslike
hash_hkdf
are not touched, as they don't need passing customargs.
Some convenience macros have been added to the SHA/MD families of
functions, so the consuming code doesn't have to be changed widely.
Special note on what has happened to
ext/hash/tests/hash_serialize_003.phpt
- the test gathersserialization strings produced on different platforms. As the
implementation requires adding a new property to the hash context
object, it changes the serialization string. However, it doesn't seem
practical to repeat all the serialization strings again. The test is
changed so it tests that the string can be unserialized on the same
platform it was produced on at the runtime.
Another way to implement this is to add another type of the init that
would accept a HT with arguments. However, that would still require
touching all the context structs in all the algos. That would also
increase the size of those structs. As an init function is called just
once, the way of modifying the existing init callback has been seen
preferrable.
Signed-off-by: Anatol Belski ab@php.net