Skip to content

Add hash_hkdf() to ext/hash #1105

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from
Closed

Add hash_hkdf() to ext/hash #1105

wants to merge 4 commits into from

Conversation

narfbg
Copy link
Contributor

@narfbg narfbg commented Feb 20, 2015

This patch adds a HMAC-based Key Derivation Function (HKDF), as defined by RFC 5869.

It has its use-cases in encryption algorithms, and is very similar to PBKDF2, except it is not suitable for password storage and its inputs are supposed to be cryptographically strong and already suitable for usage as encryption keys.
The most common use case for HKDF would be to deriving a pair of keys for encrypt-and-HMAC when you already have a single key. And that is useful in PHP mostly for two reasons:

  • We don't yet have authenticated encryption modes (at least not outside of unstable PECL extensions with poor adoption rates).
  • It is significantly faster than PBKDF2, making it more suitable for usage in web applications.

Here's an example usage scenario:

  1. Derive a cryptographically strong $inputKey via PBKDF2 from a user password input.
  2. Derive a pair of keys from $inputKey via HKDF: $hmacKey, $encryptionKey
  3. You can now encrypt and HMAC with the produced keys, asking only for a single password from your users.

I can link to other (but similar) use cases with my own userland code if necessary, but I believe that this is enough. I hope that this can get merged before PHP7's feature freeze, if at all.

@narfbg
Copy link
Contributor Author

narfbg commented Feb 20, 2015

I'll see what the Travis failure is about ... didn't get that in my environment.

@narfbg
Copy link
Contributor Author

narfbg commented Feb 21, 2015

Build is still failing, but due to unrelated bug(s).

The 'ENABLE_MAINTAINER_ZTS=1 ENABLE_DEBUG=1' build in particular shows hash_hkdf_basic test as failed, with the following diff:

========DIFF========
043+ fnv164: 9e8849af1a6d29acjoaat: 1bde5739
043- fnv164: 9e8849af1a6d29ac
044- joaat: 1bde5739

But I've got no idea why this is happening, since the lines outputting it are the following:

echo 'fnv164: ', bin2hex(hash_hkdf('fnv164', $ikm)), "\n";
echo 'joaat: ', bin2hex(hash_hkdf('joaat', $ikm)), "\n";

So such an output is quite puzzling for me ... some kind of a parser bug?

}

if (salt != NULL && salt->len > 0) {
computed_salt = safe_emalloc(salt->len, salt->len, 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be just emalloc(salt->len).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok ... does it have to be the same for prk on line 666 as well?

@jpauli jpauli added the Feature label Feb 23, 2015
@sarciszewski
Copy link
Contributor

I would love to see this. @narfbg can you rebase on master and see if the tests pass?

@narfbg
Copy link
Contributor Author

narfbg commented May 15, 2015

I will, but I'm afraid feature freeze blocks it ...

@krakjoe
Copy link
Member

krakjoe commented Jan 4, 2017

Security sensitive code requires the attention of all the programmers we have that are experienced in this area, they simply don't read this issue tracker. What I'm going to do is label this as an RFC.

Please take this action as encouragement to start an internals discussion (and accompanying RFC), if it emerges from the discussion that an RFC vote is not necessary, then I am comfortable merging it. I would start with that to attract the right kind of attention.

I'm very sorry that this sat here for so long, there's nothing I can do about that now, except push you forward today :)

@krakjoe krakjoe added RFC and removed Feature labels Jan 4, 2017
@narfbg
Copy link
Contributor Author

narfbg commented Jan 5, 2017

I'm pretty sure a handful of internals devs saw it at the time, as I asked a few people if this would need an RFC. IIRC, @ircmaxell did a quick review too.

In hindsight, I should've at least mailed internals@, indeed. Will do after I get the patch up to date.

But I'm also thinking of renaming it to just hkdf(), as it only has applications as a KDF and the hash_ prefix may imply something else to people. Not sure if it even has to remain in ext/hash. Thoughts?

@krakjoe
Copy link
Member

krakjoe commented Jan 5, 2017

It certainly is the sort of thing that can be merged without an RFC, since an RFC is not guaranteed to create the right kind of attention either, it's just that an internals discussion normally will, but will also lead to unclear consensus, so RFC is then required. As mentioned, no problem merging if I can get a thumbs up from a few people in the know. Internals communication is likely to be enough, I should hope.

You should probably bring up the question of naming on internals also, I'm afraid I'm too ignorant to give you good advice on this ... sorry about that.

@paragonie-scott
Copy link
Contributor

As mentioned, no problem merging if I can get a thumbs up from a few people in the know.

Consider this mine: 👍

@krakjoe krakjoe self-assigned this Jan 6, 2017
@nikic
Copy link
Member

nikic commented Jan 6, 2017

Please keep the hash_ prefix. This function is part of the hash extension, so it should have a hash prefix. Also, we have a hash_pbkdf2 function.

@nikic nikic self-requested a review January 6, 2017 17:41
Copy link
Member

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of general implementation notes... I haven't reviewed the crypto yet.

RETURN_FALSE;
}

if (ikm->len <= 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be ZSTR_LEN

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also checking for negative is not necessary here (in PHP 7 the length is size_t).

RETURN_FALSE;
}

if (salt != NULL && salt->len > 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again ZSTR_LEN + ZSTR_VAL here and on the following lines.

for (i = 0; i < ops->digest_size; i++)
{
computed_salt[i] = 0x00;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

memset might be more elegant here.

ops->hash_init(context);
K = emalloc(ops->block_size);
php_hash_hmac_prep_key(K, ops, context, computed_salt, (salt != NULL && salt->len ? salt->len : ops->digest_size));
prk = safe_emalloc(ops->digest_size, ops->digest_size, 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is allocating memory for digest_size*digest_size bytes. Is that right?

// Expand
returnval = zend_string_alloc(length, 0);
digest = emalloc(ops->digest_size);
for (i = 1, rounds = ceil((float) length / (float) ops->digest_size); i <= rounds; i++) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using floats here may result in rounding errors for large lengths. Please use ceil integer division instead.

@krakjoe krakjoe assigned nikic and unassigned krakjoe Jan 6, 2017
@narfbg
Copy link
Contributor Author

narfbg commented Jan 10, 2017

Please keep the hash_ prefix. This function is part of the hash extension, so it should have a hash prefix. Also, we have a hash_pbkdf2 function.

That was also my reasoning for naming it hash_hkdf() initially. But ideally, it should've been a part of a more generic crypto extension, which PHP currently doesn't have.

Unlike the other hash_ functions, its end goal is KDF and KDF only; especially unlike PBKDF2 - it's not suitable for e.g. password hashing. So removing the hash_ prefix is a way of discouraging misuse. Disagree?

I don't mind keeping the prefix that much, but I intended to ask that question on internals@. I'll have to squash those commits anyway after everything else is given the thumbs-up. :)

Edit: Forgot to mention that I applied the other requested changes ... Used ZSTR_LEN(), ZSTR_VAL() wherever a string input param was used; the safe_emalloc() call was indeed wrong - changed that; unsure about whether I did the ceil() replacement correctly.

@narfbg
Copy link
Contributor Author

narfbg commented Jan 11, 2017

Don't know why the AppVeyor test (1/2) is failing, seems unrelated.

@krakjoe
Copy link
Member

krakjoe commented Jan 11, 2017

@narfbg confirm, the failing CI is not connected to this change ...

@ircmaxell
Copy link
Contributor

@narfbg That's very true, however I would also argue that the KDFs specified here are all built off of hash primitives (hence the name HKDF), hence the connection to hash_*.

If we implemented cipher-based KDFs (though relatively rare compared to PBKDF and HKDF), those would go on the crypto extension, since there is no native crypto in PHP (and a cipher-based KDF would depend on a cipher).

@narfbg
Copy link
Contributor Author

narfbg commented Jan 11, 2017

Yea, I get that. Since you all are repeating basically the same argument, I guess we're learning towards the hash_ prefix.

I stumbled upon something else though - the entire thing blows up if you input a length lower than 8 (but higher than 0, as there's a check for that). And I don't know what causes it ... can anybody reproduce and/or help with this?

@nikic
Copy link
Member

nikic commented Jan 11, 2017

@narfbg Doing a wild guess, you're allocating the return value with precise length and then populating it in block size increments, so you're writing past the end of the allocation. Running under USE_ZEND_ALLOC=0 valgrind will help identify issues.

@narfbg
Copy link
Contributor Author

narfbg commented Jan 11, 2017

@nikic Indeed, that was it. Thanks.

@krakjoe
Copy link
Member

krakjoe commented Jan 11, 2017

Since @ircmaxell didn't say it's terrible, and @paragonie-scott has given us thumbs up, and @nikic approved the request ... I think we don't need an RFC, anyone disagree ?

@narfbg
Copy link
Contributor Author

narfbg commented Jan 11, 2017

Renamed back to hash_hkdf(), rebased against master again and squashed all the commits into 1.

@weltling
Copy link
Contributor

Keeping hash_ prefix makes pretty much sense. It is consistent with the current scheme and reduces the conflict potential with possible userspace implementations. IMO no RFC is required, as the functionality is missing in PHP and there are no current plans or implementation for a dedicated extension. Also the subject specific implementation seems to be approved.I'd say, it were safer not to merge into 7.0, still because of a naming conflict potential. But otherwise it might be ok in 7.1 with RMs decision or anyway in master. A doc patch were very desirable, of course :)

Thanks.

@weltling
Copy link
Contributor

@narfbg oh, either you're too fast in renaming or i'm too slow in typing messages :)

@narfbg
Copy link
Contributor Author

narfbg commented Jan 11, 2017

Likely both. :)

I'm not really sure which target version does master equal ... Guessing 7.2?

Copy link
Member

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more nits.

Checked this against the RFC, everything seems fine. We could optimize it a bit (e.g. hmac key reuse), but probably not worthwhile.


ops = php_hash_fetch_ops(ZSTR_VAL(algo), ZSTR_LEN(algo));
if (!ops) {
php_error_docref(NULL, E_WARNING, "Unknown hashing algorithm: %s", ZSTR_VAL(algo), ZSTR_LEN(algo));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The len argument is too much.

RETURN_FALSE;
}

if (salt != NULL && ZSTR_LEN(salt) > 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The need to copy the salt here is a bit awkward ... I would suggest to instead directly pass salt ? ZSTR_VAL(salt) : "", salt ? ZSTR_LEN(salt) : 0 to hmac_prep_key.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried this suggestion and it triggers failures ...

Plus, is that really the same as using str_repeat("\x0", DIGEST_SIZE)? Can't really find it by just skimming through the RFC, but that's what it's supposed to be, similarly to missing IVs in encryption.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It oughta be... https://github.com/narfbg/php-src/blob/9f3e4c6a3fbf3784d352882932b5dd3117016698/ext/hash/hash.c#L209 memsets K to 0 and the copies over the key. If the key has zero length it should leave K at all zero. As such not sure what is going wrong there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, seems logical but it doesn't work for some reason. I even tried this:

    if (salt != NULL && ZSTR_LEN(salt) > 0) {
            php_hash_hmac_prep_key(K, ops, context, ZSTR_VAL(salt), ZSTR_LEN(salt));
    }
    else {
            memset(K, 0x36, ops->block_size);
    }

... which should be the same end result. It doesn't crash and produces correct outputs, but the "basic functionality" test fails like this:

*** Testing hash_hkdf(): basic functionality ***
md2: 87779851d2377dab25da16fd7aadfdf5
md4: 422c6bd8dd2a6baae8abadef618c3ede
md5: 98b16391063ecee006a3ca8ee5776b1e
sha1: a71863230e3782240265126a53e137af6667e988
sha224: 51678ceb17e803505187b2cf6451c30fbc572fda165bb69bbd117c7a
sha256: d8f0bede4b652933c32a92eccf7723f7eeb4701744c81325dc3f0fa9fda24499
sha384: f600680e677bb417a7a22a4da8b167c0d91823a7a5d56a49aeb1838bb2320c05068d15d6d980824fee542a279d310c3a
sha512:
Warning: hash_hkdf(): Input keying material cannot be empty in ...

The IKM has nothing to do with that, so apparently, subsequent calls are affected some how ... I don't get it. Weird.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does valgrind say anything useful?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm new to that, but it did ... Problem now is that the fnv164 result doesn't match.

That's not a cryptographic function, so it shouldn't be used with HKDF anyway, but the target results in that test case were computed by my userland HKDF implementation, which I'm pretty confident in ...

I'll push the updated code before we go on with that.

ops->hash_init(context);
K = emalloc(ops->block_size);
php_hash_hmac_prep_key(K, ops, context, computed_salt, (salt != NULL && ZSTR_LEN(salt) ? ZSTR_LEN(salt) : ops->digest_size));
prk = safe_emalloc(ops->block_size, 1, 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be just emalloc(ops->block_size). Multiplying by 1 is always safe ^^

php_hash_string_xor_char(K, K, 0x6A, ops->block_size);
php_hash_hmac_round(digest, ops, context, K, digest, ops->digest_size);
memcpy(
returnval->val + ((i - 1) * ops->digest_size),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

->val leftover here and on L699.

@nikic
Copy link
Member

nikic commented Jan 11, 2017

If @krakjoe is fine with it, I would go with 7.1 for this.

@krakjoe
Copy link
Member

krakjoe commented Jan 11, 2017

👍 for 7.1

@krakjoe krakjoe added Feature and removed RFC labels Jan 11, 2017
@narfbg
Copy link
Contributor Author

narfbg commented Jan 11, 2017

valgrind output that I don't understand: https://gist.github.com/narfbg/901f9093075879430d2a58e8b0b3888d

memset(K, 0x36, ops->block_size);
}

prk = emalloc(ops->block_size);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be ops->digest_size?

Copy link
Contributor Author

@narfbg narfbg Jan 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, though it doesn't seem to change anything.

Edit: Actually, it did change the valgrind output - updated that too.

@nikic
Copy link
Member

nikic commented Jan 12, 2017

prepare_key assumes that block_size >= digest_size. For FNV164 we define digest_size=8 and block_size=4. I don't know whether the data is wrong or this hash is just unusable with hmac (in which case it should be forbidden).

@narfbg
Copy link
Contributor Author

narfbg commented Jan 12, 2017

I did some research and it turns out it is unusable, regardless of the block size which is also wrong - no FNV variation divides in blocks ... technically. It simply processes one byte at a time.

Also found Rust and Go libraries that define its block size as 1.

Regardless of the incorrect block size, known test vectors (an IETF Draft - defines only FNV-1a vectors; source code written by the N in FNV) match the results produced by hash().

If we'd have to explicitly blacklist FNV here, I'd say let's also blacklist all non-cryptographic hash functions ... Doing this for hash_hmac() should probably wait for 7.2 though.

Edit: We don't have a flag to mark hashes as a (non-)cryptographic, do we? :/

@nikic
Copy link
Member

nikic commented Jan 12, 2017

If we'd have to explicitly blacklist FNV here, I'd say let's also blacklist all non-cryptographic hash functions ... Doing this for hash_hmac() should probably wait for 7.2 though.

Edit: We don't have a flag to mark hashes as a (non-)cryptographic, do we? :/

Sounds reasonable. For this PR you can just add a function which explicitly checks against a blacklist. For master it would be good to extend the hash_ops struct with a field for whether the hash is cryptographic and then check against that in hmac, pbkdf2 and hkdf.

@narfbg
Copy link
Contributor Author

narfbg commented Jan 12, 2017

I suppose I should change the PR target to PHP-7.1 then?

@krakjoe
Copy link
Member

krakjoe commented Jan 12, 2017

@narfbg Don't rebase on current 7.1 branch ... I messed it up ... http://news.php.net/php.internals/97725

@nikic
Copy link
Member

nikic commented Jan 12, 2017

No need to explicitly rebase, we can easily do that while merging.

@nikic
Copy link
Member

nikic commented Jan 14, 2017

Merged via 4bf7ef0 into PHP-7.1. This will be available in PHP 7.1.2. Thanks!

Do you plan to submit a PR for the blacklisting of non-crypto hashes for HMAC/PBKDF2 for master?

@nikic nikic closed this Jan 14, 2017
@narfbg
Copy link
Contributor Author

narfbg commented Jan 16, 2017

@nikic I do, will submit it shortly. And thank you as well for helping me with this. :)

@krakjoe
Copy link
Member

krakjoe commented Jan 16, 2017

Excellent stuff peepz #teamwork #winning

@narfbg
Copy link
Contributor Author

narfbg commented Jan 16, 2017

See #2312 for the flag-based blacklist.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants