Skip to content

PHPC-2206: Implement BSON\Value class #1412

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 44 commits into from
Jun 15, 2023

Conversation

alcaeus
Copy link
Member

@alcaeus alcaeus commented Apr 21, 2023

PHPC-2206

This PR introduces the MongoDB\BSON\Value class as a wrapper for bson_value_t. The Document, PackedArray, and Iterator classes will be updated to return value instances instead of native values. This allows for a better handling of BSON types whenever raw BSON is consumed. For example, by using the Value class, users are able to differentiate between a 32-bit integers and 64-bit integers, even when the value does not fall in the 64-bit range.

The class contains a generic getValue method that will return the value contained in the Value instance. For documents and arrays, this returns Document and PackedArray instances respectively. For BSON types, it returns the corresponding MongoDB\BSON\Type instance, and for primitives it will return the PHP value.

In addition to the getValue method, it comes with a set of is* and get* methods for each BSON type. For added convenience, this includes accessors for integers (if one does not need to differentiate between 32-bit and 64-bit values), as well as numbers in general (if one does not need to differentiate between integers and doubles).

The PR also extracts some previously static factories from phongo_bson.c to the various BSON source files, as they are now used in multiple places. This will eventually be extracted into its own pull request. For the time being, the PR does not change the encoding and decoding of bson_value_t. As is, a bson_value_t is converted to a zval by adding it to an empty document, converting the document and extracting the relevant key. When converting a zval to bson_value_t this is handled the same way. Ideally, we'd skip this round-tripping and work with the bson_value_t directly. I'd like to complete this in a separate pull request as this is a larger effort if we want to avoid code duplication.

TODO:

  • Return Value instances when accessing document offsets
  • Return Value instances when accessing array offsets
  • Return Value instances in BSON iterator
  • Refactor encoding and decoding of bson_value_t instances.

@alcaeus alcaeus self-assigned this Apr 21, 2023
@alcaeus alcaeus force-pushed the phpc-2206-bson-value branch 2 times, most recently from 2bc452c to 0e42526 Compare April 26, 2023 06:38
@alcaeus alcaeus force-pushed the phpc-2206-bson-value branch from 6372cc3 to 836c393 Compare May 12, 2023 08:31
@alcaeus alcaeus force-pushed the phpc-2206-bson-value branch 2 times, most recently from bcd7cb3 to 5e0a49c Compare May 24, 2023 10:35
@alcaeus alcaeus marked this pull request as ready for review May 24, 2023 11:01
@alcaeus alcaeus requested a review from jmikola May 24, 2023 11:01
@alcaeus alcaeus force-pushed the phpc-2206-bson-value branch 2 times, most recently from af94ce0 to c9a0a80 Compare June 12, 2023 11:58
@alcaeus alcaeus requested a review from jmikola June 12, 2023 13:21
Copy link
Member

@jmikola jmikola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few questions and some things to follow up on, but I think we're all set here.

Enjoy your customary Gordon Ramsey GIF:

.

final public function serialize(): string {}

/** @param string $serialized */
final public function unserialize($serialized): void {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize this is consistent with every other stub, but why do we have an @param annotation here?

The interface has a string type since php/php-src@273731f#diff-c8adc7ea754be4705d22956277447f2a8cdcd19e3185caa1513391d864a82384R44

It had no type hint in 7.4: https://github.com/php/php-src/blob/dca8d5565b947270a29f232bc54efd9df3b92b94/Zend/zend_interfaces.c#L572-L580

Given that, should we be using #if PHP_VERSION_ID >= 80000 to conditionally declare a type hint? If so, can you create a ticket for that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is due to the stub tool (gen_stub.php), which complains about parameters without types. Each parameter either needs to have a declared type or a corresponding @param annotation. The same goes for return types, where a return type or @return annotation are necessary.


if (!mongoc_client_encryption_decrypt(clientencryption->client_encryption, &ciphertext, &value, &error)) {
phongo_throw_exception_from_bson_error_t(&error);
goto cleanup;
}

if (!php_phongo_bson_value_to_zval(&value, zvalue)) {
/** Use the legacy decoder to return PHP types instead of BSON types */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

☝️ Outstanding question


bson = bson_new_from_data(intern->value.value.v_doc.data, intern->value.value.v_doc.data_len);

phongo_packedarray_new(return_value, bson, false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for not following up here. A separate ticket SGTM to we don't lose track of this.

@@ -22,6 +22,10 @@
#include <Zend/zend_string.h>
#include <Zend/zend_portability.h>

/* Include stdbool.h as it might not have been implicitly loaded yet */
#include <stdbool.h>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still relevant? I don't see bool used anywhere in this file. Just zend_bool.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right - this was left over from a previous version. I've removed it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correction: this is still necessary, at least on some versions of PHP. With PHP 8 or newer, zend_bool was changed to use a standard bool instead of a unsigned char (see php/php-src@4b77a15). This adds an include for stdbool.h in zend_types.h, which we include in phongo_compat.h.

We can either leave the include in place here (to ensure we have stdbool.h available when needed on PHP < 8) or include it explicitly in every file that makes use of a bool type. I'm fine leaving it in phongo_compat.h for the time being.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If php/php-src@4b77a15 is the first place zend_bool is defined, I don't understand how stdbool.h would not already be included by that point. We're already including Zend/zend_types.h at the top of this file.

I remember encountering this before but I never fully wrapped my head around it.

OK to leave as-is but we can consider opening a ticket to get to the bottom of it later.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For PHP versions before 8.0 zend_types.h does not include stdbool.h, triggering the error there. Given that, I think including it in phongo_compat.h makes sense.

@@ -921,7 +921,8 @@ static void phongo_clientencryption_encrypt(php_phongo_clientencryption_t* clien
goto cleanup;
}

if (!php_phongo_bson_value_to_zval(&ciphertext, zciphertext)) {
/** Use the legacy decoder to return PHP types instead of BSON types */
if (!phongo_bson_value_to_zval_legacy(&ciphertext, zciphertext)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

☝️ Outstanding question

alcaeus added 22 commits June 15, 2023 08:41
This includes a small change to return PHP objects for compatibility in ClientEncryption::decrypt and ClientEncryption::encrypt
With the new logic, debug output will always show raw BSON data and PHP data for documents and packed arrays. Serialised data will only contain raw BSON data. This also changes Int64 serialisation to always serialise as Int64 instances, regardless of value or platform.
@alcaeus alcaeus force-pushed the phpc-2206-bson-value branch from c9a0a80 to d1eaf0f Compare June 15, 2023 06:41
alcaeus added 2 commits June 15, 2023 08:47
Since the encrypted data will always be a BSON binary object, there is no reason to use the legacy function to avoid receiving a Document or PackedArray instance.
@alcaeus alcaeus force-pushed the phpc-2206-bson-value branch from d1eaf0f to d96bc6c Compare June 15, 2023 06:47
@alcaeus
Copy link
Member Author

alcaeus commented Jun 15, 2023

Ignoring Windows test failures for the time being. There were successful Windows builds on PHP 7.2 (which is usually the most critical version). That said, we'll have to revisit these tests as they are quite fragile (usually due to the same timeout when starting MongoDB).

@alcaeus alcaeus merged commit 358fd2f into mongodb:master Jun 15, 2023
@alcaeus alcaeus deleted the phpc-2206-bson-value branch June 15, 2023 07:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants