-
Notifications
You must be signed in to change notification settings - Fork 266
PHPLIB-1129: Improve pipeline checks for replace operations #1098
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -191,10 +191,19 @@ public function __construct(string $databaseName, string $collectionName, array | |
throw InvalidArgumentException::invalidType(sprintf('$operations[%d]["%s"][1]', $i, $type), $args[1], 'array or object'); | ||
} | ||
|
||
// Treat empty arrays as replacement documents for BC | ||
if ($args[1] === []) { | ||
$args[1] = (object) $args[1]; | ||
} | ||
|
||
if (is_first_key_operator($args[1])) { | ||
throw new InvalidArgumentException(sprintf('First key in $operations[%d]["%s"][1] is an update operator', $i, $type)); | ||
} | ||
|
||
if (is_pipeline($args[1], true /* allowEmpty */)) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Interesting. How does PhpStorm decide to show the second parameter name but not the first ( There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The hint is not shown when a variable is passed, it's shown for static values only. |
||
throw new InvalidArgumentException(sprintf('$operations[%d]["%s"][1] is an update pipeline', $i, $type)); | ||
} | ||
|
||
if (! isset($args[2])) { | ||
$args[2] = []; | ||
} | ||
|
@@ -229,7 +238,7 @@ public function __construct(string $databaseName, string $collectionName, array | |
} | ||
|
||
if (! is_first_key_operator($args[1]) && ! is_pipeline($args[1])) { | ||
throw new InvalidArgumentException(sprintf('First key in $operations[%d]["%s"][1] is neither an update operator nor a pipeline', $i, $type)); | ||
throw new InvalidArgumentException(sprintf('Expected update operator(s) or non-empty pipeline for $operations[%d]["%s"][1]', $i, $type)); | ||
} | ||
|
||
if (! isset($args[2])) { | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -295,6 +295,14 @@ private function createCommandDocument(): array | |
if (isset($this->options['update'])) { | ||
/** @psalm-var array|object */ | ||
$update = $this->options['update']; | ||
/* A non-empty pipeline will encode as a BSON array, so leave it | ||
* as-is. Cast anything else to an object since a BSON document is | ||
* likely expected. This includes empty arrays, which historically | ||
* can be used to represent empty replacement documents. | ||
* | ||
* This also allows an empty pipeline expressed as a PackedArray or | ||
* Serializable to still encode as a BSON array, since the object | ||
* cast will have no effect. */ | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I wanted to add a comment explaining this conditional cast after diving into it in #1098 (comment). |
||
$cmd['update'] = is_pipeline($update) ? $update : (object) $update; | ||
} | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -213,7 +213,22 @@ function is_first_key_operator($document): bool | |
} | ||
|
||
/** | ||
* Returns whether an update specification is a valid aggregation pipeline. | ||
* Returns whether the argument is a valid aggregation or update pipeline. | ||
* | ||
* This is primarily used for validating arguments for update and replace | ||
* operations, but can also be used for validating an aggregation pipeline. | ||
* | ||
* The $allowEmpty parameter can be used to control whether an empty array | ||
* should be considered a valid pipeline. Empty arrays are generally valid for | ||
* an aggregation pipeline, but the things are more complicated for update | ||
* pipelines. | ||
* | ||
* Update operations must prohibit empty pipelines, since libmongoc may encode | ||
* an empty pipeline array as an empty replacement document when writing an | ||
* update command (arrays and documents have the same bson_t representation). | ||
* For consistency, findOneAndUpdate should also prohibit empty pipelines. | ||
* Replace operations (e.g. replaceOne, findOneAndReplace) should reject empty | ||
* and non-empty pipelines alike, since neither is a replacement document. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for this insider comment. |
||
* | ||
* Note: this method may propagate an InvalidArgumentException from | ||
* document_or_array() if a Serializable object within the pipeline array | ||
|
@@ -223,11 +238,11 @@ function is_first_key_operator($document): bool | |
* @param array|object $pipeline | ||
* @throws InvalidArgumentException | ||
*/ | ||
function is_pipeline($pipeline): bool | ||
function is_pipeline($pipeline, bool $allowEmpty = false): bool | ||
{ | ||
if ($pipeline instanceof PackedArray) { | ||
/* Nested documents and arrays are intentionally left as BSON. We avoid | ||
* iterator_to_array() since Document iteration returns all values as | ||
* iterator_to_array() since PackedArray iteration returns all values as | ||
* MongoDB\BSON\Value instances. */ | ||
/** @psalm-var array */ | ||
$pipeline = $pipeline->toPHP([ | ||
|
@@ -244,7 +259,7 @@ function is_pipeline($pipeline): bool | |
} | ||
|
||
if ($pipeline === []) { | ||
return false; | ||
return $allowEmpty; | ||
} | ||
|
||
$expectedKey = 0; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get why you need to convert the empty array into an empty object. Replacement documents can be array, isn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this,
is_pipeline()
withallowEmpty: true
is going to consider the argument a pipeline and throw an exception (since pipelines are only acceptable for update operations).Since
is_pipeline()
historically returnedfalse
for empty arrays, an empty array was always accepted for ReplaceOne and would be encoded as a BSON object in theupdate
command (per libmongoc's internal logic discussed in CDRIVER-4658).Since this PR is adding logic to reject empty pipelines for both update and replace operations, we need to explicitly allow empty arrays for BC.
BulkWrite
replaceOne
operations historically has never usedis_pipeline()
, so pipelines could have snuck through there. Empty PHP arrays would still be encoded as a replacement document, though (same libmongoc logic).FindOneAndReplace also never used
is_pipeline()
during constructor validation. The only relevant code path wasFindAndModify::createCommandDocument()
conditionally casting theupdate
option to an object ifis_pipeline()
returned false (see: this change from PHPLIB-418). I actually think there was a bug there, as an empty array would have been encoded as a BSON array in thefindAndModify
command -- and thus treated as an empty pipeline (i.e. NOP) in MongoDB 4.2+.By prohibiting empty pipelines for both updates and replacements, we resolve the last issue with FindOneAndReplace and also enable users to express empty replacement documents as
[]
like they already could for ReplaceOne and BulkWritereplaceOne
(where they previously might have had to pass(object) []
explicitly).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correction. Since FindOneAndReplace previously never validated its
$replacement
parameter withis_pipeline()
, an empty BSON array would still have correctly encoded as a BSON document due to this logic:mongo-php-library/src/Operation/FindAndModify.php
Lines 295 to 299 in 3a681a3
is_pipeline()
returns false and the parameter gets cast to an object.So there was no bug.
Also, I can confirm that the cast in
FindAndModify::createCommandDocument()
is still necessary. If we execute FindAndModify directly (as is done in the test suite), then we rely on that code to ensure an empty array encodes properly.Technically, we could also change the constructor logic in FindOneAndReplace to cast all
$replacement
arrays to objectsis_pipeline($replacement)
is false, but it's that just adds additional overhead and wouldn't cover the case where FindAndModify is executed directly. The empty array check is cheap and plays nicely with the subsequent validation that rejects any pipeline (empty and non-empty alike).