Skip to content

PHPLIB-1129: Improve pipeline checks for replace operations #1098

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jun 9, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 10 additions & 4 deletions psalm-baseline.xml
Original file line number Diff line number Diff line change
Expand Up @@ -36,8 +36,9 @@
</UnsafeInstantiation>
</file>
<file src="src/Exception/InvalidArgumentException.php">
<UnsafeInstantiation occurrences="1">
<UnsafeInstantiation occurrences="2">
<code>new static(sprintf('Expected %s to have type "%s" but found "%s"', $name, $expectedType, get_debug_type($value)))</code>
<code>new static('Expected update operator(s) or non-empty pipeline for ' . $name)</code>
</UnsafeInstantiation>
</file>
<file src="src/Exception/ResumeTokenException.php">
Expand Down Expand Up @@ -308,7 +309,7 @@
<DocblockTypeContradiction occurrences="1">
<code>is_array($operation)</code>
</DocblockTypeContradiction>
<MixedArgument occurrences="12">
<MixedArgument occurrences="13">
<code>$args</code>
<code>$args</code>
<code>$args</code>
Expand All @@ -320,9 +321,10 @@
<code>$args[1]</code>
<code>$args[1]</code>
<code>$args[1]</code>
<code>$args[1]</code>
<code>$args[2]</code>
</MixedArgument>
<MixedArrayAccess occurrences="25">
<MixedArrayAccess occurrences="28">
<code>$args[0]</code>
<code>$args[0]</code>
<code>$args[0]</code>
Expand All @@ -337,6 +339,9 @@
<code>$args[1]</code>
<code>$args[1]</code>
<code>$args[1]</code>
<code>$args[1]</code>
<code>$args[1]</code>
<code>$args[1]</code>
<code>$args[2]</code>
<code>$args[2]</code>
<code>$args[2]</code>
Expand All @@ -349,7 +354,8 @@
<code>$args[2]['upsert']</code>
<code>$args[2]['upsert']</code>
</MixedArrayAccess>
<MixedArrayAssignment occurrences="11">
<MixedArrayAssignment occurrences="12">
<code>$args[1]</code>
<code>$args[1]</code>
<code>$args[1]</code>
<code>$args[2]</code>
Expand Down
11 changes: 10 additions & 1 deletion src/Operation/BulkWrite.php
Original file line number Diff line number Diff line change
Expand Up @@ -191,10 +191,19 @@ public function __construct(string $databaseName, string $collectionName, array
throw InvalidArgumentException::invalidType(sprintf('$operations[%d]["%s"][1]', $i, $type), $args[1], 'array or object');
}

// Treat empty arrays as replacement documents for BC
if ($args[1] === []) {
$args[1] = (object) $args[1];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get why you need to convert the empty array into an empty object. Replacement documents can be array, isn't it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this, is_pipeline() with allowEmpty: true is going to consider the argument a pipeline and throw an exception (since pipelines are only acceptable for update operations).

Since is_pipeline() historically returned false for empty arrays, an empty array was always accepted for ReplaceOne and would be encoded as a BSON object in the update command (per libmongoc's internal logic discussed in CDRIVER-4658).

Since this PR is adding logic to reject empty pipelines for both update and replace operations, we need to explicitly allow empty arrays for BC.

BulkWrite replaceOne operations historically has never used is_pipeline(), so pipelines could have snuck through there. Empty PHP arrays would still be encoded as a replacement document, though (same libmongoc logic).

FindOneAndReplace also never used is_pipeline() during constructor validation. The only relevant code path was FindAndModify::createCommandDocument() conditionally casting the update option to an object if is_pipeline() returned false (see: this change from PHPLIB-418). I actually think there was a bug there, as an empty array would have been encoded as a BSON array in the findAndModify command -- and thus treated as an empty pipeline (i.e. NOP) in MongoDB 4.2+.

By prohibiting empty pipelines for both updates and replacements, we resolve the last issue with FindOneAndReplace and also enable users to express empty replacement documents as [] like they already could for ReplaceOne and BulkWrite replaceOne (where they previously might have had to pass (object) [] explicitly).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually think there was a bug there, as an empty array would have been encoded as a BSON array in the findAndModify command -- and thus treated as an empty pipeline (i.e. NOP) in MongoDB 4.2+.

Correction. Since FindOneAndReplace previously never validated its $replacement parameter with is_pipeline(), an empty BSON array would still have correctly encoded as a BSON document due to this logic:

if (isset($this->options['update'])) {
$cmd['update'] = is_pipeline($this->options['update'])
? $this->options['update']
: (object) $this->options['update'];
}

is_pipeline() returns false and the parameter gets cast to an object.

So there was no bug.

Also, I can confirm that the cast in FindAndModify::createCommandDocument() is still necessary. If we execute FindAndModify directly (as is done in the test suite), then we rely on that code to ensure an empty array encodes properly.

Technically, we could also change the constructor logic in FindOneAndReplace to cast all $replacement arrays to objects is_pipeline($replacement) is false, but it's that just adds additional overhead and wouldn't cover the case where FindAndModify is executed directly. The empty array check is cheap and plays nicely with the subsequent validation that rejects any pipeline (empty and non-empty alike).

}

if (is_first_key_operator($args[1])) {
throw new InvalidArgumentException(sprintf('First key in $operations[%d]["%s"][1] is an update operator', $i, $type));
}

if (is_pipeline($args[1], true /* allowEmpty */)) {
Copy link
Member

@GromNaN GromNaN Jun 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't need such comment with phpstorm. It may be replaced by named argument when the lib is upgraded to PHP 8+.

image

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. How does PhpStorm decide to show the second parameter name but not the first ($args[1])? It looks like it shows the first parameter for sprintf() on the second line.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hint is not shown when a variable is passed, it's shown for static values only.

throw new InvalidArgumentException(sprintf('$operations[%d]["%s"][1] is an update pipeline', $i, $type));
}

if (! isset($args[2])) {
$args[2] = [];
}
Expand Down Expand Up @@ -229,7 +238,7 @@ public function __construct(string $databaseName, string $collectionName, array
}

if (! is_first_key_operator($args[1]) && ! is_pipeline($args[1])) {
throw new InvalidArgumentException(sprintf('First key in $operations[%d]["%s"][1] is neither an update operator nor a pipeline', $i, $type));
throw new InvalidArgumentException(sprintf('Expected update operator(s) or non-empty pipeline for $operations[%d]["%s"][1]', $i, $type));
}

if (! isset($args[2])) {
Expand Down
8 changes: 8 additions & 0 deletions src/Operation/FindAndModify.php
Original file line number Diff line number Diff line change
Expand Up @@ -295,6 +295,14 @@ private function createCommandDocument(): array
if (isset($this->options['update'])) {
/** @psalm-var array|object */
$update = $this->options['update'];
/* A non-empty pipeline will encode as a BSON array, so leave it
* as-is. Cast anything else to an object since a BSON document is
* likely expected. This includes empty arrays, which historically
* can be used to represent empty replacement documents.
*
* This also allows an empty pipeline expressed as a PackedArray or
* Serializable to still encode as a BSON array, since the object
* cast will have no effect. */
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to add a comment explaining this conditional cast after diving into it in #1098 (comment).

$cmd['update'] = is_pipeline($update) ? $update : (object) $update;
}

Expand Down
12 changes: 11 additions & 1 deletion src/Operation/FindOneAndReplace.php
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@
use function is_integer;
use function is_object;
use function MongoDB\is_first_key_operator;
use function MongoDB\is_pipeline;

/**
* Operation for replacing a document with the findAndModify command.
Expand Down Expand Up @@ -109,8 +110,17 @@ public function __construct(string $databaseName, string $collectionName, $filte
throw InvalidArgumentException::invalidType('$replacement', $replacement, 'array or object');
}

// Treat empty arrays as replacement documents for BC
if ($replacement === []) {
$replacement = (object) $replacement;
}

if (is_first_key_operator($replacement)) {
throw new InvalidArgumentException('First key in $replacement argument is an update operator');
throw new InvalidArgumentException('First key in $replacement is an update operator');
}

if (is_pipeline($replacement, true /* allowEmpty */)) {
throw new InvalidArgumentException('$replacement is an update pipeline');
}

if (isset($options['projection']) && ! is_array($options['projection']) && ! is_object($options['projection'])) {
Expand Down
2 changes: 1 addition & 1 deletion src/Operation/FindOneAndUpdate.php
Original file line number Diff line number Diff line change
Expand Up @@ -114,7 +114,7 @@ public function __construct(string $databaseName, string $collectionName, $filte
}

if (! is_first_key_operator($update) && ! is_pipeline($update)) {
throw new InvalidArgumentException('Expected an update document with operator as first key or a pipeline');
throw new InvalidArgumentException('Expected update operator(s) or non-empty pipeline for $update');
}

if (isset($options['projection']) && ! is_array($options['projection']) && ! is_object($options['projection'])) {
Expand Down
11 changes: 8 additions & 3 deletions src/Operation/ReplaceOne.php
Original file line number Diff line number Diff line change
Expand Up @@ -85,12 +85,17 @@ public function __construct(string $databaseName, string $collectionName, $filte
throw InvalidArgumentException::invalidType('$replacement', $replacement, 'array or object');
}

// Treat empty arrays as replacement documents for BC
if ($replacement === []) {
$replacement = (object) $replacement;
}

if (is_first_key_operator($replacement)) {
throw new InvalidArgumentException('First key in $replacement argument is an update operator');
throw new InvalidArgumentException('First key in $replacement is an update operator');
}

if (is_pipeline($replacement)) {
throw new InvalidArgumentException('$replacement argument is a pipeline');
if (is_pipeline($replacement, true /* allowEmpty */)) {
throw new InvalidArgumentException('$replacement is an update pipeline');
}

$this->update = new Update(
Expand Down
2 changes: 1 addition & 1 deletion src/Operation/Update.php
Original file line number Diff line number Diff line change
Expand Up @@ -148,7 +148,7 @@ public function __construct(string $databaseName, string $collectionName, $filte
}

if ($options['multi'] && ! is_first_key_operator($update) && ! is_pipeline($update)) {
throw new InvalidArgumentException('"multi" option cannot be true if $update is a replacement document');
throw new InvalidArgumentException('"multi" option cannot be true unless $update has update operator(s) or non-empty pipeline');
}

if (isset($options['session']) && ! $options['session'] instanceof Session) {
Expand Down
2 changes: 1 addition & 1 deletion src/Operation/UpdateMany.php
Original file line number Diff line number Diff line change
Expand Up @@ -89,7 +89,7 @@ public function __construct(string $databaseName, string $collectionName, $filte
}

if (! is_first_key_operator($update) && ! is_pipeline($update)) {
throw new InvalidArgumentException('Expected an update document with operator as first key or a pipeline');
throw new InvalidArgumentException('Expected update operator(s) or non-empty pipeline for $update');
}

$this->update = new Update(
Expand Down
2 changes: 1 addition & 1 deletion src/Operation/UpdateOne.php
Original file line number Diff line number Diff line change
Expand Up @@ -89,7 +89,7 @@ public function __construct(string $databaseName, string $collectionName, $filte
}

if (! is_first_key_operator($update) && ! is_pipeline($update)) {
throw new InvalidArgumentException('Expected an update document with operator as first key or a pipeline');
throw new InvalidArgumentException('Expected update operator(s) or non-empty pipeline for $update');
}

$this->update = new Update(
Expand Down
23 changes: 19 additions & 4 deletions src/functions.php
Original file line number Diff line number Diff line change
Expand Up @@ -213,7 +213,22 @@ function is_first_key_operator($document): bool
}

/**
* Returns whether an update specification is a valid aggregation pipeline.
* Returns whether the argument is a valid aggregation or update pipeline.
*
* This is primarily used for validating arguments for update and replace
* operations, but can also be used for validating an aggregation pipeline.
*
* The $allowEmpty parameter can be used to control whether an empty array
* should be considered a valid pipeline. Empty arrays are generally valid for
* an aggregation pipeline, but the things are more complicated for update
* pipelines.
*
* Update operations must prohibit empty pipelines, since libmongoc may encode
* an empty pipeline array as an empty replacement document when writing an
* update command (arrays and documents have the same bson_t representation).
* For consistency, findOneAndUpdate should also prohibit empty pipelines.
* Replace operations (e.g. replaceOne, findOneAndReplace) should reject empty
* and non-empty pipelines alike, since neither is a replacement document.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this insider comment.

*
* Note: this method may propagate an InvalidArgumentException from
* document_or_array() if a Serializable object within the pipeline array
Expand All @@ -223,11 +238,11 @@ function is_first_key_operator($document): bool
* @param array|object $pipeline
* @throws InvalidArgumentException
*/
function is_pipeline($pipeline): bool
function is_pipeline($pipeline, bool $allowEmpty = false): bool
{
if ($pipeline instanceof PackedArray) {
/* Nested documents and arrays are intentionally left as BSON. We avoid
* iterator_to_array() since Document iteration returns all values as
* iterator_to_array() since PackedArray iteration returns all values as
* MongoDB\BSON\Value instances. */
/** @psalm-var array */
$pipeline = $pipeline->toPHP([
Expand All @@ -244,7 +259,7 @@ function is_pipeline($pipeline): bool
}

if ($pipeline === []) {
return false;
return $allowEmpty;
}

$expectedKey = 0;
Expand Down
49 changes: 0 additions & 49 deletions tests/Operation/BulkWriteFunctionalTest.php
Original file line number Diff line number Diff line change
Expand Up @@ -4,13 +4,11 @@

use MongoDB\BSON\Document;
use MongoDB\BSON\ObjectId;
use MongoDB\BSON\PackedArray;
use MongoDB\BulkWriteResult;
use MongoDB\Collection;
use MongoDB\Driver\BulkWrite as Bulk;
use MongoDB\Driver\WriteConcern;
use MongoDB\Exception\BadMethodCallException;
use MongoDB\Model\BSONArray;
use MongoDB\Model\BSONDocument;
use MongoDB\Operation\BulkWrite;
use MongoDB\Tests\CommandObserver;
Expand Down Expand Up @@ -177,18 +175,6 @@ function (array $event) use ($expectedFilter): void {
);
}

public function provideFilterDocuments(): array
{
$expectedQuery = (object) ['x' => 1];

return [
'array' => [['x' => 1], $expectedQuery],
'object' => [(object) ['x' => 1], $expectedQuery],
'Serializable' => [new BSONDocument(['x' => 1]), $expectedQuery],
'Document' => [Document::fromPHP(['x' => 1]), $expectedQuery],
];
}

/** @dataProvider provideReplacementDocuments */
public function testReplacementDocuments($replacement, stdClass $expectedReplacement): void
{
Expand All @@ -208,18 +194,6 @@ function (array $event) use ($expectedReplacement): void {
);
}

public function provideReplacementDocuments(): array
{
$expected = (object) ['x' => 1];

return [
'replacement:array' => [['x' => 1], $expected],
'replacement:object' => [(object) ['x' => 1], $expected],
'replacement:Serializable' => [new BSONDocument(['x' => 1]), $expected],
'replacement:Document' => [Document::fromPHP(['x' => 1]), $expected],
];
}

/**
* @dataProvider provideUpdateDocuments
* @dataProvider provideUpdatePipelines
Expand Down Expand Up @@ -250,29 +224,6 @@ function (array $event) use ($expectedUpdate): void {
);
}

public function provideUpdateDocuments(): array
{
$expected = (object) ['$set' => (object) ['x' => 1]];

return [
'update:array' => [['$set' => ['x' => 1]], $expected],
'update:object' => [(object) ['$set' => ['x' => 1]], $expected],
'update:Serializable' => [new BSONDocument(['$set' => ['x' => 1]]), $expected],
'update:Document' => [Document::fromPHP(['$set' => ['x' => 1]]), $expected],
];
}

public function provideUpdatePipelines(): array
{
$expected = [(object) ['$set' => (object) ['x' => 1]]];

return [
'pipeline:array' => [[['$set' => ['x' => 1]]], $expected],
'pipeline:Serializable' => [new BSONArray([['$set' => ['x' => 1]]]), $expected],
'pipeline:PackedArray' => [PackedArray::fromPHP([['$set' => ['x' => 1]]]), $expected],
];
}

public function testDeletes(): void
{
$this->createFixtures(4);
Expand Down
Loading