Skip to content

DOCSP-41988: Aggregation #115

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
205 changes: 205 additions & 0 deletions source/aggregation.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,205 @@
.. _php-aggregation:

====================================
Transform Your Data with Aggregation
====================================

.. facet::
:name: genre
:values: reference

.. meta::
:keywords: code example, transform, computed, pipeline
:description: Learn how to use the PHP library to perform aggregation operations.

.. contents:: On this page
:local:
:backlinks: none
:depth: 2
:class: singlecol

.. TODO:
.. toctree::
:titlesonly:
:maxdepth: 1

/aggregation/aggregation-tutorials

Overview
--------

In this guide, you can learn how to use the {+php-library+} to perform
**aggregation operations**.

Aggregation operations process data in your MongoDB collections and
return computed results. The MongoDB Aggregation framework, which is
part of the Query API, is modeled on the concept of data processing
pipelines. Documents enter a pipeline that contains one or more stages,
and this pipeline transforms the documents into an aggregated result.

An aggregation operation is similar to a car factory. A car factory has
an assembly line, which contains assembly stations with specialized
tools to do specific jobs, like drills and welders. Raw parts enter the
factory, and then the assembly line transforms and assembles them into a
finished product.

The **aggregation pipeline** is the assembly line, **aggregation stages** are the
assembly stations, and **operator expressions** are the
specialized tools.

Aggregation Versus Find Operations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can use find operations to perform the following actions:

- Select which documents to return
- Select which fields to return
- Sort the results

You can use aggregation operations to perform the following actions:

- Run find operations
- Rename fields
- Calculate fields
- Summarize data
- Group values

Limitations
~~~~~~~~~~~

Consider the following limitations when performing aggregation operations:

- Returned documents cannot violate the
:manual:`BSON document size limit </reference/limits/#mongodb-limit-BSON-Document-Size>`
of 16 megabytes.
- Pipeline stages have a memory limit of 100 megabytes by default. You can exceed this
limit by creating an options array that sets the ``allowDiskUse`` option to ``true``
and passing the array to the ``MongoDB\Collection::aggregate()`` method.

.. important:: $graphLookup Exception

The :manual:`$graphLookup
</reference/operator/aggregation/graphLookup/>` stage has a strict
memory limit of 100 megabytes and ignores the ``allowDiskUse`` option.

.. _php-aggregation-example:

Aggregation Example
-------------------

.. note::

The examples in this guide use the ``restaurants`` collection in the ``sample_restaurants``
database from the :atlas:`Atlas sample datasets </sample-data>`. To learn how to create a
free MongoDB Atlas cluster and load the sample datasets, see the :atlas:`Get Started with Atlas
</getting-started>` guide.

To perform an aggregation, pass an array containing the pipeline stages to
the ``MongoDB\Collection::aggregate()`` method.

The following code example produces a count of the number of bakeries in each borough
of New York. To do so, it uses an aggregation pipeline that contains the following stages:

- :manual:`$match </reference/operator/aggregation/match/>` stage to filter for documents
in which the ``cuisine`` field contains the value ``'Bakery'``

- :manual:`$group </reference/operator/aggregation/group/>` stage to group the matching
documents by the ``borough`` field, accumulating a count of documents for each distinct
value

.. io-code-block::
:copyable:

.. input:: /includes/aggregation.php
:start-after: start-match-group
:end-before: end-match-group
:language: php
:dedent:

.. output::
:visible: false

{"_id":"Brooklyn","count":173}
{"_id":"Queens","count":204}
{"_id":"Bronx","count":71}
{"_id":"Staten Island","count":20}
{"_id":"Missing","count":2}
{"_id":"Manhattan","count":221}

Explain an Aggregation
~~~~~~~~~~~~~~~~~~~~~~

To view information about how MongoDB executes your operation, you can
instruct the MongoDB query planner to **explain** it. When MongoDB explains
an operation, it returns **execution plans** and performance statistics.
An execution plan is a potential way in which MongoDB can complete an operation.
When you instruct MongoDB to explain an operation, it returns both the
plan MongoDB executed and any rejected execution plans.

To explain an aggregation operation, construct a ``MongoDB\Operation\Aggregate`` object
and pass the database, collection, and pipeline stages as parameters. Then, pass the
``MongoDB\Operation\Aggregate`` object to the ``MongoDB\Collection::explain()`` method.

The following example instructs MongoDB to explain the aggregation operation
from the preceding :ref:`php-aggregation-example`:

.. io-code-block::
:copyable:

.. input:: /includes/aggregation.php
:start-after: start-explain
:end-before: end-explain
:language: php
:dedent:

.. output::
:visible: false

{"explainVersion":"2","queryPlanner":{"namespace":"sample_restaurants.restaurants",
"indexFilterSet":false,"parsedQuery":{"cuisine":{"$eq":"Bakery"}},"queryHash":"865F14C3",
"planCacheKey":"D56D6F10","optimizedPipeline":true,"maxIndexedOrSolutionsReached":false,
"maxIndexedAndSolutionsReached":false,"maxScansToExplodeReached":false,"winningPlan":{
... }

Additional Information
----------------------

To view a tutorial that uses the {+php-library+} to create complex aggregation
pipelines, see `Complex Aggregation Pipelines with Vanilla PHP and MongoDB

Check failure on line 168 in source/aggregation.txt

View workflow job for this annotation

GitHub Actions / TDBX Vale rules

[vale] reported by reviewdog 🐶 [MongoDB.GlobalAudienceMetaphorical] Use 'standard' instead of the metaphor 'Vanilla'. Raw Output: {"message": "[MongoDB.GlobalAudienceMetaphorical] Use 'standard' instead of the metaphor 'Vanilla'.", "location": {"path": "source/aggregation.txt", "range": {"start": {"line": 168, "column": 52}}}, "severity": "ERROR"}
<https://www.mongodb.com/developer/products/mongodb/aggregations-php-mongodb/>`__
in the MongoDB Developer Center.

MongoDB Server Manual
~~~~~~~~~~~~~~~~~~~~~

To learn more about the topics discussed in this guide, see the following
pages in the {+mdb-server+} manual:

- To view a full list of expression operators, see :manual:`Aggregation
Operators </reference/operator/aggregation/>`.

- To learn about assembling an aggregation pipeline and to view examples, see
:manual:`Aggregation Pipeline </core/aggregation-pipeline/>`.

- To learn more about creating pipeline stages, see :manual:`Aggregation
Stages </reference/operator/aggregation-pipeline/>`.

- To learn more about explaining MongoDB operations, see
:manual:`Explain Output </reference/explain-results/>` and
:manual:`Query Plans </core/query-plans/>`.

.. TODO:
Aggregation Tutorials
~~~~~~~~~~~~~~~~~~~~~

.. To view step-by-step explanations of common aggregation tasks, see
.. :ref:`php-aggregation-tutorials-landing`.

API Documentation
~~~~~~~~~~~~~~~~~

To learn more about the methods discussed in this guide, see the
following API documentation:

- `MongoDB\\Collection::aggregate() <{+api+}/method/MongoDBCollection-aggregate/>`__
- `MongoDB\\Collection::explain() <{+api+}/method/MongoDBCollection-explain/>`__
40 changes: 40 additions & 0 deletions source/includes/aggregation.php
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
<?php
require 'vendor/autoload.php';

$uri = getenv('MONGODB_URI') ?: throw new RuntimeException('Set the MONGODB_URI variable to your Atlas URI that connects to the sample dataset');
$client = new MongoDB\Client($uri);

$collection = $client->sample_restaurants->restaurants;

// Retrieves documents with a cuisine value of "Bakery", groups them by "borough", and
// counts each borough's matching documents
// start-match-group
$pipeline = [
['$match' => ['cuisine' => 'Bakery']],
['$group' => ['_id' => '$borough', 'count' => ['$sum' => 1]]],
];

$cursor = $collection->aggregate($pipeline);

foreach ($cursor as $doc) {
echo json_encode($doc) , PHP_EOL;
}
// end-match-group

// Performs the same aggregation operation as above but asks MongoDB to explain it
// start-explain
$pipeline = [
['$match' => ['cuisine' => 'Bakery']],
['$group' => ['_id' => '$borough', 'count' => ['$sum' => 1]]],
];

$aggregate = new MongoDB\Operation\Aggregate(
$collection->getDatabaseName(),
$collection->getCollectionName(),
$pipeline
);

$result = $collection->explain($aggregate);
echo json_encode($result) , PHP_EOL;
// end-explain

8 changes: 4 additions & 4 deletions source/includes/read/limit-skip-sort.php
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@
);

foreach ($cursor as $doc) {
echo json_encode($doc) . PHP_EOL;
echo json_encode($doc) , PHP_EOL;
}
// end-limit

Expand All @@ -29,7 +29,7 @@
);

foreach ($cursor as $doc) {
echo json_encode($doc) . PHP_EOL;
echo json_encode($doc) , PHP_EOL;
}
// end-sort

Expand All @@ -41,7 +41,7 @@
);

foreach ($cursor as $doc) {
echo json_encode($doc) . PHP_EOL;
echo json_encode($doc) , PHP_EOL;
}
// end-skip

Expand All @@ -56,7 +56,7 @@

$cursor = $collection->find(['cuisine' => 'Italian'], $options);
foreach ($cursor as $doc) {
echo json_encode($doc) . PHP_EOL;
echo json_encode($doc) , PHP_EOL;
}
// end-limit-sort-skip

Expand Down
6 changes: 3 additions & 3 deletions source/includes/read/project.php
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,7 @@

$cursor = $collection->find(['name' => 'Emerald Pub'], $options);
foreach ($cursor as $doc) {
echo json_encode($doc) . PHP_EOL;
echo json_encode($doc) , PHP_EOL;
}
// end-project-include

Expand All @@ -39,7 +39,7 @@

$cursor = $collection->find(['name' => 'Emerald Pub'], $options);
foreach ($cursor as $doc) {
echo json_encode($doc) . PHP_EOL;
echo json_encode($doc) , PHP_EOL;
}
// end-project-include-without-id

Expand All @@ -54,6 +54,6 @@

$cursor = $collection->find(['name' => 'Emerald Pub'], $options);
foreach ($cursor as $doc) {
echo json_encode($doc) . PHP_EOL;
echo json_encode($doc) , PHP_EOL;
}
// end-project-exclude
6 changes: 3 additions & 3 deletions source/includes/read/retrieve.php
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@
// Finds one document with a "name" value of "LinkedIn"
// start-find-one
$document = $collection->findOne(['name' => 'LinkedIn']);
echo json_encode($document) . "\n";
echo json_encode($document) , "\n";
// end-find-one

// Finds documents with a "founded_year" value of 1970
Expand All @@ -22,7 +22,7 @@
// Prints documents with a "founded_year" value of 1970
// start-cursor
foreach ($results as $doc) {
echo json_encode($doc) . "\n";
echo json_encode($doc) , "\n";
}
// end-cursor

Expand All @@ -34,6 +34,6 @@
);

foreach ($results as $doc) {
echo json_encode($doc) . "\n";
echo json_encode($doc) , "\n";
}
// end-modify
14 changes: 7 additions & 7 deletions source/includes/read/specify-queries.php
Original file line number Diff line number Diff line change
Expand Up @@ -50,23 +50,23 @@
// start-find-exact
$cursor = $collection->find(['color' => 'yellow']);
foreach ($cursor as $doc) {
echo json_encode($doc) . PHP_EOL;
echo json_encode($doc) , PHP_EOL;
}
// end-find-exact

// Retrieves all documents in the collection
// start-find-all
$cursor = $collection->find([]);
foreach ($cursor as $doc) {
echo json_encode($doc) . PHP_EOL;
echo json_encode($doc) , PHP_EOL;
}
// end-find-all

// Retrieves and prints documents in which the "rating" value is greater than 2
// start-find-comparison
$cursor = $collection->find(['rating' => ['$gt' => 2]]);
foreach ($cursor as $doc) {
echo json_encode($doc) . PHP_EOL;
echo json_encode($doc) , PHP_EOL;
}
// end-find-comparison

Expand All @@ -79,30 +79,30 @@
]
]);
foreach ($cursor as $doc) {
echo json_encode($doc) . PHP_EOL;
echo json_encode($doc) , PHP_EOL;
}
// end-find-logical

// Retrieves and prints documents in which the "type" array has 2 elements
// start-find-array
$cursor = $collection->find(['type' => ['$size' => 2]]);
foreach ($cursor as $doc) {
echo json_encode($doc) . PHP_EOL;
echo json_encode($doc) , PHP_EOL;
}
// end-find-array

// Retrieves and prints documents that have a "color" field
// start-find-element
$cursor = $collection->find(['color' => ['$exists' => true]]);
foreach ($cursor as $doc) {
echo json_encode($doc) . PHP_EOL;
echo json_encode($doc) , PHP_EOL;
}
// end-find-element

// Retrieves and prints documents in which the "name" value has at least two consecutive "p" characters
// start-find-evaluation
$cursor = $collection->find(['name' => ['$regex' => 'p{2,}']]);
foreach ($cursor as $doc) {
echo json_encode($doc) . PHP_EOL;
echo json_encode($doc) , PHP_EOL;
}
// end-find-evaluation
Loading
Loading