DOCSP-41988: Aggregation #115
Changes from 4 commits
baa60ec
6f5c0a4
c6c03cb
4fcfaf5
6a79231
6132ff6
967767e
d29b6c5
58ec252
62167a1
Original file line number | Diff line number | Diff line change | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
@@ -0,0 +1,197 @@ | ||||||||||||||
.. _php-aggregation: | ||||||||||||||
|
||||||||||||||
==================================== | ||||||||||||||
Transform Your Data with Aggregation | ||||||||||||||
==================================== | ||||||||||||||
|
||||||||||||||
.. facet:: | ||||||||||||||
:name: genre | ||||||||||||||
:values: reference | ||||||||||||||
|
||||||||||||||
.. meta:: | ||||||||||||||
:keywords: code example, transform, computed, pipeline | ||||||||||||||
:description: Learn how to use the PHP library to perform aggregation operations. | ||||||||||||||
|
||||||||||||||
.. contents:: On this page | ||||||||||||||
:local: | ||||||||||||||
:backlinks: none | ||||||||||||||
:depth: 2 | ||||||||||||||
:class: singlecol | ||||||||||||||
|
||||||||||||||
.. TODO: | ||||||||||||||
.. toctree:: | ||||||||||||||
:titlesonly: | ||||||||||||||
:maxdepth: 1 | ||||||||||||||
|
||||||||||||||
/aggregation/aggregation-tutorials | ||||||||||||||
|
||||||||||||||
Overview | ||||||||||||||
-------- | ||||||||||||||
|
||||||||||||||
In this guide, you can learn how to use the {+php-library+} to perform | ||||||||||||||
**aggregation operations**. | ||||||||||||||
|
||||||||||||||
Aggregation operations process data in your MongoDB collections and | ||||||||||||||
return computed results. The MongoDB Aggregation framework, which is | ||||||||||||||
part of the Query API, is modeled on the concept of data processing | ||||||||||||||
pipelines. Documents enter a pipeline that contains one or more stages, | ||||||||||||||
and this pipeline transforms the documents into an aggregated result. | ||||||||||||||
|
||||||||||||||
An aggregation operation is similar to a car factory. A car factory has | ||||||||||||||
an assembly line, which contains assembly stations with specialized | ||||||||||||||
tools to do specific jobs, like drills and welders. Raw parts enter the | ||||||||||||||
factory, and then the assembly line transforms and assembles them into a | ||||||||||||||
finished product. | ||||||||||||||
|
||||||||||||||
The **aggregation pipeline** is the assembly line, **aggregation stages** are the | ||||||||||||||
assembly stations, and **operator expressions** are the | ||||||||||||||
specialized tools. | ||||||||||||||
|
||||||||||||||
Aggregation Versus Find Operations | ||||||||||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||||||||||||||
|
||||||||||||||
You can use find operations to perform the following actions: | ||||||||||||||
|
||||||||||||||
- Select which documents to return | ||||||||||||||
- Select which fields to return | ||||||||||||||
- Sort the results | ||||||||||||||
|
||||||||||||||
You can use aggregation operations to perform the following actions: | ||||||||||||||
|
||||||||||||||
- Run find operations | ||||||||||||||
- Rename fields | ||||||||||||||
- Calculate fields | ||||||||||||||
- Summarize data | ||||||||||||||
- Group values | ||||||||||||||
|
||||||||||||||
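As a minimal sketch of the difference, the pipeline below filters like a find operation and then renames and calculates fields, which a find operation alone cannot do. The field names `name`, `address.street`, and `borough` are assumptions based on the `sample_restaurants.restaurants` collection, and the stage choices are illustrative rather than part of the guide.

```php
<?php

// Sketch only: the field names `name`, `address.street`, and `borough` are
// assumed from the sample_restaurants.restaurants collection.
$pipeline = [
    // Filter documents, as a find operation would
    ['$match' => ['cuisine' => 'Bakery']],
    // Rename `name` to `restaurant` and calculate a new `location` field
    ['$project' => [
        '_id' => 0,
        'restaurant' => '$name',
        'location' => ['$concat' => ['$address.street', ', ', '$borough']],
    ]],
    // Sort the results and cap their number
    ['$sort' => ['restaurant' => 1]],
    ['$limit' => 5],
];

// Print the pipeline as JSON to inspect it before running it
echo json_encode($pipeline, JSON_PRETTY_PRINT), PHP_EOL;
```

To run it, pass `$pipeline` to `MongoDB\Collection::aggregate()` on a collection object.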
Limitations | ||||||||||||||
~~~~~~~~~~~ | ||||||||||||||
|
||||||||||||||
Keep the following limitations in mind when using aggregation operations: | ||||||||||||||
|
||||||||||||||
- Returned documents cannot exceed the | ||||||||||||||
:manual:`BSON document size limit </reference/limits/#mongodb-limit-BSON-Document-Size>` | ||||||||||||||
of 16 megabytes. | ||||||||||||||
- Pipeline stages have a memory limit of 100 megabytes by default. You can exceed this | ||||||||||||||
limit by creating an options array that sets the ``allowDiskUse`` option to ``true`` | ||||||||||||||
and passing the array to the ``MongoDB\Collection::aggregate()`` method. | ||||||||||||||
|
||||||||||||||
.. important:: $graphLookup Exception | ||||||||||||||
|
||||||||||||||
The :manual:`$graphLookup | ||||||||||||||
</reference/operator/aggregation/graphLookup/>` stage has a strict | ||||||||||||||
memory limit of 100 megabytes and ignores the ``allowDiskUse`` option. | ||||||||||||||
Reviewer comment: I think this admonition should be indented, since it applies to the second bullet above. |
||||||||||||||
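The second limitation can be sketched as follows: an options array that sets `allowDiskUse` to `true` is passed as the second argument to `MongoDB\Collection::aggregate()`. The pipeline is the bakery count used elsewhere in this guide. The guard around the database call is an assumption added so that the snippet also runs without a cluster: it only connects when `MONGODB_URI` is set and Composer's autoloader is present.

```php
<?php

// Pipeline from this guide: count bakeries per borough
$pipeline = [
    ['$match' => ['cuisine' => 'Bakery']],
    ['$group' => ['_id' => '$borough', 'count' => ['$sum' => 1]]],
];

// Options array that lets pipeline stages write temporary data to disk
$options = ['allowDiskUse' => true];

// Assumption: run the aggregation only when a connection string and the
// Composer autoloader are available; otherwise just build the arrays.
$uri = getenv('MONGODB_URI');
if ($uri !== false && is_file('vendor/autoload.php')) {
    require 'vendor/autoload.php';

    $collection = (new MongoDB\Client($uri))->sample_restaurants->restaurants;

    foreach ($collection->aggregate($pipeline, $options) as $doc) {
        echo json_encode($doc), PHP_EOL;
    }
}
```

As the admonition above notes, `$graphLookup` ignores this option.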
|
||||||||||||||
.. _php-aggregation-example: | ||||||||||||||
|
||||||||||||||
Aggregation Example | ||||||||||||||
------------------- | ||||||||||||||
|
||||||||||||||
.. note:: | ||||||||||||||
|
||||||||||||||
The examples in this guide use the ``restaurants`` collection in the ``sample_restaurants`` | ||||||||||||||
database from the :atlas:`Atlas sample datasets </sample-data>`. To learn how to create a | ||||||||||||||
free MongoDB Atlas cluster and load the sample datasets, see the :atlas:`Get Started with Atlas | ||||||||||||||
</getting-started>` guide. | ||||||||||||||
|
||||||||||||||
To perform an aggregation, pass an array containing the aggregation pipeline | ||||||||||||||
Reviewer comment: Could "aggregation pipeline stages" be simplified as "pipeline stages"? I think the reader already has the context and we could avoid word repetition. |
||||||||||||||
stages to the ``MongoDB\Collection::aggregate()`` method. | ||||||||||||||
|
||||||||||||||
The following code example produces a count of the number of bakeries in each borough | ||||||||||||||
of New York. To do so, it uses an aggregation pipeline that contains the following stages: | ||||||||||||||
|
||||||||||||||
- :manual:`$match </reference/operator/aggregation/match/>` stage to filter for documents | ||||||||||||||
in which the ``cuisine`` field contains the value ``'Bakery'`` | ||||||||||||||
|
||||||||||||||
- :manual:`$group </reference/operator/aggregation/group/>` stage to group the matching | ||||||||||||||
documents by the ``borough`` field, accumulating a count of documents for each distinct | ||||||||||||||
value | ||||||||||||||
|
||||||||||||||
.. io-code-block:: | ||||||||||||||
:copyable: | ||||||||||||||
|
||||||||||||||
.. input:: /includes/aggregation.php | ||||||||||||||
:start-after: start-match-group | ||||||||||||||
:end-before: end-match-group | ||||||||||||||
:language: php | ||||||||||||||
:dedent: | ||||||||||||||
|
||||||||||||||
.. output:: | ||||||||||||||
:visible: false | ||||||||||||||
|
||||||||||||||
{"_id":"Brooklyn","count":173} | ||||||||||||||
{"_id":"Queens","count":204} | ||||||||||||||
{"_id":"Bronx","count":71} | ||||||||||||||
{"_id":"Staten Island","count":20} | ||||||||||||||
{"_id":"Missing","count":2} | ||||||||||||||
{"_id":"Manhattan","count":221} | ||||||||||||||
|
||||||||||||||
Explain an Aggregation | ||||||||||||||
~~~~~~~~~~~~~~~~~~~~~~ | ||||||||||||||
|
||||||||||||||
To view information about how MongoDB executes your operation, you can | ||||||||||||||
instruct the MongoDB query planner to **explain** it. When MongoDB explains | ||||||||||||||
an operation, it returns **execution plans** and performance statistics. | ||||||||||||||
An execution plan is a potential way MongoDB can complete an operation. | ||||||||||||||
When you instruct MongoDB to explain an operation, it returns both the | ||||||||||||||
plan MongoDB executed and any rejected execution plans. | ||||||||||||||
|
||||||||||||||
To explain an aggregation operation, run the ``explain`` database command by passing | ||||||||||||||
the command information to the ``MongoDB\Database::command()`` method. The ``explain`` | ||||||||||||||
field of the command document must contain the ``aggregate``, ``pipeline``, and | ||||||||||||||
``cursor`` fields of the aggregation to explain. | ||||||||||||||
Reviewer comment: S: The word "set" here made me think you were going to explain what these fields needed to be set to.
Reviewer comment: PHPLIB actually has an I think it'd be preferable for the example to demonstrate that, and it'd also provide an opportunity to link to the API reference for that method (like you do for Users will need to construct a |
||||||||||||||
|
||||||||||||||
The following example instructs MongoDB to explain the aggregation operation from the | ||||||||||||||
preceding :ref:`php-aggregation-example`: | ||||||||||||||
|
||||||||||||||
.. io-code-block:: | ||||||||||||||
:copyable: | ||||||||||||||
|
||||||||||||||
.. input:: /includes/aggregation.php | ||||||||||||||
:start-after: start-explain | ||||||||||||||
:end-before: end-explain | ||||||||||||||
:language: php | ||||||||||||||
:dedent: | ||||||||||||||
|
||||||||||||||
.. output:: | ||||||||||||||
:visible: false | ||||||||||||||
|
||||||||||||||
{"explainVersion":"2","queryPlanner":{"namespace":"sample_restaurants.restaurants", | ||||||||||||||
"indexFilterSet":false,"parsedQuery":{"cuisine":{"$eq":"Bakery"}},"queryHash":"865F14C3", | ||||||||||||||
"planCacheKey":"D56D6F10","optimizedPipeline":true,"maxIndexedOrSolutionsReached":false, | ||||||||||||||
"maxIndexedAndSolutionsReached":false,"maxScansToExplodeReached":false,"winningPlan":{ | ||||||||||||||
... } | ||||||||||||||
|
||||||||||||||
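As a rough sketch of reading such a result, the hypothetical helper `summarizePlans()` below pulls the winning and rejected plans out of an explain document that has already been decoded into a PHP array. It assumes the `queryPlanner` field sits at the top level, as in the truncated output above; other server versions or pipelines may nest it elsewhere. The sample array is a hand-written stand-in, not real server output.

```php
<?php

/**
 * Hypothetical helper: summarizes the planner section of a decoded
 * explain result. Assumes `queryPlanner` is a top-level field.
 */
function summarizePlans(array $explain): array
{
    $planner = $explain['queryPlanner'] ?? [];

    return [
        'namespace' => $planner['namespace'] ?? null,
        'hasWinningPlan' => isset($planner['winningPlan']),
        'rejectedCount' => count($planner['rejectedPlans'] ?? []),
    ];
}

// Hand-written stand-in for a decoded explain result (not real output)
$explain = [
    'explainVersion' => '2',
    'queryPlanner' => [
        'namespace' => 'sample_restaurants.restaurants',
        'winningPlan' => ['stage' => 'COLLSCAN'],
        'rejectedPlans' => [],
    ],
];

$summary = summarizePlans($explain);
echo json_encode($summary), PHP_EOL;
```

One way to obtain such an array from the example above is `json_decode(json_encode($result[0]), true)`.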
|
||||||||||||||
Additional Information | ||||||||||||||
---------------------- | ||||||||||||||
|
||||||||||||||
MongoDB Server Manual | ||||||||||||||
~~~~~~~~~~~~~~~~~~~~~ | ||||||||||||||
|
||||||||||||||
To view a full list of expression operators, see :manual:`Aggregation | ||||||||||||||
Operators </reference/operator/aggregation/>`. | ||||||||||||||
|
||||||||||||||
To learn about assembling an aggregation pipeline and view examples, see | ||||||||||||||
:manual:`Aggregation Pipeline </core/aggregation-pipeline/>`. | ||||||||||||||
|
||||||||||||||
To learn more about creating pipeline stages, see :manual:`Aggregation | ||||||||||||||
Stages </reference/operator/aggregation-pipeline/>`. | ||||||||||||||
|
||||||||||||||
To learn more about explaining MongoDB operations, see | ||||||||||||||
:manual:`Explain Output </reference/explain-results/>` and | ||||||||||||||
:manual:`Query Plans </core/query-plans/>`. | ||||||||||||||
|
||||||||||||||
.. TODO: | ||||||||||||||
Aggregation Tutorials | ||||||||||||||
~~~~~~~~~~~~~~~~~~~~~ | ||||||||||||||
|
||||||||||||||
.. To view step-by-step explanations of common aggregation tasks, see | ||||||||||||||
.. :ref:`php-aggregation-tutorials-landing`. | ||||||||||||||
|
||||||||||||||
API Documentation | ||||||||||||||
~~~~~~~~~~~~~~~~~ | ||||||||||||||
|
||||||||||||||
For more information about executing aggregation operations by using the {+php-library+}, | ||||||||||||||
Reviewer comment: Would "with the" read better than "by using the"? There's already an active verb with "executing".
Reply: We have style guidelines about with / by using: https://www.mongodb.com/docs/meta/style-guide/terminology/alphabetical-terms/#std-term-using |
||||||||||||||
see `MongoDB\\Collection::aggregate() <{+api+}/method/MongoDBCollection-aggregate/>`__ in | ||||||||||||||
the API documentation. |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,43 @@ | ||||||
<?php | ||||||
require 'vendor/autoload.php'; | ||||||
|
||||||
$uri = getenv('MONGODB_URI') ?: throw new RuntimeException('Set the MONGODB_URI variable to your Atlas URI that connects to the sample dataset'); | ||||||
$client = new MongoDB\Client($uri); | ||||||
|
||||||
$db = $client->sample_restaurants; | ||||||
Reviewer comment: If you end up using |
||||||
$collection = $db->restaurants; | ||||||
|
||||||
// Retrieves documents with a cuisine value of "Bakery", groups them by "borough", and | ||||||
// counts each borough's matching documents | ||||||
// start-match-group | ||||||
$pipeline = [ | ||||||
['$match' => ['cuisine' => 'Bakery']], | ||||||
['$group' => ['_id' => '$borough', 'count' => ['$sum' => 1]]] | ||||||
Reviewer comment: It's typically good practice to use trailing commas after all array elements, even though it's not required for the final element. I believe we actually enforce this in the PHPLIB coding standard, as it leads to more concise diffs when changing arrays. This can apply to the |
||||||
]; | ||||||
|
||||||
$cursor = $collection->aggregate($pipeline); | ||||||
|
||||||
foreach ($cursor as $doc) { | ||||||
echo json_encode($doc) . PHP_EOL; | ||||||
Reviewer comment: This avoids string concatenation and would be slightly more efficient. Also applies to
Reply: I'll change to a comma here and add that to the cleanup tasks. |
||||||
} | ||||||
// end-match-group | ||||||
|
||||||
// Performs the same aggregation operation as above but asks MongoDB to explain it | ||||||
// start-explain | ||||||
$pipeline = [ | ||||||
['$match' => ['cuisine' => 'Bakery']], | ||||||
['$group' => ['_id' => '$borough', 'count' => ['$sum' => 1]]] | ||||||
]; | ||||||
|
||||||
$command = [ | ||||||
'explain' => [ | ||||||
'aggregate' => 'restaurants', | ||||||
'pipeline' => $pipeline, | ||||||
'cursor' => new stdClass() | ||||||
] | ||||||
]; | ||||||
Reviewer comment: Consider the following to utilize
$aggregate = new MongoDB\Operation\Aggregate(
    $collection->getDatabaseName(),
    $collection->getCollectionName(),
    $pipeline
);
$result = $collection->explain($aggregate);
Note that |
||||||
|
||||||
$result = $db->command($command)->toArray(); | ||||||
echo json_encode($result[0]) . PHP_EOL; | ||||||
Reviewer comment: In contrast to #113, I think you're fine to use |
||||||
// end-explain | ||||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -12,6 +12,7 @@ MongoDB PHP Library | |
|
||
Get Started </get-started> | ||
/read | ||
/aggregation | ||
/tutorial | ||
/upgrade | ||
/reference | ||
|
Reviewer comment: Nit: