diff --git a/source/aggregation.txt b/source/aggregation.txt
new file mode 100644
index 00000000..1a677c95
--- /dev/null
+++ b/source/aggregation.txt
@@ -0,0 +1,205 @@
+.. _php-aggregation:
+
+====================================
+Transform Your Data with Aggregation
+====================================
+
+.. facet::
+   :name: genre
+   :values: reference
+
+.. meta::
+   :keywords: code example, transform, computed, pipeline
+   :description: Learn how to use the PHP library to perform aggregation operations.
+
+.. contents:: On this page
+   :local:
+   :backlinks: none
+   :depth: 2
+   :class: singlecol
+
+.. TODO:
+   .. toctree::
+      :titlesonly:
+      :maxdepth: 1
+
+      /aggregation/aggregation-tutorials
+
+Overview
+--------
+
+In this guide, you can learn how to use the {+php-library+} to perform
+**aggregation operations**.
+
+Aggregation operations process data in your MongoDB collections and
+return computed results. The MongoDB Aggregation framework, which is
+part of the Query API, is modeled on the concept of data processing
+pipelines. Documents enter a pipeline that contains one or more stages,
+and this pipeline transforms the documents into an aggregated result.
+
+An aggregation operation is similar to a car factory. A car factory has
+an assembly line, which contains assembly stations with specialized
+tools to do specific jobs, like drills and welders. Raw parts enter the
+factory, and then the assembly line transforms and assembles them into a
+finished product.
+
+The **aggregation pipeline** is the assembly line, **aggregation stages** are the
+assembly stations, and **operator expressions** are the
+specialized tools.
+
+Aggregation Versus Find Operations
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+You can use find operations to perform the following actions:
+
+- Select which documents to return
+- Select which fields to return
+- Sort the results
+
+You can use aggregation operations to perform the following actions:
+
+- Run find operations
+- Rename fields
+- Calculate fields
+- Summarize data
+- Group values
+
+Limitations
+~~~~~~~~~~~
+
+Consider the following limitations when performing aggregation operations:
+
+- Returned documents cannot violate the
+  :manual:`BSON document size limit `
+  of 16 megabytes.
+- Pipeline stages have a memory limit of 100 megabytes by default. You can exceed this
+  limit by creating an options array that sets the ``allowDiskUse`` option to ``true``
+  and passing the array to the ``MongoDB\Collection::aggregate()`` method.
+
+  .. important:: $graphLookup Exception
+
+     The :manual:`$graphLookup
+     ` stage has a strict
+     memory limit of 100 megabytes and ignores the ``allowDiskUse`` option.
+
+.. _php-aggregation-example:
+
+Aggregation Example
+-------------------
+
+.. note::
+
+   The examples in this guide use the ``restaurants`` collection in the ``sample_restaurants``
+   database from the :atlas:`Atlas sample datasets `. To learn how to create a
+   free MongoDB Atlas cluster and load the sample datasets, see the :atlas:`Get Started with Atlas
+   ` guide.
+
+To perform an aggregation, pass an array containing the pipeline stages to
+the ``MongoDB\Collection::aggregate()`` method.
+
+The following code example produces a count of the number of bakeries in each borough
+of New York. To do so, it uses an aggregation pipeline that contains the following stages:
+
+- :manual:`$match ` stage to filter for documents
+  in which the ``cuisine`` field contains the value ``'Bakery'``
+
+- :manual:`$group ` stage to group the matching
+  documents by the ``borough`` field, accumulating a count of documents for each distinct
+  value
+
+.. io-code-block::
+   :copyable:
+
+   .. input:: /includes/aggregation.php
+      :start-after: start-match-group
+      :end-before: end-match-group
+      :language: php
+      :dedent:
+
+   .. output::
+      :visible: false
+
+      {"_id":"Brooklyn","count":173}
+      {"_id":"Queens","count":204}
+      {"_id":"Bronx","count":71}
+      {"_id":"Staten Island","count":20}
+      {"_id":"Missing","count":2}
+      {"_id":"Manhattan","count":221}
+
+Explain an Aggregation
+~~~~~~~~~~~~~~~~~~~~~~
+
+To view information about how MongoDB executes your operation, you can
+instruct the MongoDB query planner to **explain** it. When MongoDB explains
+an operation, it returns **execution plans** and performance statistics.
+An execution plan is a potential way in which MongoDB can complete an operation.
+When you instruct MongoDB to explain an operation, it returns both the
+plan MongoDB executed and any rejected execution plans.
+
+To explain an aggregation operation, construct a ``MongoDB\Operation\Aggregate`` object
+and pass the database, collection, and pipeline stages as parameters. Then, pass the
+``MongoDB\Operation\Aggregate`` object to the ``MongoDB\Collection::explain()`` method.
+
+The following example instructs MongoDB to explain the aggregation operation
+from the preceding :ref:`php-aggregation-example`:
+
+.. io-code-block::
+   :copyable:
+
+   .. input:: /includes/aggregation.php
+      :start-after: start-explain
+      :end-before: end-explain
+      :language: php
+      :dedent:
+
+   .. output::
+      :visible: false
+
+      {"explainVersion":"2","queryPlanner":{"namespace":"sample_restaurants.restaurants",
+      "indexFilterSet":false,"parsedQuery":{"cuisine":{"$eq":"Bakery"}},"queryHash":"865F14C3",
+      "planCacheKey":"D56D6F10","optimizedPipeline":true,"maxIndexedOrSolutionsReached":false,
+      "maxIndexedAndSolutionsReached":false,"maxScansToExplodeReached":false,"winningPlan":{
+      ... }
+
+Additional Information
+----------------------
+
+To view a tutorial that uses the {+php-library+} to create complex aggregation
+pipelines, see `Complex Aggregation Pipelines with Vanilla PHP and MongoDB
+`__
+in the MongoDB Developer Center.
+
+MongoDB Server Manual
+~~~~~~~~~~~~~~~~~~~~~
+
+To learn more about the topics discussed in this guide, see the following
+pages in the {+mdb-server+} manual:
+
+- To view a full list of expression operators, see :manual:`Aggregation
+  Operators `.
+
+- To learn about assembling an aggregation pipeline and to view examples, see
+  :manual:`Aggregation Pipeline `.
+
+- To learn more about creating pipeline stages, see :manual:`Aggregation
+  Stages `.
+
+- To learn more about explaining MongoDB operations, see
+  :manual:`Explain Output ` and
+  :manual:`Query Plans `.
+
+.. TODO:
+   Aggregation Tutorials
+   ~~~~~~~~~~~~~~~~~~~~~
+
+.. To view step-by-step explanations of common aggregation tasks, see
+.. :ref:`php-aggregation-tutorials-landing`.
+
+API Documentation
+~~~~~~~~~~~~~~~~~
+
+To learn more about the methods discussed in this guide, see the
+following API documentation:
+
+- `MongoDB\\Collection::aggregate() <{+api+}/method/MongoDBCollection-aggregate/>`__
+- `MongoDB\\Collection::explain() <{+api+}/method/MongoDBCollection-explain/>`__
diff --git a/source/includes/aggregation.php b/source/includes/aggregation.php
new file mode 100644
index 00000000..e2cfa914
--- /dev/null
+++ b/source/includes/aggregation.php
@@ -0,0 +1,40 @@
+sample_restaurants->restaurants;
+
+// Retrieves documents with a cuisine value of "Bakery", groups them by "borough", and
+// counts each borough's matching documents
+// start-match-group
+$pipeline = [
+    ['$match' => ['cuisine' => 'Bakery']],
+    ['$group' => ['_id' => '$borough', 'count' => ['$sum' => 1]]],
+];
+
+$cursor = $collection->aggregate($pipeline);
+
+foreach ($cursor as $doc) {
+    echo json_encode($doc) , PHP_EOL;
+}
+// end-match-group
+
+// Performs the same aggregation operation as above but asks MongoDB to explain it
+// start-explain
+$pipeline = [
+    ['$match' => ['cuisine' => 'Bakery']],
+    ['$group' => ['_id' => '$borough', 'count' => ['$sum' => 1]]],
+];
+
+$aggregate = new MongoDB\Operation\Aggregate(
+    $collection->getDatabaseName(),
+    $collection->getCollectionName(),
+    $pipeline
+);
+
+$result = $collection->explain($aggregate);
+echo json_encode($result) , PHP_EOL;
+// end-explain
+
diff --git a/source/includes/read/limit-skip-sort.php b/source/includes/read/limit-skip-sort.php
index 3ab8a16f..8beec933 100644
--- a/source/includes/read/limit-skip-sort.php
+++ b/source/includes/read/limit-skip-sort.php
@@ -17,7 +17,7 @@
 );
 
 foreach ($cursor as $doc) {
-    echo json_encode($doc) . PHP_EOL;
+    echo json_encode($doc) , PHP_EOL;
 }
 // end-limit
 
@@ -29,7 +29,7 @@
 );
 
 foreach ($cursor as $doc) {
-    echo json_encode($doc) . PHP_EOL;
+    echo json_encode($doc) , PHP_EOL;
 }
 // end-sort
 
@@ -41,7 +41,7 @@
 );
 
 foreach ($cursor as $doc) {
-    echo json_encode($doc) . PHP_EOL;
+    echo json_encode($doc) , PHP_EOL;
 }
 // end-skip
 
@@ -56,7 +56,7 @@
 $cursor = $collection->find(['cuisine' => 'Italian'], $options);
 
 foreach ($cursor as $doc) {
-    echo json_encode($doc) . PHP_EOL;
+    echo json_encode($doc) , PHP_EOL;
 }
 // end-limit-sort-skip
 
diff --git a/source/includes/read/project.php b/source/includes/read/project.php
index 1e7c42f3..3f63d93b 100644
--- a/source/includes/read/project.php
+++ b/source/includes/read/project.php
@@ -21,7 +21,7 @@
 $cursor = $collection->find(['name' => 'Emerald Pub'], $options);
 
 foreach ($cursor as $doc) {
-    echo json_encode($doc) . PHP_EOL;
+    echo json_encode($doc) , PHP_EOL;
 }
 // end-project-include
 
@@ -39,7 +39,7 @@
 $cursor = $collection->find(['name' => 'Emerald Pub'], $options);
 
 foreach ($cursor as $doc) {
-    echo json_encode($doc) . PHP_EOL;
+    echo json_encode($doc) , PHP_EOL;
 }
 // end-project-include-without-id
 
@@ -54,6 +54,6 @@
 $cursor = $collection->find(['name' => 'Emerald Pub'], $options);
 
 foreach ($cursor as $doc) {
-    echo json_encode($doc) . PHP_EOL;
+    echo json_encode($doc) , PHP_EOL;
 }
 // end-project-exclude
diff --git a/source/includes/read/retrieve.php b/source/includes/read/retrieve.php
index fbd64a0d..ed086984 100644
--- a/source/includes/read/retrieve.php
+++ b/source/includes/read/retrieve.php
@@ -11,7 +11,7 @@
 // Finds one document with a "name" value of "LinkedIn"
 // start-find-one
 $document = $collection->findOne(['name' => 'LinkedIn']);
-echo json_encode($document) . "\n";
+echo json_encode($document) , "\n";
 // end-find-one
 
 // Finds documents with a "founded_year" value of 1970
@@ -22,7 +22,7 @@
 // Prints documents with a "founded_year" value of 1970
 // start-cursor
 foreach ($results as $doc) {
-    echo json_encode($doc) . "\n";
+    echo json_encode($doc) , "\n";
 }
 // end-cursor
 
@@ -34,6 +34,6 @@
 );
 
 foreach ($results as $doc) {
-    echo json_encode($doc) . "\n";
+    echo json_encode($doc) , "\n";
 }
 // end-modify
diff --git a/source/includes/read/specify-queries.php b/source/includes/read/specify-queries.php
index 3e3c07cb..df7e1c11 100644
--- a/source/includes/read/specify-queries.php
+++ b/source/includes/read/specify-queries.php
@@ -50,7 +50,7 @@
 // start-find-exact
 $cursor = $collection->find(['color' => 'yellow']);
 foreach ($cursor as $doc) {
-    echo json_encode($doc) . PHP_EOL;
+    echo json_encode($doc) , PHP_EOL;
 }
 // end-find-exact
 
@@ -58,7 +58,7 @@
 // start-find-all
 $cursor = $collection->find([]);
 foreach ($cursor as $doc) {
-    echo json_encode($doc) . PHP_EOL;
+    echo json_encode($doc) , PHP_EOL;
 }
 // end-find-all
 
@@ -66,7 +66,7 @@
 // start-find-comparison
 $cursor = $collection->find(['rating' => ['$gt' => 2]]);
 foreach ($cursor as $doc) {
-    echo json_encode($doc) . PHP_EOL;
+    echo json_encode($doc) , PHP_EOL;
 }
 // end-find-comparison
 
@@ -79,7 +79,7 @@
     ]
 ]);
 foreach ($cursor as $doc) {
-    echo json_encode($doc) . PHP_EOL;
+    echo json_encode($doc) , PHP_EOL;
 }
 // end-find-logical
 
@@ -87,7 +87,7 @@
 // start-find-array
 $cursor = $collection->find(['type' => ['$size' => 2]]);
 foreach ($cursor as $doc) {
-    echo json_encode($doc) . PHP_EOL;
+    echo json_encode($doc) , PHP_EOL;
 }
 // end-find-array
 
@@ -95,7 +95,7 @@
 // start-find-element
 $cursor = $collection->find(['color' => ['$exists' => true]]);
 foreach ($cursor as $doc) {
-    echo json_encode($doc) . PHP_EOL;
+    echo json_encode($doc) , PHP_EOL;
 }
 // end-find-element
 
@@ -103,6 +103,6 @@
 // start-find-evaluation
 $cursor = $collection->find(['name' => ['$regex' => 'p{2,}']]);
 foreach ($cursor as $doc) {
-    echo json_encode($doc) . PHP_EOL;
+    echo json_encode($doc) , PHP_EOL;
 }
 // end-find-evaluation
diff --git a/source/index.txt b/source/index.txt
index 7805e642..72d8b53b 100644
--- a/source/index.txt
+++ b/source/index.txt
@@ -13,6 +13,7 @@ MongoDB PHP Library
    Get Started
    /read
    /write
+   /aggregation
    /tutorial
    /upgrade
    /reference