3.22. MongoDB aggregation

Published: 2025-10-25 12:32:57 UTC

Aggregation in MongoDB is mainly used to process data (for example, computing averages or sums) and return the computed results.

It is somewhat similar to count(*) combined with GROUP BY in an SQL statement.

3.22.1. aggregate() Method

Aggregation in MongoDB is performed with the aggregate() method.

Syntax

The basic syntax of the aggregate() method is as follows:

> db.COLLECTION_NAME.aggregate(AGGREGATE_OPERATION)

Example

The data in the collection is as follows:

{
   _id: ObjectId(7df78ad8902c),
   title: 'MongoDB Overview',
   description: 'MongoDB is no sql database',
   by_user: 'runoob.com',
   url: 'http://www.runoob.com',
   tags: ['mongodb', 'database', 'NoSQL'],
   likes: 100
},
{
   _id: ObjectId(7df78ad8902d),
   title: 'NoSQL Overview',
   description: 'No sql database is very fast',
   by_user: 'runoob.com',
   url: 'http://www.runoob.com',
   tags: ['mongodb', 'database', 'NoSQL'],
   likes: 10
},
{
   _id: ObjectId(7df78ad8902e),
   title: 'Neo4j Overview',
   description: 'Neo4j is no sql database',
   by_user: 'Neo4j',
   url: 'http://www.neo4j.com',
   tags: ['neo4j', 'database', 'NoSQL'],
   likes: 750
},

Now we use the collection above to count the number of articles written by each author. The result of aggregate() is as follows:

> db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$sum : 1}}}])
{
   "result" : [
      {
         "_id" : "runoob.com",
         "num_tutorial" : 2
      },
      {
         "_id" : "Neo4j",
         "num_tutorial" : 1
      }
   ],
   "ok" : 1
}
>

The above example is similar to the SQL statement:

select by_user, count(*) from mycol group by by_user

In the above example, we group the documents by the by_user field and count how many documents share each by_user value.
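The grouping-and-counting step described above can be sketched in plain JavaScript, without a MongoDB connection. This is only an illustration of what {$sum : 1} computes per group, not how the server implements it. The countBy helper and the trimmed sample documents are assumptions made for this sketch.

```javascript
// Plain JavaScript sketch (no database): mimics
// {$group : {_id : "$by_user", num_tutorial : {$sum : 1}}} on an in-memory array.
const docs = [
  { title: 'MongoDB Overview', by_user: 'runoob.com', likes: 100 },
  { title: 'NoSQL Overview', by_user: 'runoob.com', likes: 10 },
  { title: 'Neo4j Overview', by_user: 'Neo4j', likes: 750 },
];

// countBy is a hypothetical helper, not a MongoDB API.
function countBy(documents, field) {
  const counts = new Map();
  for (const doc of documents) {
    const key = doc[field];
    // {$sum : 1} adds 1 for every document that falls into the group.
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // Shape the output like the $group result: one document per distinct key.
  return [...counts].map(([key, n]) => ({ _id: key, num_tutorial: n }));
}

const perAuthor = countBy(docs, 'by_user');
console.log(perAuthor);
```

The sketch returns groups in first-seen order because a Map preserves insertion order. MongoDB itself does not guarantee any particular order for $group output.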

The following table shows some aggregate expressions:

| Expression | Description | Example |
|---|---|---|
| $sum | Calculates the sum. | db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$sum : "$likes"}}}]) |
| $avg | Calculates the average. | db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$avg : "$likes"}}}]) |
| $min | Returns the minimum value among the documents in each group. | db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$min : "$likes"}}}]) |
| $max | Returns the maximum value among the documents in each group. | db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$max : "$likes"}}}]) |
| $push | Appends each value to an array in the result document, without checking for duplicates. | db.mycol.aggregate([{$group : {_id : "$by_user", url : {$push : "$url"}}}]) |
| $addToSet | Adds each value to an array in the result document only if the array does not already contain it. | db.mycol.aggregate([{$group : {_id : "$by_user", url : {$addToSet : "$url"}}}]) |
| $first | Returns the value from the first document in each group, according to the order of the input documents. | db.mycol.aggregate([{$group : {_id : "$by_user", first_url : {$first : "$url"}}}]) |
| $last | Returns the value from the last document in each group, according to the order of the input documents. | db.mycol.aggregate([{$group : {_id : "$by_user", last_url : {$last : "$url"}}}]) |
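To make the differences between the expressions listed above concrete, here is a minimal plain JavaScript sketch that applies an equivalent of each one to the sample data, grouped by by_user. It runs without a database and is only an approximation of the semantics. The variable names (articles, groups, summary) and the output field names are assumptions chosen for this sketch.

```javascript
// Plain JavaScript sketch of the accumulator expressions; this is not MongoDB code.
const articles = [
  { by_user: 'runoob.com', url: 'http://www.runoob.com', likes: 100 },
  { by_user: 'runoob.com', url: 'http://www.runoob.com', likes: 10 },
  { by_user: 'Neo4j', url: 'http://www.neo4j.com', likes: 750 },
];

// Collect the documents of each group, keyed by by_user, in input order.
const groups = new Map();
for (const doc of articles) {
  if (!groups.has(doc.by_user)) groups.set(doc.by_user, []);
  groups.get(doc.by_user).push(doc);
}

// Apply an equivalent of each accumulator to the documents of one group.
const summary = [...groups].map(([author, members]) => {
  const likes = members.map((d) => d.likes);
  const urls = members.map((d) => d.url);
  const total = likes.reduce((a, b) => a + b, 0);
  return {
    _id: author,
    sum: total,                  // $sum
    avg: total / likes.length,   // $avg
    min: Math.min(...likes),     // $min
    max: Math.max(...likes),     // $max
    pushed: urls,                // $push keeps duplicates
    unique: [...new Set(urls)],  // $addToSet drops duplicates
    first: urls[0],              // $first
    last: urls[urls.length - 1], // $last
  };
});

console.log(JSON.stringify(summary, null, 2));
```

For the runoob.com group, pushed holds the same URL twice while unique holds it once, which is the practical difference between $push and $addToSet.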

3.22.2. The pipeline concept

Pipes are commonly used in Unix and Linux to pass the output of one command as the input to the next command.

MongoDB’s aggregation pipeline works the same way: documents pass through a sequence of stages, and after one stage finishes processing, its output becomes the input of the next stage. Stages can be repeated.

Expressions process the input document and produce an output. Expressions are stateless: they can only operate on the current document in the aggregation pipeline, not on other documents.

Here we introduce several stages commonly used in the aggregation framework:
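The idea that each stage consumes the output of the previous one can be sketched in plain JavaScript as function composition over an array of documents. This is an analogy rather than MongoDB's implementation. The match, sortBy, limit, and runPipeline helpers are hypothetical names introduced for this sketch.

```javascript
// Each stage is modeled as a function from an array of documents to a new array.
const match = (predicate) => (docs) => docs.filter(predicate);
const sortBy = (field, direction) => (docs) =>
  [...docs].sort((a, b) => direction * (a[field] - b[field])); // copy, then sort
const limit = (count) => (docs) => docs.slice(0, count);

// runPipeline (hypothetical) feeds the output of each stage into the next one.
const runPipeline = (docs, stages) =>
  stages.reduce((current, stage) => stage(current), docs);

const input = [
  { title: 'MongoDB Overview', likes: 100 },
  { title: 'NoSQL Overview', likes: 10 },
  { title: 'Neo4j Overview', likes: 750 },
];

// Keep articles with at least 50 likes, order by likes descending, take the top one.
const top = runPipeline(input, [
  match((d) => d.likes >= 50),
  sortBy('likes', -1),
  limit(1),
]);

console.log(top);
```

Because every stage returns a new array, the order of the stages matters: limiting before sorting would pick a different document.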

  • $project: modifies the structure of the input documents. It can be used to rename, add, or remove fields, and to create computed values and nested documents.

  • $match: filters the documents and outputs only those that meet the criteria. $match uses MongoDB’s standard query operators.

  • $limit: limits the number of documents passed on by the aggregation pipeline.

  • $skip: skips the specified number of documents in the aggregation pipeline and passes on the remaining documents.

  • $unwind: splits an array field in a document into multiple documents, each containing one value from the array.

  • $group: groups the documents in the collection and can be used to compute statistics for each group.

  • $sort: sorts the input documents and outputs them in order.

  • $geoNear: outputs documents ordered by their proximity to a geographic location.
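Of the stages listed above, $unwind is often the least intuitive, so here is a minimal plain JavaScript sketch of its effect on the tags field of the sample data. It runs without a database. The unwind helper is a hypothetical name for this sketch, and it ignores edge cases such as missing or empty arrays.

```javascript
const posts = [
  { title: 'MongoDB Overview', tags: ['mongodb', 'database', 'NoSQL'] },
  { title: 'Neo4j Overview', tags: ['neo4j', 'database', 'NoSQL'] },
];

// unwind (hypothetical) mimics {$unwind : "$tags"}: one output document per
// array element, with the array field replaced by that single element.
function unwind(documents, field) {
  return documents.flatMap((doc) =>
    doc[field].map((value) => ({ ...doc, [field]: value }))
  );
}

const unwound = unwind(posts, 'tags');
console.log(unwound.length); // 2 documents with 3 tags each give 6 documents

// A typical follow-up is grouping by the unwound value, here counting each tag.
const tagCounts = {};
for (const doc of unwound) {
  tagCounts[doc.tags] = (tagCounts[doc.tags] || 0) + 1;
}
console.log(tagCounts);
```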

Pipeline operator examples

1. $project example

db.article.aggregate([
    { $project : {
        title : 1 ,
        author : 1
    }}
]);

This way, the result contains only three fields: _id, title, and author. The _id field is included by default. To exclude _id, do this:

db.article.aggregate([
    { $project : {
        _id : 0 ,
        title : 1 ,
        author : 1
    }}
]);
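The inclusion rules of $project described above can be approximated in plain JavaScript. This sketch covers only simple inclusion with 1 and the special handling of _id, not renaming or computed fields. The project helper and the sample document are assumptions made for illustration.

```javascript
// project (hypothetical) mimics a $project stage that only includes fields.
function project(documents, spec) {
  return documents.map((doc) => {
    const out = {};
    // _id is kept by default and dropped only when the spec sets it to 0.
    if (spec._id !== 0 && '_id' in doc) out._id = doc._id;
    for (const [field, flag] of Object.entries(spec)) {
      if (field !== '_id' && flag === 1 && field in doc) out[field] = doc[field];
    }
    return out;
  });
}

const sample = [
  { _id: 1, title: 'MongoDB Overview', author: 'runoob.com', likes: 100 },
];

const withId = project(sample, { title: 1, author: 1 });
const withoutId = project(sample, { _id: 0, title: 1, author: 1 });

console.log(withId);    // keeps _id, title, author
console.log(withoutId); // keeps title, author
```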

2. $match example

db.articles.aggregate([
    { $match : { score : { $gt : 70, $lte : 90 } } },
    { $group : { _id : null, count : { $sum : 1 } } }
]);

$match selects the documents whose score is greater than 70 and less than or equal to 90, and then passes the matching documents to the next stage, $group, for processing.
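The boundary behavior of this filter (70 is excluded, 90 is included) is easy to check with a small plain JavaScript sketch. The sample scores are invented for illustration, and the code only imitates the two stages in memory.

```javascript
// Hypothetical sample data; only the score field matters here.
const scored = [
  { title: 'a', score: 65 },
  { title: 'b', score: 70 },
  { title: 'c', score: 85 },
  { title: 'd', score: 90 },
  { title: 'e', score: 95 },
];

// Equivalent of {$match : {score : {$gt : 70, $lte : 90}}}.
const matched = scored.filter((d) => d.score > 70 && d.score <= 90);

// Equivalent of {$group : {_id : null, count : {$sum : 1}}}:
// a single group that counts every matched document.
const grouped = [{ _id: null, count: matched.length }];

console.log(matched.map((d) => d.title));
console.log(grouped);
```

Grouping with _id set to null puts every document into one group, so the result is a single document holding the total count.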

3. $skip example

db.article.aggregate([
    { $skip : 5 }
]);

After the $skip stage, the first five documents are skipped and only the remaining documents are passed on.
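A common use of $skip is paging, where it is followed by $limit. The plain JavaScript sketch below imitates both stages on an in-memory array. The skipStage and limitStage helpers and the twelve numbered rows are assumptions made for illustration.

```javascript
// Twelve hypothetical documents numbered 1 to 12.
const rows = Array.from({ length: 12 }, (_, i) => ({ n: i + 1 }));

// In-memory stand-ins for {$skip : count} and {$limit : count}.
const skipStage = (count) => (docs) => docs.slice(count);
const limitStage = (count) => (docs) => docs.slice(0, count);

// Skip the first five documents, as in the example above.
const afterSkip = skipStage(5)(rows);

// Then keep at most five of the remaining ones, which yields the second page.
const page = limitStage(5)(afterSkip);

console.log(afterSkip.map((d) => d.n));
console.log(page.map((d) => d.n));
```

Without a preceding $sort, the order of the documents is not guaranteed in MongoDB, so paging like this is usually combined with an explicit sort.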
