MongoDB MapReduce usage examples (study notes)

  • 2020-05-17 06:51:56
  • OfStack

1. MapReduce groups documents by the first argument passed to emit() in the map function

Map-Reduce is a computing model that takes a large amount of work (data), breaks it down into pieces (map), and then merges the partial results into a final result (reduce).

Using MapReduce means implementing two functions, map and reduce. MongoDB runs map once for every document in the collection; map calls emit(key, value) to produce key-value pairs, and the values emitted under each key are then passed to reduce for processing. The map function must call emit(key, value) to produce any output.
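To make the grouping concrete, here is a minimal in-memory sketch of the map, group, reduce flow in plain JavaScript. It is not how MongoDB implements it: runMapReduce is a hypothetical helper, the sample documents are made up, and emit is passed to map as an argument here, whereas in the mongo shell it is available as a global.

```javascript
// In-memory sketch of map -> group by key -> reduce (illustration only).
function runMapReduce(docs, map, reduce) {
  const groups = new Map(); // key -> array of emitted values

  // emit() files each value under its key; this is what drives the grouping.
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };

  // As in MongoDB, map runs once per document with `this` bound to it.
  for (const doc of docs) map.call(doc, emit);

  // Collapse each key's values array into a single value.
  const out = [];
  for (const [key, values] of groups) {
    const value = values.length > 1 ? reduce(key, values) : values[0];
    out.push({ _id: key, value: value });
  }
  return out;
}

// Made-up sample documents shaped like the test data in these notes.
const users = [
  { name: "user10", age: 20, math: 71 },
  { name: "user11", age: 21, math: 88 },
  { name: "user12", age: 20, math: 93 },
];

const result = runMapReduce(
  users,
  function (emit) { emit(this.age, this.math); },          // group by age
  function (key, values) { return Math.max(...values); }   // best math score per age
);

console.log(result); // [ { _id: 20, value: 93 }, { _id: 21, value: 88 } ]
```

The two documents with age 20 end up under one key, so reduce sees [71, 93] for that key and a single value comes out.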

Parameter description:
1. map: the mapping function; it generates the sequence of key-value pairs that become the input of the reduce function.
2. reduce: the aggregation function; its task is to turn a key and its array of values into a single key-value pair, that is, to collapse the values array into one value.
3. out: where the results are stored, either a collection name or {inline: 1} to return them directly. (In old MongoDB versions, omitting out wrote to a temporary collection that was deleted automatically when the client disconnected; later versions require out.)
4. query: a filter; only documents that match it are passed to the map function. (query, sort, and limit can be combined freely.)
5. sort: sorts the documents before they are sent to the map function; it is used together with limit and can also optimize the grouping.
6. limit: the maximum number of documents sent to the map function. (Without limit, sort alone does not change the result.)
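The way query, sort, and limit narrow the input before map runs can be sketched in plain JavaScript. This is only an illustration under simplifying assumptions: selectInput is a hypothetical helper, query is a predicate function and sort a comparator here (MongoDB takes query and sort documents instead), and the order filter, then sort, then limit is my understanding of how the options combine.

```javascript
// Sketch of how the input to map is narrowed down (illustration only).
function selectInput(docs, options = {}) {
  const { query = () => true, sort = null, limit = 0 } = options;

  let selected = docs.filter(query);                  // 1. keep matching documents
  if (sort) selected = [...selected].sort(sort);      // 2. order them
  if (limit > 0) selected = selected.slice(0, limit); // 3. cap how many reach map
  return selected;
}

// Made-up sample documents.
const users = [
  { name: "user10", sex: "F", age: 25 },
  { name: "user11", sex: "M", age: 21 },
  { name: "user12", sex: "F", age: 20 },
  { name: "user13", sex: "F", age: 23 },
];

const input = selectInput(users, {
  query: (doc) => doc.sex === "F",
  sort: (a, b) => a.age - b.age,
  limit: 2,
});

const names = input.map((doc) => doc.name);
console.log(names); // [ 'user12', 'user13' ]
```

With limit set, the sort decides which documents survive the cut; without limit, all matching documents reach map whatever their order.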


// Test data preparation 
db.user.drop();

for(var i=10; i< 100; i++) {
  db.user.insert({
    name:"user" + i, 
    age : Math.floor(Math.random()*10)+ 20, 
    sex : Math.floor(Math.random()*3)%2 ==0 ? 'M' : 'F',
    chinese : Math.floor(Math.random()*50)+50,
    math : Math.floor(Math.random()*50)+50,
    english : Math.floor(Math.random()*50)+50,
    class : "C" + i%5
  })
}


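// runCommand mode
db.runCommand({
  mapReduce: "user",  // name of the collection to process

  map: function(){
    // Only documents from class C1 are emitted, keyed by age
    if(this.class == "C1") {
      emit(this.age, this.age);
    }
  },

  reduce: function(key, values){
    // values is an array, so pass it to Math.max through apply
    return Math.max.apply(null, values);
  },

  // Options are top-level fields of the command document
  out: {inline: 1},   // return the results directly
  query: {}           // match all documents
  // sort and limit are optional and omitted here
})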


db.user.mapReduce(
  // Map function: calls emit(key, value); documents are grouped by the key you specify.
  function(){
    // Grouping follows the first argument of emit.
    // The second argument is the value passed to reduce.
    // A numeric field (math, picked here for illustration) is emitted so reduce can take a maximum.
    emit(this.age, this.math);
  },

  // Reduce function: collapses the values that map grouped under each key.
  // In reduce(key, values), key is the key passed to emit and values is the array of all values emitted under that key.
  function(key, values){
    return Math.max.apply(null, values);
  },

  // Optional parameters
  {
    query: {sex: "F"},
    out: "result",
    sort : {},
    limit : 0
  }
)

Execution results:


{
  "result" : "result", // Name of the collection the results are stored in
  "timeMillis" : 23,
  "counts" : {
    "input" : 29, // Number of documents passed to the map function
    "emit" : 29,  // Number of times emit was called
    "reduce" : 6, // Number of times the reduce function was called
    "output" : 8  // Number of documents in the output
  },
  },
  "ok" : 1
}
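One plausible reading of why reduce (6) is smaller than output (8): reduce is not called for a key that has only a single emitted value, so two of the eight keys never needed reducing. The sketch below reproduces that bookkeeping in plain JavaScript. mapReduceWithCounts is a hypothetical helper, the data is made up, and real MongoDB may call reduce more than once for the same key, so this is a simplification.

```javascript
// Sketch of how the "counts" fields could be tallied (illustration only).
function mapReduceWithCounts(docs, map, reduce) {
  const counts = { input: 0, emit: 0, reduce: 0, output: 0 };
  const groups = new Map(); // key -> array of emitted values

  const emit = (key, value) => {
    counts.emit += 1;
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };

  for (const doc of docs) {
    counts.input += 1;
    map.call(doc, emit);
  }

  const results = [];
  for (const [key, values] of groups) {
    let value = values[0];
    if (values.length > 1) {
      // reduce is only needed when a key has more than one value
      counts.reduce += 1;
      value = reduce(key, values);
    }
    results.push({ _id: key, value: value });
    counts.output += 1;
  }
  return { results: results, counts: counts };
}

// Made-up ages: 20 appears twice, 21 once, 22 three times.
const docs = [20, 20, 21, 22, 22, 22].map((age) => ({ age: age }));

const run = mapReduceWithCounts(
  docs,
  function (emit) { emit(this.age, 1); },
  function (key, values) { return values.reduce((a, b) => a + b, 0); }
);

console.log(run.counts); // { input: 6, emit: 6, reduce: 2, output: 3 }
```

Age 21 has a single value, so it goes straight to the output without a reduce call, which is why output ends up one higher than reduce.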

View the stored results:


db.result.find()
