首页 文章

Mongodb聚合框架|分组多个值?

提问于
浏览
26

我想使用mongoDB的聚合框架来运行SQL中看起来有点像:

SELECT SUM(A), B, C from myTable GROUP BY B, C;

文档说明:

您可以从管道中的文档指定单个字段,先前计算的值或由多个传入字段组成的聚合键 .

但目前还不清楚“几个传入领域的聚合密钥”实际上是什么?

我的数据集有点像这样:

[{ "timeStamp" : 1341834988666, "label" : "sharon", "responseCode" : "200", "value" : 10, "success" : "true"},
{ "timeStamp" : 1341834988676, "label" : "paul", "responseCode" : "200", "value" : 60, "success" : "true"},
{ "timeStamp" : 1341834988686, "label" : "paul", "responseCode" : "404", "value" : 15, "success" : "true"},
{ "timeStamp" : 1341834988696, "label" : "sharon", "responseCode" : "200", "value" : 35, "success" : "false"},
{ "timeStamp" : 1341834988166, "label" : "paul", "responseCode" : "200", "value" : 40, "success" : "true"},
{ "timeStamp" : 1341834988266, "label" : "paul", "responseCode" : "404", "value" : 99, "success" : "false"}]

我的查询如下所示:

resultsCollection.aggregate(
    { $match : { testid : testid} },
    { $skip : alreadyRead },
    { $project : {
            timeStamp : 1 ,
            label : 1,
            responseCode : 1 ,
            value : 1,
            success : 1
        }},
    { $group : {
            _id : "$label",
            max_timeStamp : { $timeStamp : 1 },
            count_responseCode : { $sum : 1 },
            avg_value : { $sum : "$value" },
            count_success : { $sum : 1 }
        }},
    { $group : {
            ?
        }}
);

我的直觉是尝试将结果传递给第二组,我知道你可以做到这一点,但它不会起作用,因为第一组已经过多地减少了数据集,并且所需的细节水平也丢失了 .

我想要做的是使用 labelresponseCodesuccess 进行分组,并从结果中获取值的总和 . 看起来应该有点像:

label   | code | success | sum_of_values | count
sharon  | 200  |  true   |      10       |   1
sharon  | 200  |  false  |      35       |   1
paul    | 200  |  true   |      100      |   2
paul    | 404  |  true   |      15       |   1
paul    | 404  |  false  |      99       |   1

哪里有五组:

1. { "timeStamp" : 1341834988666, "label" : "sharon", "responseCode" : "200", "value" : 10, "success" : "true"}

2. { "timeStamp" : 1341834988696, "label" : "sharon", "responseCode" : "200", "value" : 35, "success" : "false"}

3. { "timeStamp" : 1341834988676, "label" : "paul", "responseCode" : "200", "value" : 60, "success" : "true"}
   { "timeStamp" : 1341834988166, "label" : "paul", "responseCode" : "200", "value" : 40, "success" : "true"}

4. { "timeStamp" : 1341834988686, "label" : "paul", "responseCode" : "404", "value" : 15, "success" : "true"}

5. { "timeStamp" : 1341834988266, "label" : "paul", "responseCode" : "404", "value" : 99, "success" : "false"}

1 回答

  • 37

    好的,所以解决方案是为_id值指定聚合键 . 这记录在here中:

    您可以从管道中的文档指定单个字段,先前计算的值或由多个传入字段组成的聚合键 .

    但它实际上并没有定义聚合键的格式 . 阅读前面的文档here我看到以前的collection.group方法可以采用多个字段,并且在新框架中使用相同的结构 .

    因此,要分组多个字段,您可以使用 _id : { success:'$success', responseCode:'$responseCode', label:'$label'}

    如:

    resultsCollection.aggregate(
    { $match : { testid : testid} },
    { $skip : alreadyRead },
    { $project : {
            timeStamp : 1 ,
            label : 1,
            responseCode : 1 ,
            value : 1,
            success : 1
        }},
    { $group : {
            _id :  { success:'$success', responseCode:'$responseCode', label:'$label'},
            max_timeStamp : { $timeStamp : 1 },
            count_responseCode : { $sum : 1 },
            avg_value : { $sum : "$value" },
            count_success : { $sum : 1 }
        }}
    );
    

相关问题