I am trying to query all data from the errorlog collection and, in the same query, get the count of related irs_documents for each error log entry.

The problem is that there are too many records in the irs_documents collection to perform a $lookup.

Is there an efficient way to do this in a single MongoDB query?

Failed attempt

db.getCollection('errorlog').aggregate(
  [
    {
        $lookup: {
          from: "irs_documents",
          localField: "document.ssn",
          foreignField: "ssn",
          as: "irs_documents"
        }
    },
    {
        $group: {
            _id: { document: "$document", error: "$error" },
            logged_documents: { $sum : 1 }
        }
    }
  ]
)

Error

Total size of documents in $lookup exceeds maximum document size

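Clearly, this solution doesn't work. With $lookup, MongoDB actually tries to pull in the entire matching documents, whereas I only want the count.

Sample data for the "errorlog" collection: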

/* 1 */
{
    "_id" : ObjectId("56d73955ce09a5a32399f022"),
    "document" : {
        "ssn" : 1
    },
    "error" : "Error 1"
}

/* 2 */
{
    "_id" : ObjectId("56d73967ce09a5a32399f023"),
    "document" : {
        "ssn" : 2
    },
    "error" : "Error 1"
}

/* 3 */
{
    "_id" : ObjectId("56d73979ce09a5a32399f024"),
    "document" : {
        "ssn" : 3
    },
    "error" : "Error 429"
}

/* 4 */
{
    "_id" : ObjectId("56d73985ce09a5a32399f025"),
    "document" : {
        "ssn" : 9
    },
    "error" : "Error 1"
}

/* 5 */
{
    "_id" : ObjectId("56d73990ce09a5a32399f026"),
    "document" : {
        "ssn" : 1
    },
    "error" : "Error 8"
}

Sample data for the "irs_documents" collection:

/* 1 */
{
    "_id" : ObjectId("56d73905ce09a5a32399f01e"),
    "ssn" : 1,
    "name" : "Sally"
}

/* 2 */
{
    "_id" : ObjectId("56d7390fce09a5a32399f01f"),
    "ssn" : 2,
    "name" : "Bob"
}

/* 3 */
{
    "_id" : ObjectId("56d7391ace09a5a32399f020"),
    "ssn" : 3,
    "name" : "Kelly"
}

/* 4 */
{
    "_id" : ObjectId("56d7393ace09a5a32399f021"),
    "ssn" : 9,
    "name" : "Pippinpaddle-Oppsokopolis"
}

1 Answer

1
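The error is self-explanatory. $lookup embeds every matching irs_documents document into the errorlog document as an array, producing a single BSON document, so you are running into MongoDB's document size limit (16 MB).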

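You need to ask yourself whether it is absolutely necessary to perform both actions in one operation. If it is not, do it the way it was done in earlier versions of MongoDB, where $lookup was not supported.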

That is, run two queries and merge the results on the client side.
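A minimal sketch of that client-side merge, in the JavaScript used by the mongo shell. The helper name `attachCounts` and the output field `irs_documents_count` are made up for illustration, and the two inputs are assumed to be plain arrays already fetched by two separate queries.

```javascript
// Hypothetical helper: attach a per-ssn document count to every errorlog entry.
// errorLogs: array of errorlog documents ({ document: { ssn }, error })
// ssnCounts: array of { _id: <ssn>, count: <n> }, e.g. the result of
//            grouping irs_documents by ssn in a separate query
function attachCounts(errorLogs, ssnCounts) {
  // Index the counts by ssn so each errorlog entry is resolved with one lookup.
  const countBySsn = new Map(ssnCounts.map((c) => [c._id, c.count]));
  return errorLogs.map((log) => ({
    ...log,
    // ssn values with no matching irs_documents get a count of 0
    irs_documents_count: countBySsn.get(log.document.ssn) || 0,
  }));
}

// Sample data taken from the question.
const errorLogs = [
  { document: { ssn: 1 }, error: "Error 1" },
  { document: { ssn: 2 }, error: "Error 1" },
  { document: { ssn: 3 }, error: "Error 429" },
  { document: { ssn: 9 }, error: "Error 1" },
  { document: { ssn: 1 }, error: "Error 8" },
];
const ssnCounts = [
  { _id: 1, count: 1 },
  { _id: 2, count: 1 },
  { _id: 3, count: 1 },
  { _id: 9, count: 1 },
];

const merged = attachCounts(errorLogs, ssnCounts);
```

In the shell, the two inputs could come from `db.errorlog.find().toArray()` and `db.irs_documents.aggregate([{ $group: { _id: "$ssn", count: { $sum: 1 } } }]).toArray()`. Only the small per-ssn counts cross the wire, never the irs_documents themselves.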

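Option #1: You can aggregate on irs_documents and write the computed result out to another collection. Since each resulting document holds very little data, I don't think you will run into the size problem. However, you may hit the memory limit and be forced to let the aggregation framework use disk. Try the following solution and see whether it works.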
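```
// Step 1: pre-compute the number of irs_documents per ssn and store it in a
// small helper collection; each output document is { _id: <ssn>, count: <n> }.
// allowDiskUse lets $group spill to disk if it exceeds the memory limit.
db.irs_documents.aggregate([
  { $group: { _id: "$ssn", count: { $sum: 1 } } },
  { $out: "irs_documents_group" }
], { allowDiskUse: true });

// Step 2: join each errorlog entry against the pre-computed counts.
db.errorlog.aggregate([
  {
    $lookup: {
      from: "irs_documents_group",
      localField: "document.ssn",
      foreignField: "_id", // $group stored the ssn in _id, not in an ssn field
      as: "irs_documents"
    }
  },
  {
    $group: {
      _id: { document: "$document", error: "$error" },
      logged_documents: { $sum: 1 },
      // at most one helper document matches per ssn; default to 0 if none does
      irs_documents_count: {
        $first: { $ifNull: [{ $arrayElemAt: ["$irs_documents.count", 0] }, 0] }
      }
    }
  }
]);
```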
Option #2: If the solution above doesn't work, you can always use map-reduce. It is not an elegant solution, but it will work.
Answered 2024-04-29T06:17:04+08:00
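One way the map-reduce route could look, as a sketch rather than the answerer's exact code. `mapSsn` and `reduceCount` are illustrative names; `emit`, `runLocally`, and the `emitted` array below them are a local stand-in for the server, included only so the two functions can be tried without a database.

```javascript
// map: called once per irs_documents document (bound to `this`);
// emits one (ssn, 1) pair per document.
function mapSsn() {
  emit(this.ssn, 1);
}

// reduce: sums the emitted values for one ssn. A plain sum can safely be
// re-applied to its own output, which map-reduce requires.
function reduceCount(key, values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) total += values[i];
  return total;
}

// --- Local stand-in for the server (not part of MongoDB) ---
var emitted = [];
function emit(key, value) {
  emitted.push({ key: key, value: value });
}

function runLocally(docs) {
  emitted = [];
  docs.forEach(function (doc) { mapSsn.call(doc); });
  // Group emitted values by key, then reduce each group.
  var grouped = new Map();
  emitted.forEach(function (e) {
    if (!grouped.has(e.key)) grouped.set(e.key, []);
    grouped.get(e.key).push(e.value);
  });
  var out = [];
  grouped.forEach(function (values, key) {
    out.push({ _id: key, value: reduceCount(key, values) });
  });
  return out;
}

// Sample irs_documents from the question.
var counts = runLocally([
  { ssn: 1, name: "Sally" },
  { ssn: 2, name: "Bob" },
  { ssn: 3, name: "Kelly" },
  { ssn: 9, name: "Pippinpaddle-Oppsokopolis" }
]);
```

Against a real server the same two functions would be passed as `db.irs_documents.mapReduce(mapSsn, reduceCount, { out: "irs_documents_group" })`, which writes documents of the form `{ _id: <ssn>, value: <count> }`. Note that `mapReduce` is deprecated as of MongoDB 5.0 in favor of the aggregation pipeline, so this is mainly relevant to older deployments.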

MongoDB aggregate query: count records for a large data set
