如何在MongoDB中执行SQL Join等效操作？-Java 学习之路

421

如何在MongoDB中执行SQL Join等效操作？

例如，假设你有两个集合（用户和评论），我想用pid = 444以及每个集合的用户信息来提取所有评论 .

comments
  { uid:12345, pid:444, comment="blah" }
  { uid:12345, pid:888, comment="asdf" }
  { uid:99999, pid:444, comment="qwer" }

users
  { uid:12345, name:"john" }
  { uid:99999, name:"mia"  }

有没有办法用一个字段拉出所有评论（例如......查找（{pid：444}））和一次性与每个评论相关的用户信息？

目前，我首先得到符合我标准的评论，然后找出该结果集中的所有uid，获取用户对象，并将它们与评论的结果合并 . 好像我做错了 .

19 回答

我们可以使用mongodb客户端控制台将只有一行的简单函数合并/加入一个集合中的所有数据，现在我们可以执行所需的查询 . 下面是一个完整的例子

.-作者：

db.authors.insert([
    {
        _id: 'a1',
        name: { first: 'orlando', last: 'becerra' },
        age: 27
    },
    {
        _id: 'a2',
        name: { first: 'mayra', last: 'sanchez' },
        age: 21
    }
]);

.-分类：

db.categories.insert([
    {
        _id: 'c1',
        name: 'sci-fi'
    },
    {
        _id: 'c2',
        name: 'romance'
    }
]);

.-书籍

db.books.insert([
    {
        _id: 'b1',
        name: 'Groovy Book',
        category: 'c1',
        authors: ['a1']
    },
    {
        _id: 'b2',
        name: 'Java Book',
        category: 'c2',
        authors: ['a1','a2']
    },
]);

.-图书借阅

db.lendings.insert([
    {
        _id: 'l1',
        book: 'b1',
        date: new Date('01/01/11'),
        lendingBy: 'jose'
    },
    {
        _id: 'l2',
        book: 'b1',
        date: new Date('02/02/12'),
        lendingBy: 'maria'
    }
]);

. - 魔法：

db.books.find().forEach(
    function (newBook) {
        newBook.category = db.categories.findOne( { "_id": newBook.category } );
        newBook.lendings = db.lendings.find( { "book": newBook._id  } ).toArray();
        newBook.authors = db.authors.find( { "_id": { $in: newBook.authors }  } ).toArray();
        db.booksReloaded.insert(newBook);
    }
);

.-获取新的收集数据：

db.booksReloaded.find().pretty()

.-响应:)

{
    "_id" : "b1",
    "name" : "Groovy Book",
    "category" : {
        "_id" : "c1",
        "name" : "sci-fi"
    },
    "authors" : [
        {
            "_id" : "a1",
            "name" : {
                "first" : "orlando",
                "last" : "becerra"
            },
            "age" : 27
        }
    ],
    "lendings" : [
        {
            "_id" : "l1",
            "book" : "b1",
            "date" : ISODate("2011-01-01T00:00:00Z"),
            "lendingBy" : "jose"
        },
        {
            "_id" : "l2",
            "book" : "b1",
            "date" : ISODate("2012-02-02T00:00:00Z"),
            "lendingBy" : "maria"
        }
    ]
}
{
    "_id" : "b2",
    "name" : "Java Book",
    "category" : {
        "_id" : "c2",
        "name" : "romance"
    },
    "authors" : [
        {
            "_id" : "a1",
            "name" : {
                "first" : "orlando",
                "last" : "becerra"
            },
            "age" : 27
        },
        {
            "_id" : "a2",
            "name" : {
                "first" : "mayra",
                "last" : "sanchez"
            },
            "age" : 21
        }
    ],
    "lendings" : [ ]
}

我希望这条线可以帮到你 .

回复于 2024-05-10T09:09:46+08:00

2

playORM可以使用S-SQL（可伸缩SQL）为您完成，它只是添加分区，以便您可以在分区中进行连接 .

回复于 2024-05-10T09:09:46+08:00
134
$ lookup（聚合）

对同一数据库中的未整数集合执行左外连接，以过滤“已连接”集合中的文档以进行处理 . 对于每个输入文档，$ lookup阶段添加一个新的数组字段，其元素是来自“已连接”集合的匹配文档 . $ lookup阶段将这些重新整形的文档传递给下一个阶段 . $ lookup阶段具有以下语法：

平等比赛

要在输入文档中的字段与“已连接”集合的文档中的字段之间执行相等匹配，$ lookup阶段具有以下语法：
```
{
   $lookup:
     {
       from: <collection to join>,
       localField: <field from the input documents>,
       foreignField: <field from the documents of the "from" collection>,
       as: <output array field>
     }
}
```
该操作将对应于以下伪SQL语句：
```
SELECT *, <output array field>
FROM collection
WHERE <output array field> IN (SELECT <documents as determined from the pipeline>
                               FROM <collection to join>
                               WHERE <pipeline> );
```
Mongo URL
回复于 2024-05-10T09:09:46+08:00
-3
不，似乎你做错了 . MongoDB加入是“客户端” . 非常像你说的：

目前，我首先得到符合我标准的评论，然后找出该结果集中的所有uid，获取用户对象，并将它们与评论的结果合并 . 好像我做错了 .
```
1) Select from the collection you're interested in.
2) From that collection pull out ID's you need
3) Select from other collections
4) Decorate your original results.
```
它不是一个“真正的”连接，但实际上它比SQL连接更有用，因为你不必处理“多”面连接的重复行，而是装饰最初选择的集合 .

这个页面上有很多废话和FUD . 5年后，MongoDB仍然是一个问题 .
回复于 2024-05-10T09:09:46+08:00
10

你必须按照你描述的方式去做 . MongoDB是一个非关系型数据库，不支持连接 .

回复于 2024-05-10T09:09:46+08:00
17

官方mongodb网站上的这个页面正好解决了这个问题：

http://docs.mongodb.org/ecosystem/tutorial/model-data-for-ruby-on-rails/

当我们显示故事列表时，我们需要显示发布故事的用户的姓名 . 如果我们使用关系数据库，我们可以在用户和商店上执行连接，并在单个查询中获取所有对象 . 但MongoDB不支持连接，因此有时需要一些非规范化 . 在这里，这意味着缓存'username'属性 . 关系纯粹主义者可能已经感到不安，好像我们违反了一些普遍的法律 . 但是请记住，MongoDB集合不等同于关系表;每个都有一个独特的设计目标 . 规范化表提供原子的，孤立的数据块 . 然而，文档更接近地代表整个对象 . 在社交新闻网站的情况下，可以认为用户名是发布的故事所固有的 .

回复于 2024-05-10T09:09:46+08:00

通过 $lookup ， $project 和 $match 的正确组合，您可以在多个参数上连接多个表 . 这是因为它们可以链接多次 .

假设我们想做以下（reference）

SELECT S.* FROM LeftTable S
LEFT JOIN RightTable R ON S.ID =R.ID AND S.MID =R.MID WHERE R.TIM >0 AND 
S.MOB IS NOT NULL

Step 1: Link all tables

您可以根据需要查找任意数量的表 .

$lookup - 查询中每个表一个

$unwind - 因为数据是非正规化的，否则包裹在数组中

Python代码..

db.LeftTable.aggregate([
                        # connect all tables

                        {"$lookup": {
                          "from": "RightTable",
                          "localField": "ID",
                          "foreignField": "ID",
                          "as": "R"
                        }},
                        {"$unwind": "R"}

                        ])

Step 2: Define all conditionals

$project ：在此定义所有条件语句，以及您要选择的所有变量 .

Python代码..

db.LeftTable.aggregate([
                        # connect all tables

                        {"$lookup": {
                          "from": "RightTable",
                          "localField": "ID",
                          "foreignField": "ID",
                          "as": "R"
                        }},
                        {"$unwind": "R"},

                        # define conditionals + variables

                        {"$project": {
                          "midEq": {"$eq": ["$MID", "$R.MID"]},
                          "ID": 1, "MOB": 1, "MID": 1
                        }}
                        ])

Step 3: Join all the conditionals

$match - 使用OR或AND等连接所有条件 . 可以有多个这些条件 .

$project ：取消定义所有条件

Python代码..

db.LeftTable.aggregate([
                        # connect all tables

                        {"$lookup": {
                          "from": "RightTable",
                          "localField": "ID",
                          "foreignField": "ID",
                          "as": "R"
                        }},
                        {"$unwind": "$R"},

                        # define conditionals + variables

                        {"$project": {
                          "midEq": {"$eq": ["$MID", "$R.MID"]},
                          "ID": 1, "MOB": 1, "MID": 1
                        }},

                        # join all conditionals

                        {"$match": {
                          "$and": [
                            {"R.TIM": {"$gt": 0}}, 
                            {"MOB": {"$exists": True}},
                            {"midEq": {"$eq": True}}
                        ]}},

                        # undefine conditionals

                        {"$project": {
                          "midEq": 0
                        }}

                        ])

几乎任何表，条件和连接的组合都可以这种方式完成 .

回复于 2024-05-10T09:09:46+08:00

在 3.2.6 之前，Mongodb不像mysql那样支持连接查询 . 以下解决方案适合您 .

db.getCollection('comments').aggregate([
        {$match : {pid : 444}},
        {$lookup: {from: "users",localField: "uid",foreignField: "uid",as: "userData"}},
   ])

回复于 2024-05-10T09:09:46+08:00

128
从Mongo 3.2开始，这个问题的答案大多不再正确 . 添加到聚合管道的新$ lookup运算符与左外连接基本相同：

https://docs.mongodb.org/master/reference/operator/aggregation/lookup/#pipe._S_lookup

来自文档：
```
{
   $lookup:
     {
       from: <collection to join>,
       localField: <field from the input documents>,
       foreignField: <field from the documents of the "from" collection>,
       as: <output array field>
     }
}
```
当然Mongo不是一个关系数据库，开发人员正在小心推荐$ lookup的特定用例，但至少是3.2现在可以使用MongoDB进行连接 .
回复于 2024-05-10T09:09:46+08:00
9

有一个规范，许多驱动程序支持称为DBRef .

DBRef是用于在文档之间创建引用的更正式的规范 . DBRefs（通常）包括集合名称和对象ID . 如果集合可以从一个文档更改为下一个文档，则大多数开发人员仅使用DBRef . 如果您引用的集合始终相同，则上面列出的手动参考更有效 .

摘自MongoDB文档：数据模型>数据模型参考> Database References

回复于 2024-05-10T09:09:46+08:00

244

您可以使用3.2版本中提供的查找在Mongo中加入两个集合 . 在您的情况下，查询将是

db.comments.aggregate({
    $lookup:{
        from:"users",
        localField:"uid",
        foreignField:"uid",
        as:"users_comments"
    }
})

或者您也可以加入用户，然后会有一点变化，如下所示 .

db.users.aggregate({
    $lookup:{
        from:"comments",
        localField:"uid",
        foreignField:"uid",
        as:"users_comments"
    }
})

它将与SQL中的左右连接一样工作 .

回复于 2024-05-10T09:09:46+08:00

0

以下是"join" * Actors 和 Movies 集合的示例：

https://github.com/mongodb/cookbook/blob/master/content/patterns/pivot.txt

它使用 .mapReduce() 方法

*** join** - 加入面向文档的数据库的替代方案

回复于 2024-05-10T09:09:46+08:00
-4
我认为，如果您需要规范化数据表 - 您需要尝试其他一些数据库解决方案 .

但是我已经在_355251上找到了MOngo的溶剂 . 顺便说一下，在插入代码中 - 它有电影的名字， but noi movie's ID .

问题

你有一系列演员，他们已经完成了一系列电影 .

您想要生成一组电影，每个电影都有一系列演员 .

一些样本数据
```
db.actors.insert( { actor: "Richard Gere", movies: ['Pretty Woman', 'Runaway Bride', 'Chicago'] });
 db.actors.insert( { actor: "Julia Roberts", movies: ['Pretty Woman', 'Runaway Bride', 'Erin Brockovich'] });
```
解决方案

我们需要遍历Actor文档中的每个影片并单独发出每个影片 .

这里的问题是在减少阶段 . 我们无法从reduce阶段发出数组，因此我们必须在返回的“value”文档中构建一个Actors数组 .
代码
```
map = function() {
  for(var i in this.movies){
    key = { movie: this.movies[i] };
    value = { actors: [ this.actor ] };
    emit(key, value);
  }
}

reduce = function(key, values) {
  actor_list = { actors: [] };
  for(var i in values) {
    actor_list.actors = values[i].actors.concat(actor_list.actors);
  }
  return actor_list;
}
```
请注意actor_list实际上是一个包含数组的javascript对象 . 另请注意， Map 会发出相同的结构 .

运行以下命令执行map / reduce，将其输出到“pivot”集合并打印结果：

printjson（db.actors.mapReduce（map，reduce，“pivot”））; db.pivot.find（）的forEach（printjson） .

以下是样本输出，请注意“漂亮女人”和“失控新娘”都有“Richard Gere”和“Julia Roberts” .
```
{ "_id" : { "movie" : "Chicago" }, "value" : { "actors" : [ "Richard Gere" ] } }
{ "_id" : { "movie" : "Erin Brockovich" }, "value" : { "actors" : [ "Julia Roberts" ] } }
{ "_id" : { "movie" : "Pretty Woman" }, "value" : { "actors" : [ "Richard Gere", "Julia Roberts" ] } }
{ "_id" : { "movie" : "Runaway Bride" }, "value" : { "actors" : [ "Richard Gere", "Julia Roberts" ] } }
```
回复于 2024-05-10T09:09:46+08:00
-2

这取决于你想要做什么 .

您目前将其设置为规范化数据库，这很好，并且您采用的方式也是合适的 .

但是，还有其他方法可以做到这一点 .

您可以拥有一个帖子集合，其中包含对每个帖子的嵌入评论，并引用您可以迭代查询的用户 . 您可以将用户的名称与注释一起存储，您可以将它们全部存储在一个文档中 .

NoSQL的用途是它设计用于灵活的模式和非常快速的读写 . 在典型的大数据服务器场中，数据库是最大的瓶颈，数据库引擎比应用程序和前端服务器更少......它们更昂贵但功能更强大，硬盘空间也比较便宜 . 规范化来自于试图节省空间的概念，但它带来了使数据库执行复杂连接和验证关系完整性，执行级联操作的成本 . 如果他们正确地设计了数据库，所有这些都会让开发人员感到头疼 .

对于NoSQL，如果您认为冗余和存储空间不是问题，因为它们的成本（执行更新所需的处理器时间和存储额外数据的硬盘驱动器成本），非规范化不是问题（对于嵌入式阵列而言）数十万个项目可能是性能问题，但大多数时候这不是问题） . 此外，您将为每个数据库集群提供多个应用程序和前端服务器 . 让他们完成连接的繁重工作，让数据库服务器坚持读写 .

TL; DR：对于一些很好的例子你是什么're doing is fine, and there are other ways of doing it. Check out the mongodb documentation'的数据模型模式 . http://docs.mongodb.org/manual/data-modeling/

回复于 2024-05-10T09:09:46+08:00
0

您可以使用Postgres的mongo_fdw运行SQL查询，包括使用MongoDB加入 .

回复于 2024-05-10T09:09:46+08:00

我们可以使用mongoDB子查询合并两个集合 . 这是一个例子，评论 -

`db.commentss.insert([
  { uid:12345, pid:444, comment:"blah" },
  { uid:12345, pid:888, comment:"asdf" },
  { uid:99999, pid:444, comment:"qwer" }])`

Userss--

db.userss.insert([
  { uid:12345, name:"john" },
  { uid:99999, name:"mia"  }])

用于JOIN的MongoDB子查询 -

`db.commentss.find().forEach(
    function (newComments) {
        newComments.userss = db.userss.find( { "uid": newComments.uid } ).toArray();
        db.newCommentUsers.insert(newComments);
    }
);`

从新生成的Collection中获取结果 -

db.newCommentUsers.find().pretty()

结果 -

`{
    "_id" : ObjectId("5511236e29709afa03f226ef"),
    "uid" : 12345,
    "pid" : 444,
    "comment" : "blah",
    "userss" : [
        {
            "_id" : ObjectId("5511238129709afa03f226f2"),
            "uid" : 12345,
            "name" : "john"
        }
    ]
}
{
    "_id" : ObjectId("5511236e29709afa03f226f0"),
    "uid" : 12345,
    "pid" : 888,
    "comment" : "asdf",
    "userss" : [
        {
            "_id" : ObjectId("5511238129709afa03f226f2"),
            "uid" : 12345,
            "name" : "john"
        }
    ]
}
{
    "_id" : ObjectId("5511236e29709afa03f226f1"),
    "uid" : 99999,
    "pid" : 444,
    "comment" : "qwer",
    "userss" : [
        {
            "_id" : ObjectId("5511238129709afa03f226f3"),
            "uid" : 99999,
            "name" : "mia"
        }
    ]
}`

希望这会有所帮助 .

回复于 2024-05-10T09:09:46+08:00

7
正如其他人已经指出你试图从无关系数据库创建一个关系数据库，你真的不想这样做，但是如果你有一个案例你必须这样做，这是一个你可以使用的解决方案 . 我们首先在集合A上找到一个foreach查找（或在你的用例中用户），然后我们将每个项目作为一个对象，然后我们使用对象属性（在你的情况下为uid）在我们的第二个集合中查找（在你的案例评论中）如果我们可以找到它然后我们有一个匹配，我们可以打印或做一些事情 . 希望这能帮助你，祝你好运:)
```
db.users.find().forEach(
function (object) {
    var commonInBoth=db.comments.findOne({ "uid": object.uid} );
    if (commonInBoth != null) {
        printjson(commonInBoth) ;
        printjson(object) ;
    }else {
        // did not match so we don't care in this case
    }
});
```
回复于 2024-05-10T09:09:46+08:00

MongoDB不允许连接，但您可以使用插件来处理那 . 检查mongo-join插件 . 这是最好的，我已经使用过它 . 您可以像这样使用npm直接安装它 npm install mongo-join . 你可以看看full documentation with examples .

（）当我们需要加入（N）集合时真正有用的工具

（ - ）我们可以在查询的顶层应用条件

Example

var Join = require('mongo-join').Join, mongodb = require('mongodb'), Db = mongodb.Db, Server = mongodb.Server;
db.open(function (err, Database) {
    Database.collection('Appoint', function (err, Appoints) {

        /* we can put conditions just on the top level */
        Appoints.find({_id_Doctor: id_doctor ,full_date :{ $gte: start_date },
            full_date :{ $lte: end_date }}, function (err, cursor) {
            var join = new Join(Database).on({
                field: '_id_Doctor', // <- field in Appoints document
                to: '_id',         // <- field in User doc. treated as ObjectID automatically.
                from: 'User'  // <- collection name for User doc
            }).on({
                field: '_id_Patient', // <- field in Appoints doc
                to: '_id',         // <- field in User doc. treated as ObjectID automatically.
                from: 'User'  // <- collection name for User doc
            })
            join.toArray(cursor, function (err, joinedDocs) {

                /* do what ever you want here */
                /* you can fetch the table and apply your own conditions */
                .....
                .....
                .....


                resp.status(200);
                resp.json({
                    "status": 200,
                    "message": "success",
                    "Appoints_Range": joinedDocs,


                });
                return resp;


            });

    });

回复于 2024-05-10T09:09:46+08:00

35
您可以使用聚合管道来完成它，但是自己编写它会很痛苦 .

您可以使用mongo-join-query从查询中自动创建聚合管道 .

这是您的查询的样子：
```
const mongoose = require("mongoose");
const joinQuery = require("mongo-join-query");

joinQuery(
    mongoose.models.Comment,
    {
        find: { pid:444 },
        populate: ["uid"]
    },
    (err, res) => (err ? console.log("Error:", err) : console.log("Success:", res.results))
);
```
您的结果将在 uid 字段中包含用户对象，您可以根据需要链接多个级别 . 您可以填充对用户的引用，该引用引用了一个Team，它引用了其他内容等 .

Disclaimer ：我写了 mongo-join-query 来解决这个确切的问题 .
回复于 2024-05-10T09:09:46+08:00

如何在MongoDB中执行SQL Join等效操作？

19 回答

$ lookup（聚合）

平等比赛

我认为，如果您需要规范化数据表 - 您需要尝试其他一些数据库解决方案 .

问题

解决方案

相关问题