$lookup multiple levels without $unwind?












4















I have following collections



venue collection



{
"_id" : ObjectId("5acdb8f65ea63a27c1facf86"),
"name" : "ASA College - Manhattan Campus",
"addedBy" : ObjectId("5ac8ba3582c2345af70d4658"),
"reviews" : [
ObjectId("5acdb8f65ea63a27c1facf8b"),
ObjectId("5ad8288ccdd9241781dce698")
]
}


reviews collection



{
"_id" : ObjectId("5acdb8f65ea63a27c1facf8b"),
"createdAt" : ISODate("2018-04-07T12:31:49.503Z"),
"venue" : ObjectId("5acdb8f65ea63a27c1facf86"),
"author" : ObjectId("5ac8ba3582c2345af70d4658"),
"content" : "nice place",
"comments" : [
ObjectId("5ad87113882d445c5cbc92c8")
],
}


comment collection



{
"_id" : ObjectId("5ad87113882d445c5cbc92c8"),
"author" : ObjectId("5ac8ba3582c2345af70d4658"),
"comment" : "dcfdsfdcfdsfdcfdsfdcfdsfdcfdsfdcfdsfdcfdsfdcfdsf",
"review" : ObjectId("5acdb8f65ea63a27c1facf8b"),
"__v" : 0
}


author collection



{
"_id" : ObjectId("5ac8ba3582c2345af70d4658"),
"firstName" : "Bruce",
"lastName" : "Wayne",
"email" : "bruce@linkites.com",
"followers" : [ObjectId("5ac8b91482c2345af70d4650")]
}


Now My following populate query works fine



    const venues = await Venue.findOne({ _id: id.id })
.populate({
path: 'reviews',
options: { sort: { createdAt: -1 } },
populate: [
{ path: 'author' },
{ path: 'comments', populate: [{ path: 'author' }] }
]
})


But I want to achieve it with $lookup query but it splits the venue when I am doing '$unwind' to the reviews... I want reviews in same array (like populate) and in same order...



I want to achieve following query with $lookup because author have followers field so I need to send field isFollow by doing $project which cannot be done using populate...



$project: {
isFollow: { $in: [mongoose.Types.ObjectId(req.user.id), '$followers'] }
}









share|improve this question




















  • 1





    $lookup won't replace objects correctly more than one level deep. You can somewhat "ridiculously" do this with MongoDB 3.6, though I personally don't find at as intuitive. Generally the process will be to $unwind and then reconstruct back to arrays using $group.

    – Neil Lunn
    Apr 21 '18 at 8:27











  • @NeilLunn My main concern is with isFollow key... can I send that key using populate... Or is there any other way to do it?

    – Anthony Winzlet
    Apr 21 '18 at 8:54


















4















I have following collections



venue collection



{
"_id" : ObjectId("5acdb8f65ea63a27c1facf86"),
"name" : "ASA College - Manhattan Campus",
"addedBy" : ObjectId("5ac8ba3582c2345af70d4658"),
"reviews" : [
ObjectId("5acdb8f65ea63a27c1facf8b"),
ObjectId("5ad8288ccdd9241781dce698")
]
}


reviews collection



{
"_id" : ObjectId("5acdb8f65ea63a27c1facf8b"),
"createdAt" : ISODate("2018-04-07T12:31:49.503Z"),
"venue" : ObjectId("5acdb8f65ea63a27c1facf86"),
"author" : ObjectId("5ac8ba3582c2345af70d4658"),
"content" : "nice place",
"comments" : [
ObjectId("5ad87113882d445c5cbc92c8")
],
}


comment collection



{
"_id" : ObjectId("5ad87113882d445c5cbc92c8"),
"author" : ObjectId("5ac8ba3582c2345af70d4658"),
"comment" : "dcfdsfdcfdsfdcfdsfdcfdsfdcfdsfdcfdsfdcfdsfdcfdsf",
"review" : ObjectId("5acdb8f65ea63a27c1facf8b"),
"__v" : 0
}


author collection



{
"_id" : ObjectId("5ac8ba3582c2345af70d4658"),
"firstName" : "Bruce",
"lastName" : "Wayne",
"email" : "bruce@linkites.com",
"followers" : [ObjectId("5ac8b91482c2345af70d4650")]
}


Now My following populate query works fine



    const venues = await Venue.findOne({ _id: id.id })
.populate({
path: 'reviews',
options: { sort: { createdAt: -1 } },
populate: [
{ path: 'author' },
{ path: 'comments', populate: [{ path: 'author' }] }
]
})


But I want to achieve it with $lookup query but it splits the venue when I am doing '$unwind' to the reviews... I want reviews in same array (like populate) and in same order...



I want to achieve following query with $lookup because author have followers field so I need to send field isFollow by doing $project which cannot be done using populate...



$project: {
isFollow: { $in: [mongoose.Types.ObjectId(req.user.id), '$followers'] }
}









share|improve this question




















  • 1





    $lookup won't replace objects correctly more than one level deep. You can somewhat "ridiculously" do this with MongoDB 3.6, though I personally don't find at as intuitive. Generally the process will be to $unwind and then reconstruct back to arrays using $group.

    – Neil Lunn
    Apr 21 '18 at 8:27











  • @NeilLunn My main concern is with isFollow key... can I send that key using populate... Or is there any other way to do it?

    – Anthony Winzlet
    Apr 21 '18 at 8:54
















4












4








4


6






I have following collections



venue collection



{
"_id" : ObjectId("5acdb8f65ea63a27c1facf86"),
"name" : "ASA College - Manhattan Campus",
"addedBy" : ObjectId("5ac8ba3582c2345af70d4658"),
"reviews" : [
ObjectId("5acdb8f65ea63a27c1facf8b"),
ObjectId("5ad8288ccdd9241781dce698")
]
}


reviews collection



{
"_id" : ObjectId("5acdb8f65ea63a27c1facf8b"),
"createdAt" : ISODate("2018-04-07T12:31:49.503Z"),
"venue" : ObjectId("5acdb8f65ea63a27c1facf86"),
"author" : ObjectId("5ac8ba3582c2345af70d4658"),
"content" : "nice place",
"comments" : [
ObjectId("5ad87113882d445c5cbc92c8")
],
}


comment collection



{
"_id" : ObjectId("5ad87113882d445c5cbc92c8"),
"author" : ObjectId("5ac8ba3582c2345af70d4658"),
"comment" : "dcfdsfdcfdsfdcfdsfdcfdsfdcfdsfdcfdsfdcfdsfdcfdsf",
"review" : ObjectId("5acdb8f65ea63a27c1facf8b"),
"__v" : 0
}


author collection



{
"_id" : ObjectId("5ac8ba3582c2345af70d4658"),
"firstName" : "Bruce",
"lastName" : "Wayne",
"email" : "bruce@linkites.com",
"followers" : [ObjectId("5ac8b91482c2345af70d4650")]
}


Now My following populate query works fine



    const venues = await Venue.findOne({ _id: id.id })
.populate({
path: 'reviews',
options: { sort: { createdAt: -1 } },
populate: [
{ path: 'author' },
{ path: 'comments', populate: [{ path: 'author' }] }
]
})


But I want to achieve it with $lookup query but it splits the venue when I am doing '$unwind' to the reviews... I want reviews in same array (like populate) and in same order...



I want to achieve following query with $lookup because author have followers field so I need to send field isFollow by doing $project which cannot be done using populate...



$project: {
isFollow: { $in: [mongoose.Types.ObjectId(req.user.id), '$followers'] }
}









share|improve this question
















I have following collections



venue collection



{
"_id" : ObjectId("5acdb8f65ea63a27c1facf86"),
"name" : "ASA College - Manhattan Campus",
"addedBy" : ObjectId("5ac8ba3582c2345af70d4658"),
"reviews" : [
ObjectId("5acdb8f65ea63a27c1facf8b"),
ObjectId("5ad8288ccdd9241781dce698")
]
}


reviews collection



{
"_id" : ObjectId("5acdb8f65ea63a27c1facf8b"),
"createdAt" : ISODate("2018-04-07T12:31:49.503Z"),
"venue" : ObjectId("5acdb8f65ea63a27c1facf86"),
"author" : ObjectId("5ac8ba3582c2345af70d4658"),
"content" : "nice place",
"comments" : [
ObjectId("5ad87113882d445c5cbc92c8")
],
}


comment collection



{
"_id" : ObjectId("5ad87113882d445c5cbc92c8"),
"author" : ObjectId("5ac8ba3582c2345af70d4658"),
"comment" : "dcfdsfdcfdsfdcfdsfdcfdsfdcfdsfdcfdsfdcfdsfdcfdsf",
"review" : ObjectId("5acdb8f65ea63a27c1facf8b"),
"__v" : 0
}


author collection



{
"_id" : ObjectId("5ac8ba3582c2345af70d4658"),
"firstName" : "Bruce",
"lastName" : "Wayne",
"email" : "bruce@linkites.com",
"followers" : [ObjectId("5ac8b91482c2345af70d4650")]
}


Now My following populate query works fine



    const venues = await Venue.findOne({ _id: id.id })
.populate({
path: 'reviews',
options: { sort: { createdAt: -1 } },
populate: [
{ path: 'author' },
{ path: 'comments', populate: [{ path: 'author' }] }
]
})


But I want to achieve it with $lookup query but it splits the venue when I am doing '$unwind' to the reviews... I want reviews in same array (like populate) and in same order...



I want to achieve following query with $lookup because author have followers field so I need to send field isFollow by doing $project which cannot be done using populate...



$project: {
isFollow: { $in: [mongoose.Types.ObjectId(req.user.id), '$followers'] }
}






node.js mongodb mongoose mongodb-query aggregation-framework






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Dec 7 '18 at 9:19







Anthony Winzlet

















asked Apr 21 '18 at 8:20









Anthony WinzletAnthony Winzlet

17.7k42245




17.7k42245








  • 1





    $lookup won't replace objects correctly more than one level deep. You can somewhat "ridiculously" do this with MongoDB 3.6, though I personally don't find at as intuitive. Generally the process will be to $unwind and then reconstruct back to arrays using $group.

    – Neil Lunn
    Apr 21 '18 at 8:27











  • @NeilLunn My main concern is with isFollow key... can I send that key using populate... Or is there any other way to do it?

    – Anthony Winzlet
    Apr 21 '18 at 8:54
















  • 1





    $lookup won't replace objects correctly more than one level deep. You can somewhat "ridiculously" do this with MongoDB 3.6, though I personally don't find at as intuitive. Generally the process will be to $unwind and then reconstruct back to arrays using $group.

    – Neil Lunn
    Apr 21 '18 at 8:27











  • @NeilLunn My main concern is with isFollow key... can I send that key using populate... Or is there any other way to do it?

    – Anthony Winzlet
    Apr 21 '18 at 8:54










1




1





$lookup won't replace objects correctly more than one level deep. You can somewhat "ridiculously" do this with MongoDB 3.6, though I personally don't find at as intuitive. Generally the process will be to $unwind and then reconstruct back to arrays using $group.

– Neil Lunn
Apr 21 '18 at 8:27





$lookup won't replace objects correctly more than one level deep. You can somewhat "ridiculously" do this with MongoDB 3.6, though I personally don't find at as intuitive. Generally the process will be to $unwind and then reconstruct back to arrays using $group.

– Neil Lunn
Apr 21 '18 at 8:27













@NeilLunn My main concern is with isFollow key... can I send that key using populate... Or is there any other way to do it?

– Anthony Winzlet
Apr 21 '18 at 8:54







@NeilLunn My main concern is with isFollow key... can I send that key using populate... Or is there any other way to do it?

– Anthony Winzlet
Apr 21 '18 at 8:54














1 Answer
1






active

oldest

votes


















12














There are a couple of approaches of course depending on your available MongoDB version. These vary from different usages of $lookup through to enabling object manipulation on the .populate() result via .lean().



I do ask that you read the sections carefully, and be aware that all may not be as it seems when considering your implementation solution.



MongoDB 3.6, "nested" $lookup



With MongoDB 3.6 the $lookup operator gets the additional ability to include a pipeline expression as opposed to simply joining a "local" to "foreign" key value, what this means is you can essentially do each $lookup as "nested" within these pipeline expressions



Venue.aggregate([
{ "$match": { "_id": mongoose.Types.ObjectId(id.id) } },
{ "$lookup": {
"from": Review.collection.name,
"let": { "reviews": "$reviews" },
"pipeline": [
{ "$match": { "$expr": { "$in": [ "$_id", "$$reviews" ] } } },
{ "$lookup": {
"from": Comment.collection.name,
"let": { "comments": "$comments" },
"pipeline": [
{ "$match": { "$expr": { "$in": [ "$_id", "$$comments" ] } } },
{ "$lookup": {
"from": Author.collection.name,
"let": { "author": "$author" },
"pipeline": [
{ "$match": { "$expr": { "$eq": [ "$_id", "$$author" ] } } },
{ "$addFields": {
"isFollower": {
"$in": [
mongoose.Types.ObjectId(req.user.id),
"$followers"
]
}
}}
],
"as": "author"
}},
{ "$addFields": {
"author": { "$arrayElemAt": [ "$author", 0 ] }
}}
],
"as": "comments"
}},
{ "$sort": { "createdAt": -1 } }
],
"as": "reviews"
}},
])


This can be really quite powerful, as you see from the perspective of the original pipeline, it really only knows about adding content to the "reviews" array and then each subsequent "nested" pipeline expression also only ever sees it's "inner" elements from the join.



It is powerful and in some respects it may be a bit clearer as all field paths are relative to the nesting level, but it does start that indentation creep in the BSON structure, and you do need to be aware of whether you are matching to arrays or singular values in traversing the structure.



Note we can also do things here like "flattening the author property" as seen within the "comments" array entries. All $lookup target output may be an "array", but within a "sub-pipeline" we can re-shape that single element array into just a single value.



Standard MongoDB $lookup



Still keeping the "join on the server" you can actually do it with $lookup, but it just takes intermediate processing. This is the long standing approach with deconstructing an array with $unwind and the using $group stages to rebuild arrays:



Venue.aggregate([
{ "$match": { "_id": mongoose.Types.ObjectId(id.id) } },
{ "$lookup": {
"from": Review.collection.name,
"localField": "reviews",
"foreignField": "_id",
"as": "reviews"
}},
{ "$unwind": "$reviews" },
{ "$lookup": {
"from": Comment.collection.name,
"localField": "reviews.comments",
"foreignField": "_id",
"as": "reviews.comments",
}},
{ "$unwind": "$reviews.comments" },
{ "$lookup": {
"from": Author.collection.name,
"localField": "reviews.comments.author",
"foreignField": "_id",
"as": "reviews.comments.author"
}},
{ "$unwind": "$reviews.comments.author" },
{ "$addFields": {
"reviews.comments.author.isFollower": {
"$in": [
mongoose.Types.ObjectId(req.user.id),
"$reviews.comments.author.followers"
]
}
}},
{ "$group": {
"_id": {
"_id": "$_id",
"reviewId": "$review._id"
},
"name": { "$first": "$name" },
"addedBy": { "$first": "$addedBy" },
"review": {
"$first": {
"_id": "$review._id",
"createdAt": "$review.createdAt",
"venue": "$review.venue",
"author": "$review.author",
"content": "$review.content"
}
},
"comments": { "$push": "$reviews.comments" }
}},
{ "$sort": { "_id._id": 1, "review.createdAt": -1 } },
{ "$group": {
"_id": "$_id._id",
"name": { "$first": "$name" },
"addedBy": { "$first": "$addedBy" },
"reviews": {
"$push": {
"_id": "$review._id",
"venue": "$review.venue",
"author": "$review.author",
"content": "$review.content",
"comments": "$comments"
}
}
}}
])


This really is not as daunting as you might think at first and follows a simple pattern of $lookup and $unwind as you progress through each array.



The "author" detail of course is singular, so once that is "unwound" you simply want to leave it that way, make the field addition and start the process of "rolling back" into the arrays.



There are only two levels to reconstruct back to the original Venue document, so the first detail level is by Review to rebuild the "comments" array. All you need to is to $push the path of "$reviews.comments" in order to collect these, and as long as the "$reviews._id" field is in the "grouping _id" the only other things you need to keep are all the other fields. You can put all of these into the _id as well, or you can use $first.



With that done there is only one more $group stage in order to get back to Venue itself. This time the grouping key is "$_id" of course, with all properties of the venue itself using $first and the remaining "$review" details going back into an array with $push. Of course the "$comments" output from the previous $group becomes the "review.comments" path.



Working on a single document and it's relations, this is not really so bad. The $unwind pipeline operator can generally be a performance issue, but in the context of this usage it should not really cause that much of an impact.



Since the data is still being "joined on the server" there is still far less traffic than the other remaining alternative.



JavaScript Manipulation



Of course the other case here is that instead of changing data on the server itself, you actually manipulate the result. In most cases I would be in favor of this approach since any "additions" to the data are probably best handled on the client.



The problem of course with using populate() is that whilst it may 'look like' a much more simplified process, it is in fact NOT A JOIN in any way. All populate() actually does is "hide" the underlying process of submitting multiple queries to the database, and then awaiting the results through async handling.



So the "appearance" of a join is actually the result of multiple requests to the server and then doing "client side manipulation" of the data to embed the details within arrays.



So aside from that clear warning that the performance characteristics are nowhere close to being on par with a server $lookup, the other caveat is of course that the "mongoose Documents" in the result are not actually plain JavaScript objects subject to further manipulation.



So in order to take this approach, you need to add the .lean() method to the query before execution, in order to instruct mongoose to return "plain JavaScript objects" instead of Document types which are cast with schema methods attached to the model. Noting of course that the resulting data no longer has access to any "instance methods" that would otherwise be associated with the related models themselves:



let venue = await Venue.findOne({ _id: id.id })
.populate({
path: 'reviews',
options: { sort: { createdAt: -1 } },
populate: [
{ path: 'comments', populate: [{ path: 'author' }] }
]
})
.lean();


Now venue is a plain object, we can simply process and adjust as needed:



venue.reviews = venue.reviews.map( r => 
({
...r,
comments: r.comments.map( c =>
({
...c,
author: {
...c.author,
isAuthor: c.author.followers.map( f => f.toString() ).indexOf(req.user.id) != -1
}
})
)
})
);


So it's really just a matter of cycling through each of the inner arrays down until the level where you can see the followers array within the author details. The comparison then can be made against the ObjectId values stored in that array after first using .map() to return the "string" values for comparison against the req.user.id which is also a string (if it is not, then also add .toString() on that ), since it is easier in general to compare these values in this way via JavaScript code.



Again though I need to stress that it "looks simple" but it is in fact the sort of thing you really want to avoid for system performance, as those additional queries and the transfer between the server and the client cost a lot in time of processing and even due to the request overhead this adds up to real costs in transport between hosting providers.





Summary



Those are basically your approaches you can take, short of "rolling your own" where you actually perform the "multiple queries" to the database yourself instead of using the helper that .populate() is.



Using the populate output, you can then simply manipulate the data in result just like any other data structure, as long as you apply .lean() to the query to convert or otherwise extract the plain object data from the mongoose documents returned.



Whilst the aggregate approaches look far more involved, there are "a lot" more advantages to doing this work on the server. Larger result sets can be sorted, calculations can be done for further filtering, and of course you get a "single response" to a "single request" made to the server, all with no additional overhead.



It is totally arguable that the pipelines themselves could simply be constructed based on attributes already stored on the schema. So writing your own method to perform this "construction" based on the attached schema should not be too difficult.



In the longer term of course $lookup is the better solution, but you'll probably need to put a little more work into the initial coding, if of course you don't just simply copy from what is listed here ;)






share|improve this answer
























  • Perfect man... You are magician... It took 3 days to properly understand the query... I think after nested lookup the populate queries will become extinct from mongodb... Thanks man .... Awesome

    – Anthony Winzlet
    Apr 23 '18 at 16:46








  • 1





    @AshishChoudhary If you have a new question then please Ask a New Question instead of asking for more details on an existing answer. That is how questions work on this site. Of course aggregation pipelines have $project which you can basically use anywhere and especially where $addFields is used in example here as that pipeline stage "adds" to the existing fields where with $project you specify explicitly which fields to return. If you need to ask more about that then you know what to do.

    – Neil Lunn
    May 9 '18 at 8:15











  • Awesome answer :)

    – Petrov
    Oct 27 '18 at 19:36











Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f49953780%2flookup-multiple-levels-without-unwind%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









12














There are a couple of approaches of course depending on your available MongoDB version. These vary from different usages of $lookup through to enabling object manipulation on the .populate() result via .lean().



I do ask that you read the sections carefully, and be aware that all may not be as it seems when considering your implementation solution.



MongoDB 3.6, "nested" $lookup



With MongoDB 3.6 the $lookup operator gets the additional ability to include a pipeline expression as opposed to simply joining a "local" to "foreign" key value, what this means is you can essentially do each $lookup as "nested" within these pipeline expressions



Venue.aggregate([
{ "$match": { "_id": mongoose.Types.ObjectId(id.id) } },
{ "$lookup": {
"from": Review.collection.name,
"let": { "reviews": "$reviews" },
"pipeline": [
{ "$match": { "$expr": { "$in": [ "$_id", "$$reviews" ] } } },
{ "$lookup": {
"from": Comment.collection.name,
"let": { "comments": "$comments" },
"pipeline": [
{ "$match": { "$expr": { "$in": [ "$_id", "$$comments" ] } } },
{ "$lookup": {
"from": Author.collection.name,
"let": { "author": "$author" },
"pipeline": [
{ "$match": { "$expr": { "$eq": [ "$_id", "$$author" ] } } },
{ "$addFields": {
"isFollower": {
"$in": [
mongoose.Types.ObjectId(req.user.id),
"$followers"
]
}
}}
],
"as": "author"
}},
{ "$addFields": {
"author": { "$arrayElemAt": [ "$author", 0 ] }
}}
],
"as": "comments"
}},
{ "$sort": { "createdAt": -1 } }
],
"as": "reviews"
}},
])


This can be really quite powerful, as you see from the perspective of the original pipeline, it really only knows about adding content to the "reviews" array and then each subsequent "nested" pipeline expression also only ever sees it's "inner" elements from the join.



It is powerful and in some respects it may be a bit clearer as all field paths are relative to the nesting level, but it does start that indentation creep in the BSON structure, and you do need to be aware of whether you are matching to arrays or singular values in traversing the structure.



Note we can also do things here like "flattening the author property" as seen within the "comments" array entries. All $lookup target output may be an "array", but within a "sub-pipeline" we can re-shape that single element array into just a single value.



Standard MongoDB $lookup



Still keeping the "join on the server" you can actually do it with $lookup, but it just takes intermediate processing. This is the long standing approach with deconstructing an array with $unwind and the using $group stages to rebuild arrays:



Venue.aggregate([
{ "$match": { "_id": mongoose.Types.ObjectId(id.id) } },
{ "$lookup": {
"from": Review.collection.name,
"localField": "reviews",
"foreignField": "_id",
"as": "reviews"
}},
{ "$unwind": "$reviews" },
{ "$lookup": {
"from": Comment.collection.name,
"localField": "reviews.comments",
"foreignField": "_id",
"as": "reviews.comments",
}},
{ "$unwind": "$reviews.comments" },
{ "$lookup": {
"from": Author.collection.name,
"localField": "reviews.comments.author",
"foreignField": "_id",
"as": "reviews.comments.author"
}},
{ "$unwind": "$reviews.comments.author" },
{ "$addFields": {
"reviews.comments.author.isFollower": {
"$in": [
mongoose.Types.ObjectId(req.user.id),
"$reviews.comments.author.followers"
]
}
}},
{ "$group": {
"_id": {
"_id": "$_id",
"reviewId": "$review._id"
},
"name": { "$first": "$name" },
"addedBy": { "$first": "$addedBy" },
"review": {
"$first": {
"_id": "$review._id",
"createdAt": "$review.createdAt",
"venue": "$review.venue",
"author": "$review.author",
"content": "$review.content"
}
},
"comments": { "$push": "$reviews.comments" }
}},
{ "$sort": { "_id._id": 1, "review.createdAt": -1 } },
{ "$group": {
"_id": "$_id._id",
"name": { "$first": "$name" },
"addedBy": { "$first": "$addedBy" },
"reviews": {
"$push": {
"_id": "$review._id",
"venue": "$review.venue",
"author": "$review.author",
"content": "$review.content",
"comments": "$comments"
}
}
}}
])


This really is not as daunting as you might think at first and follows a simple pattern of $lookup and $unwind as you progress through each array.



The "author" detail of course is singular, so once that is "unwound" you simply want to leave it that way, make the field addition and start the process of "rolling back" into the arrays.



There are only two levels to reconstruct back to the original Venue document, so the first detail level is by Review to rebuild the "comments" array. All you need to is to $push the path of "$reviews.comments" in order to collect these, and as long as the "$reviews._id" field is in the "grouping _id" the only other things you need to keep are all the other fields. You can put all of these into the _id as well, or you can use $first.



With that done there is only one more $group stage in order to get back to Venue itself. This time the grouping key is "$_id" of course, with all properties of the venue itself using $first and the remaining "$review" details going back into an array with $push. Of course the "$comments" output from the previous $group becomes the "review.comments" path.



Working on a single document and it's relations, this is not really so bad. The $unwind pipeline operator can generally be a performance issue, but in the context of this usage it should not really cause that much of an impact.



Since the data is still being "joined on the server" there is still far less traffic than the other remaining alternative.



JavaScript Manipulation



Of course the other case here is that instead of changing data on the server itself, you actually manipulate the result. In most cases I would be in favor of this approach since any "additions" to the data are probably best handled on the client.



The problem of course with using populate() is that whilst it may 'look like' a much more simplified process, it is in fact NOT A JOIN in any way. All populate() actually does is "hide" the underlying process of submitting multiple queries to the database, and then awaiting the results through async handling.



So the "appearance" of a join is actually the result of multiple requests to the server and then doing "client side manipulation" of the data to embed the details within arrays.



So aside from that clear warning that the performance characteristics are nowhere close to being on par with a server $lookup, the other caveat is of course that the "mongoose Documents" in the result are not actually plain JavaScript objects subject to further manipulation.



So in order to take this approach, you need to add the .lean() method to the query before execution, in order to instruct mongoose to return "plain JavaScript objects" instead of Document types which are cast with schema methods attached to the model. Noting of course that the resulting data no longer has access to any "instance methods" that would otherwise be associated with the related models themselves:



let venue = await Venue.findOne({ _id: id.id })
.populate({
path: 'reviews',
options: { sort: { createdAt: -1 } },
populate: [
{ path: 'comments', populate: [{ path: 'author' }] }
]
})
.lean();


Now venue is a plain object, we can simply process and adjust as needed:



venue.reviews = venue.reviews.map( r => 
({
...r,
comments: r.comments.map( c =>
({
...c,
author: {
...c.author,
isAuthor: c.author.followers.map( f => f.toString() ).indexOf(req.user.id) != -1
}
})
)
})
);


So it's really just a matter of cycling through each of the inner arrays down until the level where you can see the followers array within the author details. The comparison then can be made against the ObjectId values stored in that array after first using .map() to return the "string" values for comparison against the req.user.id which is also a string (if it is not, then also add .toString() on that ), since it is easier in general to compare these values in this way via JavaScript code.



Again though I need to stress that it "looks simple" but it is in fact the sort of thing you really want to avoid for system performance, as those additional queries and the transfer between the server and the client cost a lot in time of processing and even due to the request overhead this adds up to real costs in transport between hosting providers.





Summary



Those are basically your approaches you can take, short of "rolling your own" where you actually perform the "multiple queries" to the database yourself instead of using the helper that .populate() is.



Using the populate output, you can then simply manipulate the data in result just like any other data structure, as long as you apply .lean() to the query to convert or otherwise extract the plain object data from the mongoose documents returned.



Whilst the aggregate approaches look far more involved, there are "a lot" more advantages to doing this work on the server. Larger result sets can be sorted, calculations can be done for further filtering, and of course you get a "single response" to a "single request" made to the server, all with no additional overhead.



It is totally arguable that the pipelines themselves could simply be constructed based on attributes already stored on the schema. So writing your own method to perform this "construction" based on the attached schema should not be too difficult.



In the longer term of course $lookup is the better solution, but you'll probably need to put a little more work into the initial coding, if of course you don't just simply copy from what is listed here ;)






share|improve this answer
























  • Perfect man... You are magician... It took 3 days to properly understand the query... I think after nested lookup the populate queries will become extinct from mongodb... Thanks man .... Awesome

    – Anthony Winzlet
    Apr 23 '18 at 16:46








  • 1





    @AshishChoudhary If you have a new question then please Ask a New Question instead of asking for more details on an existing answer. That is how questions work on this site. Of course aggregation pipelines have $project which you can basically use anywhere and especially where $addFields is used in example here as that pipeline stage "adds" to the existing fields where with $project you specify explicitly which fields to return. If you need to ask more about that then you know what to do.

    – Neil Lunn
    May 9 '18 at 8:15











  • Awesome answer :)

    – Petrov
    Oct 27 '18 at 19:36
















12














There are a couple of approaches of course depending on your available MongoDB version. These vary from different usages of $lookup through to enabling object manipulation on the .populate() result via .lean().



I do ask that you read the sections carefully, and be aware that all may not be as it seems when considering your implementation solution.



MongoDB 3.6, "nested" $lookup



With MongoDB 3.6 the $lookup operator gets the additional ability to include a pipeline expression as opposed to simply joining a "local" to "foreign" key value, what this means is you can essentially do each $lookup as "nested" within these pipeline expressions



Venue.aggregate([
{ "$match": { "_id": mongoose.Types.ObjectId(id.id) } },
{ "$lookup": {
"from": Review.collection.name,
"let": { "reviews": "$reviews" },
"pipeline": [
{ "$match": { "$expr": { "$in": [ "$_id", "$$reviews" ] } } },
{ "$lookup": {
"from": Comment.collection.name,
"let": { "comments": "$comments" },
"pipeline": [
{ "$match": { "$expr": { "$in": [ "$_id", "$$comments" ] } } },
{ "$lookup": {
"from": Author.collection.name,
"let": { "author": "$author" },
"pipeline": [
{ "$match": { "$expr": { "$eq": [ "$_id", "$$author" ] } } },
{ "$addFields": {
"isFollower": {
"$in": [
mongoose.Types.ObjectId(req.user.id),
"$followers"
]
}
}}
],
"as": "author"
}},
{ "$addFields": {
"author": { "$arrayElemAt": [ "$author", 0 ] }
}}
],
"as": "comments"
}},
{ "$sort": { "createdAt": -1 } }
],
"as": "reviews"
}},
])


This can be really quite powerful, as you see from the perspective of the original pipeline, it really only knows about adding content to the "reviews" array and then each subsequent "nested" pipeline expression also only ever sees it's "inner" elements from the join.



It is powerful and in some respects it may be a bit clearer as all field paths are relative to the nesting level, but it does start that indentation creep in the BSON structure, and you do need to be aware of whether you are matching to arrays or singular values in traversing the structure.



Note we can also do things here like "flattening the author property" as seen within the "comments" array entries. All $lookup target output may be an "array", but within a "sub-pipeline" we can re-shape that single element array into just a single value.



Standard MongoDB $lookup



Still keeping the "join on the server" you can actually do it with $lookup, but it just takes intermediate processing. This is the long standing approach with deconstructing an array with $unwind and the using $group stages to rebuild arrays:



Venue.aggregate([
{ "$match": { "_id": mongoose.Types.ObjectId(id.id) } },
{ "$lookup": {
"from": Review.collection.name,
"localField": "reviews",
"foreignField": "_id",
"as": "reviews"
}},
{ "$unwind": "$reviews" },
{ "$lookup": {
"from": Comment.collection.name,
"localField": "reviews.comments",
"foreignField": "_id",
"as": "reviews.comments",
}},
{ "$unwind": "$reviews.comments" },
{ "$lookup": {
"from": Author.collection.name,
"localField": "reviews.comments.author",
"foreignField": "_id",
"as": "reviews.comments.author"
}},
{ "$unwind": "$reviews.comments.author" },
{ "$addFields": {
"reviews.comments.author.isFollower": {
"$in": [
mongoose.Types.ObjectId(req.user.id),
"$reviews.comments.author.followers"
]
}
}},
{ "$group": {
"_id": {
"_id": "$_id",
"reviewId": "$review._id"
},
"name": { "$first": "$name" },
"addedBy": { "$first": "$addedBy" },
"review": {
"$first": {
"_id": "$review._id",
"createdAt": "$review.createdAt",
"venue": "$review.venue",
"author": "$review.author",
"content": "$review.content"
}
},
"comments": { "$push": "$reviews.comments" }
}},
{ "$sort": { "_id._id": 1, "review.createdAt": -1 } },
{ "$group": {
"_id": "$_id._id",
"name": { "$first": "$name" },
"addedBy": { "$first": "$addedBy" },
"reviews": {
"$push": {
"_id": "$review._id",
"venue": "$review.venue",
"author": "$review.author",
"content": "$review.content",
"comments": "$comments"
}
}
}}
])


This really is not as daunting as you might think at first and follows a simple pattern of $lookup and $unwind as you progress through each array.



The "author" detail of course is singular, so once that is "unwound" you simply want to leave it that way, make the field addition and start the process of "rolling back" into the arrays.



There are only two levels to reconstruct back to the original Venue document, so the first detail level is by Review to rebuild the "comments" array. All you need to is to $push the path of "$reviews.comments" in order to collect these, and as long as the "$reviews._id" field is in the "grouping _id" the only other things you need to keep are all the other fields. You can put all of these into the _id as well, or you can use $first.



With that done there is only one more $group stage in order to get back to Venue itself. This time the grouping key is "$_id" of course, with all properties of the venue itself using $first and the remaining "$review" details going back into an array with $push. Of course the "$comments" output from the previous $group becomes the "review.comments" path.



Working on a single document and it's relations, this is not really so bad. The $unwind pipeline operator can generally be a performance issue, but in the context of this usage it should not really cause that much of an impact.



Since the data is still being "joined on the server" there is still far less traffic than the other remaining alternative.



JavaScript Manipulation



Of course the other case here is that instead of changing data on the server itself, you actually manipulate the result. In most cases I would be in favor of this approach since any "additions" to the data are probably best handled on the client.



The problem of course with using populate() is that whilst it may 'look like' a much more simplified process, it is in fact NOT A JOIN in any way. All populate() actually does is "hide" the underlying process of submitting multiple queries to the database, and then awaiting the results through async handling.



So the "appearance" of a join is actually the result of multiple requests to the server and then doing "client side manipulation" of the data to embed the details within arrays.



So aside from that clear warning that the performance characteristics are nowhere close to being on par with a server $lookup, the other caveat is of course that the "mongoose Documents" in the result are not actually plain JavaScript objects subject to further manipulation.



So in order to take this approach, you need to add the .lean() method to the query before execution, in order to instruct mongoose to return "plain JavaScript objects" instead of Document types which are cast with schema methods attached to the model. Noting of course that the resulting data no longer has access to any "instance methods" that would otherwise be associated with the related models themselves:



let venue = await Venue.findOne({ _id: id.id })
.populate({
path: 'reviews',
options: { sort: { createdAt: -1 } },
populate: [
{ path: 'comments', populate: [{ path: 'author' }] }
]
})
.lean();


Now venue is a plain object, we can simply process and adjust as needed:



venue.reviews = venue.reviews.map( r => 
({
...r,
comments: r.comments.map( c =>
({
...c,
author: {
...c.author,
isAuthor: c.author.followers.map( f => f.toString() ).indexOf(req.user.id) != -1
}
})
)
})
);


So it's really just a matter of cycling through each of the inner arrays down until the level where you can see the followers array within the author details. The comparison then can be made against the ObjectId values stored in that array after first using .map() to return the "string" values for comparison against the req.user.id which is also a string (if it is not, then also add .toString() on that ), since it is easier in general to compare these values in this way via JavaScript code.



Again though I need to stress that it "looks simple" but it is in fact the sort of thing you really want to avoid for system performance, as those additional queries and the transfer between the server and the client cost a lot in time of processing and even due to the request overhead this adds up to real costs in transport between hosting providers.





Summary



Those are basically your approaches you can take, short of "rolling your own" where you actually perform the "multiple queries" to the database yourself instead of using the helper that .populate() is.



Using the populate output, you can then simply manipulate the data in result just like any other data structure, as long as you apply .lean() to the query to convert or otherwise extract the plain object data from the mongoose documents returned.



Whilst the aggregate approaches look far more involved, there are "a lot" more advantages to doing this work on the server. Larger result sets can be sorted, calculations can be done for further filtering, and of course you get a "single response" to a "single request" made to the server, all with no additional overhead.



It is totally arguable that the pipelines themselves could simply be constructed based on attributes already stored on the schema. So writing your own method to perform this "construction" based on the attached schema should not be too difficult.



In the longer term of course $lookup is the better solution, but you'll probably need to put a little more work into the initial coding, if of course you don't just simply copy from what is listed here ;)






share|improve this answer
























  • Perfect man... You are magician... It took 3 days to properly understand the query... I think after nested lookup the populate queries will become extinct from mongodb... Thanks man .... Awesome

    – Anthony Winzlet
    Apr 23 '18 at 16:46








  • 1





    @AshishChoudhary If you have a new question then please Ask a New Question instead of asking for more details on an existing answer. That is how questions work on this site. Of course aggregation pipelines have $project which you can basically use anywhere and especially where $addFields is used in example here as that pipeline stage "adds" to the existing fields where with $project you specify explicitly which fields to return. If you need to ask more about that then you know what to do.

    – Neil Lunn
    May 9 '18 at 8:15











  • Awesome answer :)

    – Petrov
    Oct 27 '18 at 19:36














12












12








12







There are a couple of approaches of course depending on your available MongoDB version. These vary from different usages of $lookup through to enabling object manipulation on the .populate() result via .lean().



I do ask that you read the sections carefully, and be aware that all may not be as it seems when considering your implementation solution.



MongoDB 3.6, "nested" $lookup



With MongoDB 3.6 the $lookup operator gets the additional ability to include a pipeline expression as opposed to simply joining a "local" to "foreign" key value, what this means is you can essentially do each $lookup as "nested" within these pipeline expressions



Venue.aggregate([
{ "$match": { "_id": mongoose.Types.ObjectId(id.id) } },
{ "$lookup": {
"from": Review.collection.name,
"let": { "reviews": "$reviews" },
"pipeline": [
{ "$match": { "$expr": { "$in": [ "$_id", "$$reviews" ] } } },
{ "$lookup": {
"from": Comment.collection.name,
"let": { "comments": "$comments" },
"pipeline": [
{ "$match": { "$expr": { "$in": [ "$_id", "$$comments" ] } } },
{ "$lookup": {
"from": Author.collection.name,
"let": { "author": "$author" },
"pipeline": [
{ "$match": { "$expr": { "$eq": [ "$_id", "$$author" ] } } },
{ "$addFields": {
"isFollower": {
"$in": [
mongoose.Types.ObjectId(req.user.id),
"$followers"
]
}
}}
],
"as": "author"
}},
{ "$addFields": {
"author": { "$arrayElemAt": [ "$author", 0 ] }
}}
],
"as": "comments"
}},
{ "$sort": { "createdAt": -1 } }
],
"as": "reviews"
}},
])


This can be really quite powerful, as you see from the perspective of the original pipeline, it really only knows about adding content to the "reviews" array and then each subsequent "nested" pipeline expression also only ever sees it's "inner" elements from the join.



It is powerful and in some respects it may be a bit clearer as all field paths are relative to the nesting level, but it does start that indentation creep in the BSON structure, and you do need to be aware of whether you are matching to arrays or singular values in traversing the structure.



Note we can also do things here like "flattening the author property" as seen within the "comments" array entries. All $lookup target output may be an "array", but within a "sub-pipeline" we can re-shape that single element array into just a single value.



Standard MongoDB $lookup



Still keeping the "join on the server" you can actually do it with $lookup, but it just takes intermediate processing. This is the long standing approach with deconstructing an array with $unwind and the using $group stages to rebuild arrays:



Venue.aggregate([
{ "$match": { "_id": mongoose.Types.ObjectId(id.id) } },
{ "$lookup": {
"from": Review.collection.name,
"localField": "reviews",
"foreignField": "_id",
"as": "reviews"
}},
{ "$unwind": "$reviews" },
{ "$lookup": {
"from": Comment.collection.name,
"localField": "reviews.comments",
"foreignField": "_id",
"as": "reviews.comments",
}},
{ "$unwind": "$reviews.comments" },
{ "$lookup": {
"from": Author.collection.name,
"localField": "reviews.comments.author",
"foreignField": "_id",
"as": "reviews.comments.author"
}},
{ "$unwind": "$reviews.comments.author" },
{ "$addFields": {
"reviews.comments.author.isFollower": {
"$in": [
mongoose.Types.ObjectId(req.user.id),
"$reviews.comments.author.followers"
]
}
}},
{ "$group": {
"_id": {
"_id": "$_id",
"reviewId": "$review._id"
},
"name": { "$first": "$name" },
"addedBy": { "$first": "$addedBy" },
"review": {
"$first": {
"_id": "$review._id",
"createdAt": "$review.createdAt",
"venue": "$review.venue",
"author": "$review.author",
"content": "$review.content"
}
},
"comments": { "$push": "$reviews.comments" }
}},
{ "$sort": { "_id._id": 1, "review.createdAt": -1 } },
{ "$group": {
"_id": "$_id._id",
"name": { "$first": "$name" },
"addedBy": { "$first": "$addedBy" },
"reviews": {
"$push": {
"_id": "$review._id",
"venue": "$review.venue",
"author": "$review.author",
"content": "$review.content",
"comments": "$comments"
}
}
}}
])


This really is not as daunting as you might think at first and follows a simple pattern of $lookup and $unwind as you progress through each array.



The "author" detail of course is singular, so once that is "unwound" you simply want to leave it that way, make the field addition and start the process of "rolling back" into the arrays.



There are only two levels to reconstruct back to the original Venue document, so the first detail level is by Review to rebuild the "comments" array. All you need to is to $push the path of "$reviews.comments" in order to collect these, and as long as the "$reviews._id" field is in the "grouping _id" the only other things you need to keep are all the other fields. You can put all of these into the _id as well, or you can use $first.



With that done there is only one more $group stage in order to get back to Venue itself. This time the grouping key is "$_id" of course, with all properties of the venue itself using $first and the remaining "$review" details going back into an array with $push. Of course the "$comments" output from the previous $group becomes the "review.comments" path.



Working on a single document and it's relations, this is not really so bad. The $unwind pipeline operator can generally be a performance issue, but in the context of this usage it should not really cause that much of an impact.



Since the data is still being "joined on the server" there is still far less traffic than the other remaining alternative.



JavaScript Manipulation



Of course the other case here is that instead of changing data on the server itself, you actually manipulate the result. In most cases I would be in favor of this approach since any "additions" to the data are probably best handled on the client.



The problem of course with using populate() is that whilst it may 'look like' a much more simplified process, it is in fact NOT A JOIN in any way. All populate() actually does is "hide" the underlying process of submitting multiple queries to the database, and then awaiting the results through async handling.



So the "appearance" of a join is actually the result of multiple requests to the server and then doing "client side manipulation" of the data to embed the details within arrays.



So aside from that clear warning that the performance characteristics are nowhere close to being on par with a server $lookup, the other caveat is of course that the "mongoose Documents" in the result are not actually plain JavaScript objects subject to further manipulation.



So in order to take this approach, you need to add the .lean() method to the query before execution, in order to instruct mongoose to return "plain JavaScript objects" instead of Document types which are cast with schema methods attached to the model. Noting of course that the resulting data no longer has access to any "instance methods" that would otherwise be associated with the related models themselves:



let venue = await Venue.findOne({ _id: id.id })
.populate({
path: 'reviews',
options: { sort: { createdAt: -1 } },
populate: [
{ path: 'comments', populate: [{ path: 'author' }] }
]
})
.lean();


Now venue is a plain object, we can simply process and adjust as needed:



venue.reviews = venue.reviews.map( r => 
({
...r,
comments: r.comments.map( c =>
({
...c,
author: {
...c.author,
isAuthor: c.author.followers.map( f => f.toString() ).indexOf(req.user.id) != -1
}
})
)
})
);


So it's really just a matter of cycling through each of the inner arrays down until the level where you can see the followers array within the author details. The comparison then can be made against the ObjectId values stored in that array after first using .map() to return the "string" values for comparison against the req.user.id which is also a string (if it is not, then also add .toString() on that ), since it is easier in general to compare these values in this way via JavaScript code.



Again though I need to stress that it "looks simple" but it is in fact the sort of thing you really want to avoid for system performance, as those additional queries and the transfer between the server and the client cost a lot in time of processing and even due to the request overhead this adds up to real costs in transport between hosting providers.





Summary



Those are basically your approaches you can take, short of "rolling your own" where you actually perform the "multiple queries" to the database yourself instead of using the helper that .populate() is.



Using the populate output, you can then simply manipulate the data in result just like any other data structure, as long as you apply .lean() to the query to convert or otherwise extract the plain object data from the mongoose documents returned.



Whilst the aggregate approaches look far more involved, there are "a lot" more advantages to doing this work on the server. Larger result sets can be sorted, calculations can be done for further filtering, and of course you get a "single response" to a "single request" made to the server, all with no additional overhead.



It is totally arguable that the pipelines themselves could simply be constructed based on attributes already stored on the schema. So writing your own method to perform this "construction" based on the attached schema should not be too difficult.



In the longer term of course $lookup is the better solution, but you'll probably need to put a little more work into the initial coding, if of course you don't just simply copy from what is listed here ;)






share|improve this answer













There are a couple of approaches of course depending on your available MongoDB version. These vary from different usages of $lookup through to enabling object manipulation on the .populate() result via .lean().



I do ask that you read the sections carefully, and be aware that all may not be as it seems when considering your implementation solution.



MongoDB 3.6, "nested" $lookup



With MongoDB 3.6 the $lookup operator gets the additional ability to include a pipeline expression as opposed to simply joining a "local" to "foreign" key value, what this means is you can essentially do each $lookup as "nested" within these pipeline expressions



Venue.aggregate([
{ "$match": { "_id": mongoose.Types.ObjectId(id.id) } },
{ "$lookup": {
"from": Review.collection.name,
"let": { "reviews": "$reviews" },
"pipeline": [
{ "$match": { "$expr": { "$in": [ "$_id", "$$reviews" ] } } },
{ "$lookup": {
"from": Comment.collection.name,
"let": { "comments": "$comments" },
"pipeline": [
{ "$match": { "$expr": { "$in": [ "$_id", "$$comments" ] } } },
{ "$lookup": {
"from": Author.collection.name,
"let": { "author": "$author" },
"pipeline": [
{ "$match": { "$expr": { "$eq": [ "$_id", "$$author" ] } } },
{ "$addFields": {
"isFollower": {
"$in": [
mongoose.Types.ObjectId(req.user.id),
"$followers"
]
}
}}
],
"as": "author"
}},
{ "$addFields": {
"author": { "$arrayElemAt": [ "$author", 0 ] }
}}
],
"as": "comments"
}},
{ "$sort": { "createdAt": -1 } }
],
"as": "reviews"
}},
])


This can be really quite powerful, as you see from the perspective of the original pipeline, it really only knows about adding content to the "reviews" array and then each subsequent "nested" pipeline expression also only ever sees it's "inner" elements from the join.



It is powerful and in some respects it may be a bit clearer as all field paths are relative to the nesting level, but it does start that indentation creep in the BSON structure, and you do need to be aware of whether you are matching to arrays or singular values in traversing the structure.



Note we can also do things here like "flattening the author property" as seen within the "comments" array entries. All $lookup target output may be an "array", but within a "sub-pipeline" we can re-shape that single element array into just a single value.



Standard MongoDB $lookup



Still keeping the "join on the server" you can actually do it with $lookup, but it just takes intermediate processing. This is the long standing approach with deconstructing an array with $unwind and the using $group stages to rebuild arrays:



Venue.aggregate([
{ "$match": { "_id": mongoose.Types.ObjectId(id.id) } },
{ "$lookup": {
"from": Review.collection.name,
"localField": "reviews",
"foreignField": "_id",
"as": "reviews"
}},
{ "$unwind": "$reviews" },
{ "$lookup": {
"from": Comment.collection.name,
"localField": "reviews.comments",
"foreignField": "_id",
"as": "reviews.comments",
}},
{ "$unwind": "$reviews.comments" },
{ "$lookup": {
"from": Author.collection.name,
"localField": "reviews.comments.author",
"foreignField": "_id",
"as": "reviews.comments.author"
}},
{ "$unwind": "$reviews.comments.author" },
{ "$addFields": {
"reviews.comments.author.isFollower": {
"$in": [
mongoose.Types.ObjectId(req.user.id),
"$reviews.comments.author.followers"
]
}
}},
{ "$group": {
"_id": {
"_id": "$_id",
"reviewId": "$review._id"
},
"name": { "$first": "$name" },
"addedBy": { "$first": "$addedBy" },
"review": {
"$first": {
"_id": "$review._id",
"createdAt": "$review.createdAt",
"venue": "$review.venue",
"author": "$review.author",
"content": "$review.content"
}
},
"comments": { "$push": "$reviews.comments" }
}},
{ "$sort": { "_id._id": 1, "review.createdAt": -1 } },
{ "$group": {
"_id": "$_id._id",
"name": { "$first": "$name" },
"addedBy": { "$first": "$addedBy" },
"reviews": {
"$push": {
"_id": "$review._id",
"venue": "$review.venue",
"author": "$review.author",
"content": "$review.content",
"comments": "$comments"
}
}
}}
])


This really is not as daunting as you might think at first and follows a simple pattern of $lookup and $unwind as you progress through each array.



The "author" detail of course is singular, so once that is "unwound" you simply want to leave it that way, make the field addition and start the process of "rolling back" into the arrays.



There are only two levels to reconstruct back to the original Venue document, so the first detail level is by Review to rebuild the "comments" array. All you need to is to $push the path of "$reviews.comments" in order to collect these, and as long as the "$reviews._id" field is in the "grouping _id" the only other things you need to keep are all the other fields. You can put all of these into the _id as well, or you can use $first.



With that done there is only one more $group stage in order to get back to Venue itself. This time the grouping key is "$_id" of course, with all properties of the venue itself using $first and the remaining "$review" details going back into an array with $push. Of course the "$comments" output from the previous $group becomes the "review.comments" path.



Working on a single document and it's relations, this is not really so bad. The $unwind pipeline operator can generally be a performance issue, but in the context of this usage it should not really cause that much of an impact.



Since the data is still being "joined on the server" there is still far less traffic than the other remaining alternative.



JavaScript Manipulation



Of course the other case here is that instead of changing data on the server itself, you actually manipulate the result. In most cases I would be in favor of this approach since any "additions" to the data are probably best handled on the client.



The problem of course with using populate() is that whilst it may 'look like' a much more simplified process, it is in fact NOT A JOIN in any way. All populate() actually does is "hide" the underlying process of submitting multiple queries to the database, and then awaiting the results through async handling.



So the "appearance" of a join is actually the result of multiple requests to the server and then doing "client side manipulation" of the data to embed the details within arrays.



So aside from that clear warning that the performance characteristics are nowhere close to being on par with a server $lookup, the other caveat is of course that the "mongoose Documents" in the result are not actually plain JavaScript objects subject to further manipulation.



So in order to take this approach, you need to add the .lean() method to the query before execution, in order to instruct mongoose to return "plain JavaScript objects" instead of Document types which are cast with schema methods attached to the model. Noting of course that the resulting data no longer has access to any "instance methods" that would otherwise be associated with the related models themselves:



let venue = await Venue.findOne({ _id: id.id })
.populate({
path: 'reviews',
options: { sort: { createdAt: -1 } },
populate: [
{ path: 'comments', populate: [{ path: 'author' }] }
]
})
.lean();


Now venue is a plain object, we can simply process and adjust as needed:



venue.reviews = venue.reviews.map( r => 
({
...r,
comments: r.comments.map( c =>
({
...c,
author: {
...c.author,
isAuthor: c.author.followers.map( f => f.toString() ).indexOf(req.user.id) != -1
}
})
)
})
);


So it's really just a matter of cycling through each of the inner arrays down until the level where you can see the followers array within the author details. The comparison then can be made against the ObjectId values stored in that array after first using .map() to return the "string" values for comparison against the req.user.id which is also a string (if it is not, then also add .toString() on that ), since it is easier in general to compare these values in this way via JavaScript code.



Again though I need to stress that it "looks simple" but it is in fact the sort of thing you really want to avoid for system performance, as those additional queries and the transfer between the server and the client cost a lot in time of processing and even due to the request overhead this adds up to real costs in transport between hosting providers.





Summary



Those are basically your approaches you can take, short of "rolling your own" where you actually perform the "multiple queries" to the database yourself instead of using the helper that .populate() is.



Using the populate output, you can then simply manipulate the data in result just like any other data structure, as long as you apply .lean() to the query to convert or otherwise extract the plain object data from the mongoose documents returned.



Whilst the aggregate approaches look far more involved, there are "a lot" more advantages to doing this work on the server. Larger result sets can be sorted, calculations can be done for further filtering, and of course you get a "single response" to a "single request" made to the server, all with no additional overhead.



It is totally arguable that the pipelines themselves could simply be constructed based on attributes already stored on the schema. So writing your own method to perform this "construction" based on the attached schema should not be too difficult.



In the longer term of course $lookup is the better solution, but you'll probably need to put a little more work into the initial coding, if of course you don't just simply copy from what is listed here ;)







share|improve this answer












share|improve this answer



share|improve this answer










answered Apr 21 '18 at 10:57









Neil LunnNeil Lunn

100k23177187




100k23177187













  • Perfect man... You are magician... It took 3 days to properly understand the query... I think after nested lookup the populate queries will become extinct from mongodb... Thanks man .... Awesome

    – Anthony Winzlet
    Apr 23 '18 at 16:46








  • 1





    @AshishChoudhary If you have a new question then please Ask a New Question instead of asking for more details on an existing answer. That is how questions work on this site. Of course aggregation pipelines have $project which you can basically use anywhere and especially where $addFields is used in example here as that pipeline stage "adds" to the existing fields where with $project you specify explicitly which fields to return. If you need to ask more about that then you know what to do.

    – Neil Lunn
    May 9 '18 at 8:15











  • Awesome answer :)

    – Petrov
    Oct 27 '18 at 19:36



















  • Perfect man... You are magician... It took 3 days to properly understand the query... I think after nested lookup the populate queries will become extinct from mongodb... Thanks man .... Awesome

    – Anthony Winzlet
    Apr 23 '18 at 16:46








  • 1





    @AshishChoudhary If you have a new question then please Ask a New Question instead of asking for more details on an existing answer. That is how questions work on this site. Of course aggregation pipelines have $project which you can basically use anywhere and especially where $addFields is used in example here as that pipeline stage "adds" to the existing fields where with $project you specify explicitly which fields to return. If you need to ask more about that then you know what to do.

    – Neil Lunn
    May 9 '18 at 8:15











  • Awesome answer :)

    – Petrov
    Oct 27 '18 at 19:36

















Perfect man... You are magician... It took 3 days to properly understand the query... I think after nested lookup the populate queries will become extinct from mongodb... Thanks man .... Awesome

– Anthony Winzlet
Apr 23 '18 at 16:46







Perfect man... You are magician... It took 3 days to properly understand the query... I think after nested lookup the populate queries will become extinct from mongodb... Thanks man .... Awesome

– Anthony Winzlet
Apr 23 '18 at 16:46






1




1





@AshishChoudhary If you have a new question then please Ask a New Question instead of asking for more details on an existing answer. That is how questions work on this site. Of course aggregation pipelines have $project which you can basically use anywhere and especially where $addFields is used in example here as that pipeline stage "adds" to the existing fields where with $project you specify explicitly which fields to return. If you need to ask more about that then you know what to do.

– Neil Lunn
May 9 '18 at 8:15





@AshishChoudhary If you have a new question then please Ask a New Question instead of asking for more details on an existing answer. That is how questions work on this site. Of course aggregation pipelines have $project which you can basically use anywhere and especially where $addFields is used in example here as that pipeline stage "adds" to the existing fields where with $project you specify explicitly which fields to return. If you need to ask more about that then you know what to do.

– Neil Lunn
May 9 '18 at 8:15













Awesome answer :)

– Petrov
Oct 27 '18 at 19:36





Awesome answer :)

– Petrov
Oct 27 '18 at 19:36




















draft saved

draft discarded




















































Thanks for contributing an answer to Stack Overflow!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f49953780%2flookup-multiple-levels-without-unwind%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Create new schema in PostgreSQL using DBeaver

Deepest pit of an array with Javascript: test on Codility

Costa Masnaga