Johann Marx Johann Marx - 2 months ago 10
C# Question

MongoDB C# Driver - Return last modified rows only

The data:
The collection contains a list of audit records and I want to return the last modified items from the collection.

For example:

Audit Records

So the query needs to return Audit 1235 and 1237 Only.

The following statement works in Mongo Shell and returns the data sub-millisecond, I just need to also figure out how to return the entire Collection item instead of just the Id.

db.Forms.aggregate(
{ $group: { _id: "$Id", lastModifiedId: { $last: "$_id" } } }
)


However, I need to convert this to the C# Driver's syntax.

I have the following at the moment but it's not working and returns (for lack of a better term) weird data (see screencap under the statement).

var results = collection.Aggregate()
.Group(new BsonDocument { { "_id", "$Id" }, { "lastModifiedId", new BsonDocument("$last", "_id") } })
.ToListAsync().Result.ToList();


Results

My current solution gets the full collection back and then runs it through an extension method to get the latest records (where list is the full collection):

var lastModifiedOnlyList =
from listItem in list.OrderByDescending(_ => _.AuditId)
group listItem by listItem.Id into grp
select grp.OrderByDescending(listItem => listItem.AuditId)
.FirstOrDefault();


While this code works, it is EXTREMELY slow because of the sheer amount of data that is being returned from the collection, so I need to do the grouping on the list as part of the collection get/find.

Please let me know if I can provide any additional information.

Update: With Axel's help I managed to get it resolved:

var pipeline = new[] { new BsonDocument { { "$group", new BsonDocument { { "_id", "$Id" }, { "LastAuditId", new BsonDocument { { "$last", "$_id" } } } } } } };
var lastAuditIds = collection.Aggregate<Audit>(pipeline).ToListAsync().Result.ToList().Select(_=>_.LastAuditId);


I moved that to it's own method and then use the IDs to get the collection items back, with my projection working as well:

var forLastAuditIds = ForLastAuditIds(collection);

var limitedList = (
projection != null
? collection.Find(forLastAuditIds & filter, new FindOptions()).Project(projection)
: collection.Find(forLastAuditIds & filter, new FindOptions())
).ToListAsync().Result.ToList();


"filter" in this case is either an Expression or a BsonDocument. The performance is great as well - sub-second for the whole thing. Thanks for the help, Axel!

Answer

I think you're doing an extra OrderBy, this should do:

var lastModifiedOnlyList =
from listItem in list
group listItem by listItem.Id into grp
select grp.OrderByDescending(listItem => listItem.AuditId)
.FirstOrDefault();

EDIT:

To gain performance in the query, you could use the Aggregate function differently:

var match = new BsonDocument
{
    {
        "$group",
        new BsonDocument
            {
                { "_id", "$Id"  },
                { "lastModifiedId", new BsonDocument
                    {
                        {
                            "$last", "$_id"
                        }
                    }}
            }
    }
};

var pipeline = new[] { match };
var result = collection.Aggregate(pipeline);

That should be the equivalent of your Mongo Shell query.