urlreader - 1 month ago
C# Question

Is DocumentDB a good choice if we need to update an array inside a document?

Is DocumentDB a good choice when we need to update an array inside a document, or not?

My boss decided to use DocumentDB for this project. After working with it for a while, I have started to think that DocumentDB may not be a good choice.

We have a WebJob that runs several times a day and calls an API to get a JSON document. The document includes some fields (id, _ts, ...) and an array that holds the historical data for the past 30 days:

{"date": "2016-08-01", "value": "100", ....},
{"date": "2016-08-02", "value": "100", ....},
{"date": "2016-08-03", "value": "100", ....},

Originally, we saved each document as it came in. Then we realized that we have to combine these documents to get data covering more than the past 30 days. So the process is now:

1) Get the main document, which holds all the data, and parse it to get the 'key' field. In this case, it is the date.

2) Call the API to get the new data and parse it to get the array. If a 'date' does not exist in the main document from #1, insert the entry; if it does, update the existing entry in the main document.

3) Save the updated main document.
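To make the three steps concrete, here is a minimal sketch of the merge in Python (the question is tagged C#, but the logic carries over directly). The array name "history", the function name merge_history, and the choice to replace a whole entry when its date already exists are assumptions for illustration; the original post does not name the array or say how an existing entry is updated.

```python
def merge_history(main_doc, new_entries, key="date"):
    """Upsert new_entries into main_doc["history"], matching on `key`.

    "history" is a hypothetical name for the array; the post does not name it.
    """
    # Step 1: index the existing array by its key field (the date).
    by_key = {entry[key]: entry for entry in main_doc.get("history", [])}

    # Step 2: an entry with a known date replaces the old one,
    # an entry with a new date is added.
    for entry in new_entries:
        by_key[entry[key]] = entry

    # Step 3: build the updated main document. ISO dates (YYYY-MM-DD)
    # sort correctly as plain strings. The input document is not modified.
    merged = dict(main_doc)
    merged["history"] = [by_key[k] for k in sorted(by_key)]
    return merged


main = {"id": "series-1", "history": [
    {"date": "2016-08-01", "value": "100"},
    {"date": "2016-08-02", "value": "100"},
]}
fresh = [
    {"date": "2016-08-02", "value": "150"},  # existing date: updated
    {"date": "2016-08-03", "value": "100"},  # new date: inserted
]
updated = merge_history(main, fresh)
```

The merge itself is cheap; the cost the question worries about comes from reading and rewriting the whole document on every run.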

Basically, this is more like using DocumentDB as SQL Server: updating a row based on a key. One possible issue I can see is that, over time, the document could become very large, which means that in #1 and #3 we need to parse and update a huge JSON file. This will definitely hurt performance. That's why I have started to think that maybe we should not use DocumentDB in this case.

I just want to hear opinions from others before I mention it to the boss.



You may want to reconsider the design of aggregating individual documents into one big document. First, there is a limit on the maximum size of a document. According to the DocumentDB quotas, as of today the maximum size of a document in a DocumentDB collection is 512 KB.
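As a rough way to see how close an aggregated document is to that limit, the sketch below measures the compact JSON size of a document in Python. The helper names are hypothetical, the 512 KB figure is the one quoted above and should be checked against the current quotas, and the service may count bytes somewhat differently from a local serializer, so treat the result as an estimate only.

```python
import json

# Limit quoted in the answer above; verify against the current quotas.
MAX_DOC_BYTES = 512 * 1024


def document_size_bytes(doc):
    """Approximate size of `doc` as compact UTF-8 encoded JSON."""
    return len(json.dumps(doc, separators=(",", ":")).encode("utf-8"))


def fits_in_document(doc, limit=MAX_DOC_BYTES):
    """True if the estimated size is within `limit` bytes."""
    return document_size_bytes(doc) <= limit


# One history entry with only the two fields shown in the question.
# Real entries have more fields, so they will be larger.
entry = {"date": "2016-08-01", "value": "100"}
entry_bytes = document_size_bytes(entry)
```

Running a check like this before each save at least turns the size problem into an explicit failure instead of a rejected write.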

I would still consider DocumentDB for storing JSON documents (though you would need to consider the cost aspect). It has excellent querying support. Maybe you could create appropriate indexes on your document collection; in that case you wouldn't need to aggregate the data.
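One way to avoid the aggregation, sketched here in Python under stated assumptions, is to store each array entry as its own small document with a deterministic id, so that re-running the job overwrites a day instead of duplicating it. The field name seriesId, the id format, and the function name are all hypothetical; the query string is only an illustration of the kind of range filter the SQL-like query language allows, not something taken from the original post.

```python
def split_into_daily_docs(series_id, entries):
    """Turn each history entry into a standalone document.

    The id combines the series and the date, so saving the same day
    twice targets the same document (assuming the client upserts).
    """
    docs = []
    for entry in entries:
        doc = dict(entry)                           # keep the original fields
        doc["id"] = f"{series_id}:{entry['date']}"  # hypothetical id scheme
        doc["seriesId"] = series_id                 # hypothetical field name
        docs.append(doc)
    return docs


# Illustrative query for more than 30 days of data; ISO date strings
# compare correctly as text, so a range filter on the date field works.
EXAMPLE_QUERY = (
    'SELECT * FROM c WHERE c.seriesId = "series-1" '
    'AND c.date >= "2016-07-01" ORDER BY c.date'
)

daily = split_into_daily_docs("series-1", [
    {"date": "2016-08-01", "value": "100"},
    {"date": "2016-08-02", "value": "100"},
])
```

With this layout every write touches one small document, and the history can grow without any single document approaching the size limit. The trade-off is more documents and more requests per run, which feeds into the cost aspect mentioned above.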