I'm using ArangoDB for a Web Application through Strongloop.
I'm running into a performance problem when I run this query:
FOR result IN Collection SORT result.field ASC RETURN result
Summarizing the discussion above:
If there is a skiplist index present on the field attribute, it can be used for the sort. However, if the index was created sparse, it cannot. This can be checked by listing the collection's indexes in the ArangoShell. If the index is present and non-sparse, the query should use it for sorting and no additional sort step will be required, which can be verified using Explain. However, the query will still build a huge result in memory, which takes time and consumes RAM.
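One possible way to do this check from the ArangoShell is sketched below. It assumes the shell's db._collection(name).getIndexes() call, which returns index descriptions carrying type, fields and sparse attributes. The helper name findSortIndexes is made up for illustration, and this is not necessarily the command used in the original discussion.

```javascript
// Hypothetical helper: returns the indexes that could serve
// "SORT doc.<attribute>" for the given collection.
// `db` is the ArangoShell database object; it is passed in as a parameter
// so the function can also be tried out with a stub object.
function findSortIndexes(db, collectionName, attribute) {
  return db._collection(collectionName).getIndexes().filter(function (idx) {
    return idx.type === "skiplist" &&     // sorted index type discussed above
           idx.fields[0] === attribute && // the attribute must lead the index
           idx.sparse !== true;           // a sparse index cannot serve the sort
  });
}
```

In the shell this would be called as findSortIndexes(db, "Collection", "field"). An empty array suggests the sort cannot use an index. Running db._explain() on the query string should then show whether a separate sort step remains in the execution plan.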
If a large result set is desired,
LIMIT can be used to retrieve slices of the results in several chunks, which will cause less stress on the machine.
For example, first iteration:
FOR result IN Collection SORT result.field LIMIT 10000 RETURN result
Then process these first 10,000 documents offline, and note the result.field value of the last processed document. Now run the query again, this time with an additional FILTER:
FOR result IN Collection FILTER result.field > @lastValue SORT result.field LIMIT 10000 RETURN result
Repeat this until there are no more documents. That should work fine if result.field is unique. If result.field is not unique and there are no other unique keys in the collection covered by a skiplist, documents that share the last value of a chunk can be skipped by the FILTER, so the described method will only be an approximation.
Note also that splitting the query into chunks does not provide snapshot isolation, but depending on the use case it may be good enough.
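The chunked retrieval described above can be sketched as a small loop. This is only an illustration under a few assumptions: runQuery(aql, bindVars) is a hypothetical function supplied by the caller that returns the result as an array (in the ArangoShell it could wrap db._query(aql, bindVars).toArray()), the names forEachChunk and processChunk are invented here, and the follow-up query keeps the SORT explicitly so that the order does not depend on how the FILTER is executed.

```javascript
// AQL for the first chunk and for every following chunk.
var FIRST_CHUNK =
  "FOR result IN Collection SORT result.field " +
  "LIMIT @chunkSize RETURN result";
var NEXT_CHUNK =
  "FOR result IN Collection FILTER result.field > @lastValue " +
  "SORT result.field LIMIT @chunkSize RETURN result";

// Fetches the collection chunk by chunk and hands each chunk to processChunk.
// runQuery is a caller-supplied (hypothetical) function: (aql, bindVars) => array.
// Returns the total number of documents processed.
function forEachChunk(runQuery, chunkSize, processChunk) {
  var total = 0;
  var chunk = runQuery(FIRST_CHUNK, { chunkSize: chunkSize });
  while (chunk.length > 0) {
    processChunk(chunk);
    total += chunk.length;
    // Remember the sort value of the last document in this chunk.
    var lastValue = chunk[chunk.length - 1].field;
    // Assumes result.field is unique; duplicates of lastValue would be skipped.
    chunk = runQuery(NEXT_CHUNK, { lastValue: lastValue, chunkSize: chunkSize });
  }
  return total;
}
```

Because each chunk is a separate query, documents inserted or removed between two iterations may or may not be seen, which is the missing snapshot isolation mentioned above.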