Kekito Kekito - 10 months ago 69
Python Question

GAE/P: Efficiently fetching large number of entities by key

In App Engine, you can iterate over the results of a query like this:

for x in MyEntity.query().iter():

When you do this, ndb fetches the entities in batches to minimize round trips to the datastore.

In my situation, I would like to do the same efficient batching, but I already have a list of keys, so I might as well use them to avoid the slower queries. I would like to do this:

for x in iter_entities(key_list):

Where the function iter_entities will fetch entities in batches as I need them. It isn't too hard to write this myself, but I probably can't do it as well as the great folks at Google, and why reinvent the wheel if I don't need to?
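For reference, a hand-rolled version of what is being described might look roughly like the sketch below. It is only an illustration, not ndb's own implementation: the fetch_batch parameter and the batch_size default are names invented here, and on App Engine the callable passed in would presumably be ndb.get_multi, which returns entities in key order with None for missing keys. Unlike the query iterator, this simple version does not overlap the next fetch with processing of the current batch.

```python
from itertools import islice


def iter_entities(keys, fetch_batch, batch_size=100):
    """Yield entities for `keys`, fetching them lazily one batch at a time.

    `fetch_batch` is a hypothetical parameter: any callable that maps a list
    of keys to a list of entities in the same order. On App Engine it could
    be ndb.get_multi (an assumption, not tested here).
    """
    key_iter = iter(keys)
    while True:
        # Pull the next batch of keys only when the caller needs more entities.
        batch = list(islice(key_iter, batch_size))
        if not batch:
            return
        # One round trip per batch instead of one per key.
        for entity in fetch_batch(batch):
            yield entity


if __name__ == "__main__":
    # In-memory stand-in for the datastore so the sketch runs anywhere.
    fake_store = {k: "entity-%d" % k for k in range(5)}

    def fake_get_multi(batch):
        return [fake_store.get(k) for k in batch]

    for x in iter_entities(range(5), fake_get_multi, batch_size=2):
        print(x)
```

Because it is a generator, a batch is only fetched when the loop actually reaches it, so breaking out early avoids the remaining round trips.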

Is there a way to write a function like iter_entities that is built on top of the ndb iterator?

Answer Source

If you use async tasklets for your individual-entity processing, NDB will take care of batching the gets. Something like this should work:

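@ndb.tasklet  # assumes: from google.appengine.ext import ndb
def do_something(key):
  x = yield key.get_async()  # NDB's autobatcher combines these gets into batched RPCs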

futs = []
for key in key_list: