Kekito - 2 months ago
Python Question

GAE/P: Efficiently fetching large number of entities by key

In app engine, you can query entities like this:

for x in MyEntity.query().iter():
    x.do_something()


When you do this, the ndb code takes care of efficiently fetching entities in batches, minimizing round trips to the datastore.

In my situation, I would like to do the same efficient batching, but I already have a list of keys, so I might as well use them to avoid the slower queries. I would like to do this:

for x in iter_entities(key_list):
    x.do_something()


where the function iter_entities() fetches entities in batches as I need them. It isn't too hard to write this myself, but I probably can't do it as well as the folks at Google, so why reinvent the wheel if I don't need to?

Is there a way to write a function iter_entities() that is built on top of the ndb iterator?

Answer

If you use async tasklets for your individual-entity processing, NDB will take care of batching the gets. Something like this should work:

@ndb.tasklet
def do_something(key):
  # Yielding the async get lets NDB's autobatcher combine
  # concurrent gets into batched datastore round trips.
  x = yield key.get_async()
  x.do_something()

# Kick off one tasklet per key; they run concurrently.
futs = []
for key in key_list:
  futs.append(do_something(key))

# Block until every tasklet has completed.
ndb.Future.wait_all(futs)
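Alternatively, if you do want a generator-style iter_entities(), the batching logic itself is small. Below is a minimal sketch of that idea, not App Engine-specific: iter_entities, fetch_batch, and batch_size are hypothetical names I'm introducing here. On App Engine you would pass ndb.get_multi (a real ndb call that fetches a list of keys in one batched round trip) as fetch_batch.

```python
def iter_entities(key_list, fetch_batch, batch_size=100):
    """Yield entities for key_list, fetching batch_size keys at a time.

    fetch_batch is any callable mapping a list of keys to a list of
    entities -- e.g. ndb.get_multi on App Engine (assumption: a flat
    batch size of 100 is reasonable; tune for your workload).
    """
    for start in range(0, len(key_list), batch_size):
        batch = key_list[start:start + batch_size]
        for entity in fetch_batch(batch):
            # get_multi returns None for keys with no entity; skip those.
            if entity is not None:
                yield entity
```

Because each batch is fetched lazily as the loop reaches it, you can break out of iteration early without paying for the remaining keys. Note this fetches batches serially; the tasklet approach above overlaps the round trips.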