B B B B - 1 month ago 6
Python Question

How to make sure that app engine ndb root entity is saved before proceeding with rest of code

I am currently trying to create a CustomUser entity in my app engine project upon a user signing in for the first time. I would like CustomUser entities to be unique, and I would like to prevent the same entity from being created more than once. This would be fairly easy to do if I can supply it with an ancestor upon entity creation, as this will make the transaction strongly consistent.

Unfortunately, this is not the case, due to the fact that a CustomUser entity is a root entity, and it will thus be eventually consistent, not strongly consistent. Because of this, there are instances when the entity is created twice, which I would like to prevent as this will cause problems later on.

So the question is, is there a way I can prevent the entity from being created more than once? Or at least make the commit of the ancestor entity strongly consistent to prevent duplication? Here's my code, and interim (hacky) solution.

import time
import logging
from google.appengine.ext import ndb

# sample Model
class CustomUser(ndb.Model):
user_id = ndb.StringProperty(required=True)
some_data = ndb.StringProperty(required=True)
some_more_data = ndb.StringProperty(required=True)


externally_based_user_id = "id_taken_from_somewhere_else"
# check if this id already exists in the Model.
# If it does not exist yet, create it
user_entity = CustomUser.query(
CustomUser.user_id == externally_based_user_id,
ancestor=None).get()

if not user_entity:
# prepare the entity
user_entity = CustomUser(
user_id=externally_based_user_id,
some_data="some information",
some_more_data="even more information",
parent=None
)
# write the entity to ndb
user_key = user_entity.put()
# inform of success
logging.info("user " + str(user_key) + " created")

# eventual consistency workaround - loop and keep checking if the
# entity has already been created
#
# I understand that a while loop may not be the wisest solution.
# I can also use a for loop with n range to avoid going around the loop infinitely.
# Both however seem to be band aid solutions
while not entity_check:
entity_check = CustomUser.query(
CustomUser.user_id == externally_based_user_id,
ancestor=None).get()

# time.sleep to prevent the instance from consuming too much processing power and
# memory, although I'm not certain if this has any real effect apart from
# reducing the number of loops
if not entity_check:
time.sleep(0.5)


EDIT: Solution I ended up using based on both of Daniel Roseman's suggestions. This can be further simplified by using get_or_insert as suggested by voscausa. I've stuck to using the usual way of doing things to make things clearer.

import logging
from google.appengine.ext import ndb


# ancestor Model
# we can skip the creation of an empty class like this, and just use a string when
# retrieving a key
class PhantomAncestor(ndb.Model):
pass


# sample Model
class CustomUser(ndb.Model):
# user_id now considered redundance since we will be
# user_id = ndb.StringProperty(required=True)
some_data = ndb.StringProperty(required=True)
some_more_data = ndb.StringProperty(required=True)


externally_based_user_id = "id_taken_from_somewhere_else"

# construct the entity key using information we know.
# entity_key = ndb.Key(*arbitrary ancestor kind*, *arbitrary ancestor id*, *Model*, *user_id*)
# we can also use the string "PhantomAncestor" instead of passing in an empty class like so:
# entity_key = ndb.Key("SomeRandomString", externally_based_user_id, CustomUser, externally_based_user_id)
# check this page on how to construct a key: https://cloud.google.com/appengine/docs/python/ndb/keyclass#Constructors
entity_key = ndb.Key(PhantomAncestor, externally_based_user_id, CustomUser, externally_based_user_id)

# check if this id already exists in the Model.
user_entity = entity_key.get()

# If it does not exist yet, create it
if not user_entity:
# prepare the entity with the desired key value
user_entity = CustomUser(
# user_id=externally_based_user_id,
some_data="some information",
some_more_data="even more information",
parent=None,
# specify the custom key value here
id=externally_based_user_id
)
# write the entity to ndb
user_key = user_entity.put()
# inform of success
logging.info("user " + str(user_key) + " created")

# we should also be able to use CustomUser.get_and_insert to simplify the code further

Answer

A couple of things here.

First, note that the ancestor doesn't have to actually exist. If you want a strongly consistent query, you can use any arbitrary key as an ancestor.

A second option would be to use user_id as your key. Then you can do a key get, rather than a query, which again is strongly consistent.