lucemia lucemia - 1 month ago 16
Python Question

Google App Engine datastore: poor put performance when an entity has many properties

Version 1

from google.appengine.ext import db

class ActionLog(db.Model):
    action = db.StringProperty()
    time_slice = db.IntegerProperty()
    trace_code = db.StringProperty()  # which profile this log belongs to

    # Who
    facebook_id = db.StringProperty()  # the user's Facebook ID
    ip = db.StringProperty()  # the user's IP address

    # When
    time = db.DateTimeProperty(auto_now_add=True)  # the time of this event

    # What
    url = db.StringProperty()  # the imgurl
    secret = db.StringProperty()  # the secret of the imgurl instance
    tag = db.StringProperty()  # the tag
    referurl = db.StringProperty()  # the tag's link

    # Where
    weburl = db.StringProperty()  # the user's referring URL
    domain = db.StringProperty()  # the referring URL's domain
    BSP = db.StringProperty()  # the referring URL's BSP

# execute
log = ActionLog(action=action,
                trace_code=trace_code,
                facebook_id=facebook_id,
                ip=ip,
                time_slice=time_slice,
                url=url,
                secret=secret,
                tag=tag,
                referurl=referurl,
                weburl=weburl,
                domain=domain,
                BSP=BSP)

db.put(log)


Version 2

class ActionLog(db.Model):
    trace_code = db.StringProperty()
    url = db.StringProperty()
    secret = db.StringProperty()

    # Use a dict-like text property to store all implicit properties.
    desp = MyDictProperty()
    time = db.DateTimeProperty(auto_now_add=True)  # the time of this event

# execute
log = ActionLog(
    secret=secret,
    url=url,
    trace_code=trace_code,
    desp={
        'action': action,
        'facebook_id': facebook_id,
        'ip': ip,
        'tag': tag,
        'referurl': referurl,
        'weburl': weburl,
    }
)

db.put(log)
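The definition of MyDictProperty is not shown in the question. As a rough sketch of the idea, assuming it serializes the dict to JSON text on write and parses it back on read, the core logic might look like the following. The helper names pack_desp and unpack_desp and the sample values are hypothetical; a real version would presumably put this logic in a custom db.Property subclass, in its get_value_for_datastore and make_value_from_datastore methods.

```python
import json


def pack_desp(values):
    """Serialize the rarely queried fields into a single JSON string."""
    return json.dumps(values, sort_keys=True)


def unpack_desp(blob):
    """Rebuild the dict from the stored JSON string (empty dict if nothing stored)."""
    return json.loads(blob) if blob else {}


# Hypothetical sample values, for illustration only.
desp = {'action': 'click', 'facebook_id': '12345', 'ip': '203.0.113.7'}
stored = pack_desp(desp)        # one text value instead of several properties
restored = unpack_desp(stored)  # same dict back when the entity is loaded
```

The trade-off is that fields packed this way cannot be used in query filters or sort orders, since the datastore only sees one opaque text value.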


These two versions do essentially the same job. However, a single put with version 1 takes more than 800 ms of CPU time on Google App Engine (a yellow or red light), whereas version 2 takes only about 300 ms. (Both were tested on the HRD datastore.)

On the M/S datastore, version 1 takes about 400 ms and version 2 about 150 ms.

I can imagine that version 1 would be slower than version 2, since it writes more index entries. However, it is hard to believe the difference is this large. It is also surprising that Google App Engine struggles with such a simple task.

Does this mean we cannot expect GAE to insert entities with more than 10 properties efficiently, or am I misunderstanding something?

Thanks.

Answer

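Set indexed=False on every property that you don't need indexed (i.e., properties that you won't filter or sort on in a query), for example db.StringProperty(indexed=False). This cuts down the number of index writes it takes to save an entity, which is the main reason version 1 costs so much more than version 2.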

See http://code.google.com/appengine/docs/python/datastore/queries.html#Introduction_to_Indexes for an explanation.
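To get a feel for why the property count matters, here is a back-of-the-envelope estimate. It assumes the rule of thumb from App Engine's billing documentation for a new entity put (2 writes for the entity, plus 2 writes per indexed property value, plus 1 per composite index row). The function name estimated_put_writes is made up for this illustration, and it assumes that desp in version 2 is stored as unindexed text; treat the numbers as a rough model, not a measurement.

```python
def estimated_put_writes(total_properties, unindexed_properties=0,
                         composite_index_rows=0):
    """Estimate low-level writes for putting one new entity.

    Assumed model: 2 writes for the entity itself, 2 per indexed
    single-valued property (ascending and descending index rows),
    and 1 per composite index row.
    """
    indexed = total_properties - unindexed_properties
    if indexed < 0:
        raise ValueError("unindexed_properties exceeds total_properties")
    return 2 + 2 * indexed + composite_index_rows


# Version 1: 13 properties, all indexed by default.
version1 = estimated_put_writes(13)

# Version 2: 5 properties, with desp assumed to be unindexed text.
version2 = estimated_put_writes(5, unindexed_properties=1)

# Version 1 again, keeping only trace_code and time indexed.
version1_trimmed = estimated_put_writes(13, unindexed_properties=11)
```

Under these assumptions version 1 needs roughly three times as many writes as version 2, which is in line with the timings reported in the question, and marking the unqueried properties as unindexed brings version 1 below version 2.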