bryan bryan - 2 months ago 22
PHP Question

How to build datastore indexes (PHP GAE)

I am using Tom Walder's Google Datastore Library for PHP to insert data into my Google App Engine Datastore.

$obj_schema = (new GDS\Schema('Add Log'))
->addString('name', TRUE)
->addDatetime('time', TRUE);
$obj_store = new GDS\Store($obj_gateway, $obj_schema);
$obj_store->upsert($obj_store->createEntity(['name' => "test",'time' => date('Y-m-d H:i:s', time())]));


When I insert data like the above code, everything seems to be importing properly (each property say they are indexed).

enter image description here

But when I go to do a query with multiple selectors it says "You need an index to execute this query".

My query
enter image description here

The error message
enter image description here

Does anyone know what I need to do to make sure my queries are being indexed? This is what my dashboard hows with plenty of data using the code I showed.

enter image description here

Answer

As Alex Martelli mentioned in a comment, most of the time, your indexes are built when you run your app on the devserver and have your datastore get queried there (this adds the required indexes for any question into your index.yaml file.

So you have two ways you can go at it.

1- Run your app on local devserver, go to your dev "Developer Console" to add one or two entities to your datastore. Run your queries, that'll populate your index.yaml with all required indexes. You can then run appcfg.py update_indexes to just deploy your index.yaml (bottom of this page)

2- Your other solution would be to read this, a page on how datastore indexes work. Then read this advanced article on indexes. You should also watch the following presentation that will give you a better insight into indexes and the datastore. Once that's all done, figure out which queries you want, and flesh out the required indexes in your index.yaml, then deploy with the same method as in 1.

Quick summary of how indexes work

So you can think of the datastore as a pure READER. It doesn't, like normal relational databases, do any kind of computation as it reads your data and returns it. Therefore, to be able to run a given query (say "all client orders passed before christmas 2013"), then you need a table where all your client orders are ordered by date (so the system doesn't have to check every row's date to see if it matches. It just takes the first "chunk" of your data, up to the date you're looking for, and returns it).

Therefore, you need to have those indexes built, and they will influence the queries you can run. By default, every attribute is indexed by itself, in descending order. For any queries on more than one attribute (or with a different sort order), you need to have the index (in that case they are called composite indexes) built by the datastore, so you need to declare it in your index.yaml.

In the last years, Google added the zigzag merge join algorithm, which is basically a way to take 2 composite indexes (that start with the same attributes, so there is common ground between the 2 sub-queries) and run 2 sub-queries on them, then have the algorithm join the responses of both sub-queries.