Mickael Marrache - 2 months ago
REST Question

Should the natural or surrogate key be returned in an API?

This is the first time I have thought about this...

Until now, I have always used the natural key in my API. For example, for a REST API that manages entities, the URL would look like

/entities/{id}
where id is a natural key known to the user (the ID is passed to the POST request that creates the entity). After the entity is created, the user can use multiple commands (GET, DELETE, PUT...) to manipulate the entity. The entity also has a surrogate key generated by the database.

Now, think about the following sequence:


  1. A user creates an entity with id 1 (POST /entities with a body containing id 1).

  2. Another user deletes the entity (DELETE /entities/1).

  3. The same other user creates the entity again (POST /entities with a body containing id 1).

  4. The first user decides to modify the entity (PUT /entities/1 with a body).



Before step 4 is executed, there is still an entity with id 1 in the database, but it is not the same entity created during step 1. The problem is that step 4 identifies the entity to modify based on the natural key which is the same for the deleted and new entity (while the surrogate key is different). Therefore, step 4 will succeed and the user will never know it is working on a new entity.

I generally also use optimistic locking in my applications, but I don't think it helps here. After step 1, the entity's version field is 0. After step 3, the new entity's version field is also 0. Therefore, the version check won't detect the change. Is this a case where a timestamp field should be used for optimistic locking instead?

Is the "good" solution to return the surrogate key to the user? That way, the user always provides the surrogate key to the server, which can use it to ensure it is working on the same entity and not on a new one.

Which approach do you recommend?

Thanks,
Mickael

Answer

It depends on how you want your users to use your API.

REST APIs should try to be discoverable. So if there is benefit in exposing natural keys in your API because it will allow users to modify the URI directly and get to a new state, then do it.

A good example is categories or tags. We could have the following URIs:

GET /some-resource?tag=1    // returns all resources tagged with 'blue'
GET /some-resource?tag=2    // returns all resources tagged with 'red'

or

GET /some-resource?tag=blue    // returns all resources tagged with 'blue'
GET /some-resource?tag=red    // returns all resources tagged with 'red'

There is clearly more value to a user in the second group, as they can see that the tag is a real word. This allows them to type ANY word in there to see what is returned, whereas the first group does not allow this: it limits discoverability.

A different example would be orders:

GET /orders/1   // returns order 1

or

GET /orders/some-verbose-name-that-adds-no-meaning    // returns order 1

In this case there is little value in adding a verbose name to the order to make it discoverable. A user is more likely to want to view all orders (or a subset) first, filter by date, price, etc., and then choose an order to view:

GET /orders?orderBy={date}&order=asc

Additional

After our discussion over chat, your issue seems to be with versioning and how to manage resource locking.

If you allow resources to be modified by multiple users, you need to send a version number with every request and response. The version number is incremented when any changes are made. If a request sends an older version number when trying to modify a resource, throw an error.
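As a rough illustration of the version check described above, here is a minimal in-memory sketch. The names `VersionedStore` and `VersionConflict` are made up for this example and are not taken from any framework; a real API would do the same comparison against a version column in the database.

```python
class VersionConflict(Exception):
    """Raised when a request carries a version older than the stored one."""


class VersionedStore:
    """Hypothetical in-memory store keyed by the natural key."""

    def __init__(self):
        self._rows = {}  # natural key -> {"version": int, "data": dict}

    def create(self, key, data):
        if key in self._rows:
            raise KeyError(f"{key!r} already exists")
        self._rows[key] = {"version": 0, "data": dict(data)}
        return 0  # every new resource starts at version 0

    def get(self, key):
        row = self._rows[key]
        return row["version"], dict(row["data"])

    def update(self, key, expected_version, data):
        row = self._rows[key]
        # Reject the change if the caller last saw an older version.
        if row["version"] != expected_version:
            raise VersionConflict(
                f"expected version {expected_version}, found {row['version']}"
            )
        row["data"] = dict(data)
        row["version"] += 1
        return row["version"]


store = VersionedStore()
store.create("1", {"name": "first"})
version, _ = store.get("1")
store.update("1", version, {"name": "second"})  # accepted, version becomes 1
try:
    store.update("1", version, {"name": "third"})  # still sends version 0
except VersionConflict as exc:
    print("rejected:", exc)
```

The second update is rejected because the client is still sending the version it read before the first update was applied.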

If you allow the same URI to be reused, there is potential for conflict, because the version number always starts again from 0. In this case, you will need to send a GUID (surrogate key) as well as the version number. Or don't use natural URIs (see the original answer above to decide when to do this).

Another option is to disallow reuse of URIs. This really depends on the use case and your business requirements. It may be fine to reuse a URI if it conceptually means the same thing. An example would be a folder on your computer: deleting the folder and recreating it is the same as emptying it. Conceptually the folder is the same 'thing', but with different properties.
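To make the delete-and-recreate scenario from the question concrete, the sketch below checks a GUID together with the version number. `EntityStore` and `Conflict` are hypothetical names for this example only, and generating the GUID with Python's `uuid.uuid4()` is an assumption; any surrogate key produced by the database would play the same role.

```python
import uuid


class Conflict(Exception):
    """Raised when the stored entity is not the one the caller last saw."""


class EntityStore:
    """Hypothetical store: natural key in the URI, GUID + version in the body."""

    def __init__(self):
        self._rows = {}  # natural key -> {"guid": str, "version": int, "data": dict}

    def create(self, key, data):
        if key in self._rows:
            raise Conflict(f"{key!r} already exists")
        row = {"guid": str(uuid.uuid4()), "version": 0, "data": dict(data)}
        self._rows[key] = row
        return row["guid"], row["version"]

    def delete(self, key):
        del self._rows[key]

    def update(self, key, guid, version, data):
        row = self._rows[key]
        # A different GUID means the entity was deleted and recreated.
        if row["guid"] != guid:
            raise Conflict("entity was replaced by a new one")
        # Same entity, but somebody else modified it in the meantime.
        if row["version"] != version:
            raise Conflict("stale version")
        row["data"] = dict(data)
        row["version"] += 1
        return row["version"]


store = EntityStore()
guid_a, version_a = store.create("1", {"owner": "first user"})    # step 1
store.delete("1")                                                 # step 2
guid_b, version_b = store.create("1", {"owner": "second user"})   # step 3

# Both entities report version 0, so the version alone cannot tell them apart.
try:
    store.update("1", guid_a, version_a, {"owner": "first user"})  # step 4
except Conflict as exc:
    print("rejected:", exc)
```

With only the version number, step 4 would have been accepted; comparing the GUID is what exposes that `/entities/1` now refers to a different entity.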

User accounts are probably an area where reusing URIs is not a good idea. If you delete an account /accounts/u1, that URI should be marked as deleted, and no other user should be able to create an account with the username u1. Conceptually, a new user at the same URI is not the same resource as the previous user.
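One possible way to mark a URI as deleted is to keep a record of retired usernames and refuse to create an account under any of them. This is only a sketch: `AccountStore` and `UsernameUnavailable` are invented names, and a real system would more likely keep a "deleted" flag on the database row than a separate in-memory set.

```python
class UsernameUnavailable(Exception):
    """Raised when a username is in use or was used by a deleted account."""


class AccountStore:
    """Hypothetical store that never hands out a deleted username again."""

    def __init__(self):
        self._accounts = {}    # username -> profile
        self._retired = set()  # usernames of deleted accounts

    def create(self, username, profile):
        if username in self._retired:
            raise UsernameUnavailable(f"{username!r} belonged to a deleted account")
        if username in self._accounts:
            raise UsernameUnavailable(f"{username!r} is already taken")
        self._accounts[username] = dict(profile)

    def delete(self, username):
        del self._accounts[username]
        # Remember the name so /accounts/<username> cannot be reused.
        self._retired.add(username)

    def exists(self, username):
        return username in self._accounts


accounts = AccountStore()
accounts.create("u1", {"email": "old@example.com"})
accounts.delete("u1")
try:
    accounts.create("u1", {"email": "new@example.com"})
except UsernameUnavailable as exc:
    print("rejected:", exc)
```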
