norbertpy norbertpy - 28 days ago 8
reST (reStructuredText) Question

REST API - Bulk Create or Update in single request

Let's assume there are two resources

Binder
and
Doc
with association relationship meaning that the
Doc
and
Binder
stand on their own.
Doc
might or might not belong to
Binder
and
Binder
might be empty.

If I want to design a REST API that allows user to send a collection of
Doc
s, IN A SINGLE REQUEST, like the following:

{
"docs": [
{"doc_number": 1, "binder": 1},
{"doc_number": 5, "binder": 8},
{"doc_number": 6, "binder": 3}
]
}


And for each doc in the
docs
,


  • If the
    doc
    exists then assign it to
    Binder

  • If the
    doc
    doesn't exist, create it and then assign it



I am really confused as to how this should be implemented.


  • What HTTP method to use?

  • What response code must be returned?

  • Is this even qualified for REST?

  • How would the URI look like?
    /binders/docs
    ?

  • Handling bulk request, what if a few items raise error but the other go through. What response code must be returned? Should the bulk operation be atomic?


Answer

I think that you could use a POST or PATCH method to handle this since they typically design for this.

  • Using a POST method is typically used to add an element when used on list resource but you can also support several actions for this method. See this answer: How to Update a REST Resource Collection. You can also support different representation formats for the input (if they correspond to an array or a single elements).

    In the case, it's not necessary to define your format to describe the update.

  • Using a PATCH method is also suitable since corresponding requests correspond to a partial update. According to RFC5789 (http://tools.ietf.org/html/rfc5789):

    Several applications extending the Hypertext Transfer Protocol (HTTP) require a feature to do partial resource modification. The existing HTTP PUT method only allows a complete replacement of a document. This proposal adds a new HTTP method, PATCH, to modify an existing HTTP resource.

    In the case, you have to define your format to describe the partial update.

I think that in this case, POST and PATCH are quite similar since you don't really need to describe the operation to do for each element. I would say that it depends on the format of the representation to send.

The case of PUT is a bit less clear. In fact, when using a method PUT, you should provide the whole list. As a matter of fact, the provided representation in the request will be in replacement of the list resource one.

You can have two options regarding the resource paths.

  • Using the resource path for doc list

In this case, you need to explicitely provide the link of docs with a binder in the representation you provide in the request.

Here is a sample route for this /docs.

The content of such approach could be for method POST:

[
    { "doc_number": 1, "binder": 4, (other fields in the case of creation) },
    { "doc_number": 2, "binder": 4, (other fields in the case of creation) },
    { "doc_number": 3, "binder": 5, (other fields in the case of creation) },
    (...)
]
  • Using sub resource path of binder element

In addition you could also consider to leverage sub routes to describe the link between docs and binders. The hints regarding the association between a doc and a binder doesn't have now to be specified within the request content.

Here is a sample route for this /binder/{binderId}/docs. In this case, sending a list of docs with a method POST or PATCH will attach docs to the binder with identifier binderId after having created the doc if it doesn't exist.

The content of such approach could be for method POST:

[
    { "doc_number": 1, (other fields in the case of creation) },
    { "doc_number": 2, (other fields in the case of creation) },
    { "doc_number": 3, (other fields in the case of creation) },
    (...)
]

Regarding the response, it's up to you to define the level of response and the errors to return. I see two levels: the status level (global level) and the payload level (thinner level). It's also up to you to define if all the inserts / updates corresponding to your request must be atomic or not.

  • Atomic

In this case, you can leverage the HTTP status. If everything goes well, you get a status 200. If not, another status like 400 if the provided data aren't correct (for example binder id not valid) or something else.

  • Non atomic

In this case, a status 200 will be returned and it's up to the response representation to describe what was done and where errors eventually occur. ElasticSearch has an endpoint in its REST API for bulk update. This could give you some ideas at this level: http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/bulk.html.

  • Asynchronous

You can also implement an asynchronous processing to handle the provided data. In this case, the HTTP status returns will be 202. The client needs to pull an additional resource to see what happens.

Before finishing, I also would want to notice that the OData specification addresses the issue regarding relations between entities with the feature named navigation links. Perhaps could you have a look at this ;-)

The following link can also help you: https://templth.wordpress.com/2014/12/15/designing-a-web-api/.

Hope it helps you, Thierry