Troels Troels - 3 years ago 153
JSON Question

When to use Json over key/value tables in postgres for billions of rows

I am doing a project, where I need to store billions of rows of unstructured history_data in a sql database (postgres) 2-3 years. The data/columns may change from day to day.

So example, day one the user might save {“user_id”:”2223”, “website”:””, “webpage”:”mysubpageName”}.

And the following day {“name”:”username”, “user_id”: “2223”, “bookclub_id”:”1” }.

I have been doing a project earlier, where we used the classic entity key/value table model for this problem. We saved maybe up to 30 key/values pr entity. But when exceeding 70-100 mill rows, the queries began to run slower and slower (too many inner joins).

Therefore i am wondering if I should change using the Json model in postgres. After searching the web, and reading blogs, I am really confused. What are the pro and con changing this to json in postgres?

Answer Source

You can think about this in terms of query complexity. If you have an index to the json documents (maybe user_id) you can do a simple index-scan to access the whole json string very fast.

You have to dissect it on the client side then, or you can pass it to functions in postgres, if e.g. you want to extract only data for specific values.

One of the most important features of postgres when dealing with json is having functional indexes. In comparison to "normal" index which index the value of a column, function indexes apply a function to a value of one (or even more) column values and index the return value. I don't know the function that extracts the value of a json string, but consider you want the user that have bookclub_id = 1. You can create an index like

create index idx_bookblub_id on mytable using getJsonValue("bookclub_id",mytable.jsonvalue)

Afterwards queries like

select * from mytable where getJsonValue("bookclub_id",mytable.jsonvalue) = 1

are lightning fast.

Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download