010110110101 010110110101 - 1 year ago 44
SQL Question

Handling null or missing attributes in JSON using PostgreSQL

I'm learning how to handle JSON in PostgreSQL.

I have a table with some columns. One column is a JSON field. The data in that column has at least these three variations:

Case 1: {"createDate": 1448067864151, "name": "world"}
Case 2: {"createDate": "", "name": "hello"}
Case 3: {"name": "sky"}

Later on, I want to select the createDate.

TO_TIMESTAMP((attributes->>'createDate')::bigint * 0.001)

That works fine for Case 1 when the data is present and it is convertible to a bigint. But what about when it isn't? How do I handle this?

I read this article. It explains that we can add check constraints to perform some rudimentary validation. Alternatively, I could do a schema validation before the data is inserts (on the client side). There are pros and cons with both ideas.

Using a Check Constraint

CONSTRAINT validate_createDate CHECK ((attributes->>'createDate')::bigint >= 1)

This forces a non-nullable field (Case 3 fails). But I want the attribute to be optional. Furthermore, if the attribute doesn't convert to a bigint because it is blank (Case 2), this errors out.

Using JSON schema validation on the client side before insert

This works, in part, because the schema validation makes sure that what data comes in conforms to the schema. In my case, I can control which clients access this table, so this is OK. But it doesn't matter for the SQL later on since my validator will let pass all three cases.


Basically, you need to check if createDate attribute is empty:

WITH data(attributes) AS ( VALUES
  ('{"createDate": 1448067864151, "name": "world"}'::JSON),
  ('{"createDate": "", "name": "hello"}'::JSON),
  ('{"name": "sky"}'::JSON)
SELECT to_timestamp((attributes->>'createDate')::bigint * 0.001) FROM data
  (attributes->>'createDate') IS NOT NULL
  (attributes->>'createDate') != '';


 2015-11-20 17:04:24.151-08
(1 row)