BMannell BMannell - 3 months ago 15
MySQL Question

Add a boolean field vs check if field IS NOT NULL?

We use tables that require admin approval of records before the records become publicly visible. I am wondering what the most appropriate way would be to design such a table where the main query is to retrieve records that have been approved (or not yet approved).

Assuming the queried column will be indexed:


  1. Is there any speed benefit to using a boolean field?

  2. Is checking whether a column is NULL go against best practices?



For example:

id | title | text | approved_dttm
---------------------------------------
1 | ... | ... | null
2 | ... | ... | 2017-01-01 00:00:00 ETC


SELECT * FROM table where approved_dttm IS NOT NULL;


vs

id | title | text | approved | approved_dttm
---------------------------------------
1 | ... | ... | 0 | null
2 | ... | ... | 1 | 2017-01-01 00:00:00 ETC


SELECT * FROM table where approved = 1;


Note: We do not need multiple states other than approved/not-approved. No "needs further review" etc.

Answer Source

Q: Is there any speed benefit to using a boolean field?

A: NO.

In this case, it's very unlikely that there would be any query would obtain a "speed benefit" from using the approved column vs approved_dttm IS [NOT] NULL.

While the additional byte for the approved column is negligible (assuming that's defined as TINYINT, that extra byte will not really impact the number of rows that "fit" in a block)... an index on that column would not be negligible. That would require additional blocks (space), and would add overhead for maintenance of the index entries.

We can't exclude some exceptional corner case where the addition of that column would be of benefit, but in general, given the information provided, no, there's no "speed benefit" to adding that column.

(And we're setting aside here a discussion of redundant data and update anomalies... adding the (redundant) approved column flies in the face of third normal form, and the familiar mantra "Every attribute is dependent on the key, the whole key, and nothing but the key. So help me Codd".)

Q: Is checking whether a column is NULL go against best practices?

A: NO.

It doesn't go against any "best practices" I'm aware of. The NULL and tri-valued boolean logic has been around forever (since E.F.Codd first coined "relational" in 1970, the advent of System/R and Oracle in 1977, and DB2 in 1983, ...)

Some set of application developers may not like (or understand) how to deal with the nuances of NULL values. And it's true that having a column defined as NOT NULL may somewhat ease their burden. But in my book, "avoiding dealing with NULLs" is not a "best practice".

We do note that there are some quirks with some database implementations, not making use of an index to satisfy a col IS NULL predicate. But those quirks are usually overcome with appropriately defined indexes and carefully written queries. Understanding NULL values and their quirks and dealing with it is a "best practice".