ams ams - 3 months ago 12
SQL Question

Any disadvantages to bit flags in database columns?

Consider the following tables:

CREATE TABLE user_roles(
pkey SERIAL PRIMARY KEY,
bit_id BIGINT NOT NULL,
name VARCHAR(256) NOT NULL,
);

INSERT INTO user_roles (bit_id,name) VALUES (1,'public');
INSERT INTO user_roles (bit_id,name) VALUES (2,'restricted');
INSERT INTO user_roles (bit_id,name) VALUES (4,'confidential');
INSERT INTO user_roles (bit_id,name) VALUES (8,'secret');

CREATE TABLE news(
pkey SERIAL PRIMARY KEY,
title VARCHAR(256),
company_fk INTEGER REFERENCES compaines(pkey), -- updated since asking the question
body VARCHAR(512),
read_roles BIGINT -- bit flag
);


read_roles is a bit flags that specifies some combination of roles that can read news items. So if I am inserting a news item that can be read by restricted and confidential I would set read_roles to have a value of
2 | 4
or 6 and when I want to get back the news posts that a particular user can see I can use a query like.

select * from news WHERE company_fk=2 AND (read_roles | 2 != 0) OR (read_roles | 4 != 0) ;
select * from news WHERE company_fk=2 AND read_roles = 6;


What are disadvantages of using bit flags in database columns in general? I am assuming the answer to this question might be database specific so I am interested in learning about disadvantages with specific databases.

I am using Postgres 9.1 for my application.

UPDATE I got the bit about the database not being to use an index for bit operations which would require a full table scan which would suck for performance. So I have updated the question to reflect my situation more closely, each row in the database belongs to a specific company so all the queries will have WHERE clause that include a company_fk which will have an index on it.

UPDATE I only have 6 roles right now, possible more in the future.

UPDATE roles are not mutually exclusive and they inherit from each other, for example, restricted inherits all the permissions assigned to public.

Answer

If you only have a handful of roles, you don't even save any storage space in PostgreSQL (which you use). An integer column uses 4 bytes, a bigint 8 bytes. Both need to be aligned according to be MAXALIGN. A boolean column uses one byte. Effectively, you can fit four or more boolean columns for one integer column. Eight or more for a bigint.

You must also take into account that NULL values only take up one bit (simplified) in the NULL bitmap.

Individual columns are much easier to read and index. Others have commented on that already.

You could still utilize indexes on expressions or partial indexes to circumvent the problem with indexes ("non-sargable") to some extent. Generalized statements like

database cannot use indexes on a query like this

or

These conditions are non-SARGable!

are not entirely true - maybe for some others RDBMS lacking these features.
But why circumvent when you can avoid the problem altogether?


If these flags were mutually exclusive, you could use one column of type enum or a small look-up table and a foreign key referencing it. (Ruled out in question update.)


As you have clarified, we are talking about 6 distinct types (maybe more). Go with individual boolean columns. You'll probably even save space compared to one bigint. Space requirement seems immaterial in this case.

Comments