Simon Suh Simon Suh - 6 months ago 10
SQL Question

Is it bad practice to create a SQL database table without a numerical primary key?

If I created a table such as the following with just three columns:

column1 - column2 - column3
name - company - color

Is this bad practice in database table planning because it doesn't have a Primary Key with a auto-incrementing numerical ID?

Answer

This is flame-war territory which is why I assume the question will eventually be closed. Worse, some considerations depend on your database structure. But the basic issues include access vs natural primary keys and specifics regarding table layout in your RDBMS.

I want to go over both sides of the controversy however, and discuss some issues in each. Personally when I work with PostgreSQL, I prefer to use natural primary keys and usually add numeric secondary keys to make joins easier.

Natural vs Synthetic Primary Keys In the Abstract

In general I find natural primary keys cleaner semantically. The PRIMARY KEY designation thus acts partly as documentation as to what is functionally dependent on what. Within a given table, this can make longer-term maintenance much easier, particularly if you are shooting for well-nromalized tables.

The problem however is that often natural primary keys span columns and as requirements change the tables can change such that the primary keys can change. This makes change management a big problem particularly as tables grow and so very often one wants a layer of abstraction around that problem, which means a second, autoincrement field used for joins.

Access vs Data Sematics

In PostgreSQL there is no effectively no difference between a combination of a not null and a unique constraint and a primary key, so this works well. However this is not the case everywhere. In MySQL with InnoDB, the table is index-oriented around the primary key, so primary key lookups are optimized at the expense of other lookups.

For this reason in a database system like that, you will find there is a serious performance benefit to making the surrogate key primary and finding another way to document the natural primary key.

Note other database systems may allow for index oriented tables around indexes other than the primary key, which would lead to this consideration coming out differently yet.

For single column tables (used to effectively enforce an enum type where the db does not support this natively) however, I see absolutely no value in adding an additional numeric primary key.

Conclusion

Whether or not this is bad practice depends on what specifically you are doing and the database you are using. It isn't necessarily a bad thing but it can be. There are issues with both approaches and they play out different in different scenarios and different database systems. But hopefully the above helps introduce the issues.

Comments