Michael Michael - 14 days ago 6
SQL Question

Database Table Design Issues

I am new to DB Design and I've recently inherited the responsibility of adding some new attributes to an existing design.

Below is a sample of the current table in question:

Submission Table:

ID (int)
Subject (text)
Processed (bit)
SubmissionDate (datetime)
Submitted (bit)
...


The new requirements are:


  1. A Submission can be marked as valid or invalid

  2. A Reason must be provided when a Submission is marked as invalid. (So a submission may have an InvalidReason)

  3. Submissions can be associated with one another such that: Multiple valid Submissions can be set as "replacements" for an invalid Submission.



So I've currently taken the easy solution and simply added new attributes directly to the Submission Table so that it looks like this:

NEW Submission Table:

ID (int)
Subject (text)
Processed (bit)
SubmissionDate (datetime)
Submitted (bit)
...
IsValid (bit)
InvalidReason (text)
ReplacedSubmissionID (int)


Everything works fine this way, but it just seems a little strange:


  1. Having InvalidReason as a column that will be NULL for majority of submissions.

  2. Having ReplacedSubmissionID as a column that will be NULL for majority of submissions.

  3. If I understand normalization right, InvalidReason might be transitively dependent on the IsValid bit.



It just seems like somehow some of these attributes should be extracted to a separate table, but I don't see how to create that design with these requirements.

Is this single table design okay? Anyone have better alternative ideas?

Answer

Whether or not you should have a single table design really depends on 1) How you will be querying the data 2) How much data would end up being potentially NULL in the resulting table.

In your case its probably ok, but again it depends on #1. If you will be querying separately to get information on invalid submissions, you may want to create a separate table that references the Id of invalid submissions and the reason:

New table: InvalidSubmissionInfo
Id (int) (of invalid submissions; will have FK contraint on Submission table)
InvalidReason (string)

Additionally if you will be querying for replaced submissions separately you may want to have a table just for those:

New table: ReplacementSubmissions
Id (int) (of the replacement submissions; will have FK contraint on Submission table)
ReplacedSubmissionId (int) (of what got replaced; will have FK constraint on submission table)

To get the rest of the info you will still have to join with the Submissions table.

All this to say you do not need separate this out into multiple tables. Having a NULL value only takes up 1 bit of memory which isn't bad. And if you need to query and return an entire Submission record each time, it makes more sense to condense this info into one table.

Comments