jungyh0218 - 1 month ago 10

MySQL Question

I am making a simple public cashbook DB table to calculate how much my friends and I paid for having meal together.

My first table was only a single table and it had a column "person" with non-atomic values, so I divided my table into two tables like above.

But I'm not sure this is normalized enough. Is there any functional dependency which should be normalized but I don't know its existence?

(I'm going to use MySql but you can answer me with any kind of DBMS.)

Answer

Normalization requires knowing functional dependencies (FDs) and join dependencies (FDs). You didn't give them.

**Your application** We can only tell you the FDs and JDs if we know exactly what is "the relationship that I wanted to express" (ie predicate) for each table (ie to the point where we could look at a situation and know for every possible row whether it makes a true proposition from the predicate and so belongs in the table) and exactly what possible situations can arise (via "business rules" about possible application situations, equivalent to constraints on possible database states).

**Your "keys"** You didn't give FDs. You just gave one candidate key (CK) and a "unique key". But you can't determine some or all the CKs without knowing certain things about the FDs. So when you give CKs that's the same as saying there are and are not certain FDs. You need to tell us what you have given us when you give some column sets labelled PK or "unique key": Is a "unique key" a CK (containing no smaller unique subset) or just a superkey (unique)? Did you give all the CKs or could there be others? Could there be other superkeys than the supersets of the given ones? It is most helpful if you just tell us exactly which FDs you know hold (via a minimal cover) and which you know don't hold.

**Guesses**

I have no idea what `is_calculated`

is for.

Maybe a given `date`

and `meal`

pair has exactly one `content`

?

**TL;DR** You really need to check each possible set of columns to see whether it functionally determines each other column. Ie whether for each database state a subrow of values for the set of columns only appears with one value for the column. We can only guess without a clear understanding of your predicates and your application. You can reduce work by the following: If a set of columns is unique then its supersets determine all other columns. If a set of columns is minimally unique (is a CK) then none of its smaller subsets determine all other columns. You can look for counterexamples to putative FDs, where two rows can have the same subrow of values for a putative determinant but have different values for a putative determined attribute. Armstrong's axioms generate all the FDs implied by given ones.

**JDs** Normalization to 4NF and 5NF involves splitting table into multiple tables to eliminate JDs that aren't implied by CKs. A table's predicate can be expressed as the AND of others exactly when a JD holds corresponding to the predicates' column sets. A relation is in 5NF when every conjunct in every JD overlaps some other one on at least a CK. (Fagin's membership algorithm.)

**PS** If you care about constraints, did you lose one when you went from one table to two?