bluewater_sailor bluewater_sailor - 2 months ago 11
MySQL Question

Normalizing data from a pair of CSV tables

I'm trying to normalize some data, and can't seem to come up with a solution. What I have is a table like this:

weight position1 position2 position3
1 10 20 30
2 25 35 45
3 17 05 22


and one like this:

location position
6 1
7 1
8 2
9 2
10 2
11 3
12 3


How do I normalize the above so that given a location and a weight, I can find the value for a given position?

I can use Perl, Python, Excel, MySQL or pretty much any tool on the block to do the actual reshuffling of the data; where I'm having a problem is in coming up with a reasonable schema.




The desired outcome here is something like

if location == 11 -> position is 3


therefore,

if weight == 2 -> the value is 45

Answer

The only thing to do is "unpivot" your first table to this:

weight  position     value
   1        1          10
   1        2          20
   1        3          30
   2        1          25
   2        2          35
   2        3          45
   3        1          17
   3        2          05
   3        3          22

The first two columns should contain unique pairs of values. If you have other information that only depends on weight, you would need another table for that. Same for positions.

Converting to the new model

If you already have the tables, then you can create the first table (t1) with this statement:

create table t1_new
select weight, 1 as position, position1 as value
from   t1
union all
select weight, 2 as position, position2 as value
from   t1
union all
select weight, 3 as position, position3 as value
from   t1

Then, after verification of the result, drop t1, and rename t1_new to t1.

Querying from the new model

To query from these tables the value for a given location and weight, you should use a join:

select     value
from       t1
inner join t2 on t2.weight = t1.weight
where      t2.location = 11
and        t1.position = 3