user2328149 - 2 years ago 66
Bash Question

# Comparing csv files using field values

I have some csv files with the following format in the same folder:

``````Name - Value - Number - Key
``````

I want to compare these files pairwise and give them a score in this way. If all the names in the Name column of the first file aren't in the corresponding column of the second file the score will be 0.
Instead, if they are, the score will be given as shown in this example:

File1.csv

``````NameA, ValueA, NumberA, KeyA

Jack,  10,     9,       12
Alex,  30,     2,       16
Mark,  15,     3,       18
``````

File2.csv

``````NameB, ValueB, NumberB, KeyB

Jack,  13,      4,       11
Alex,  22,      5,       18
Bill,  67,      6,       20
Mark   18,      8,       26
``````

Score = abs(11 - 10)/9 + abs(18 - 30)/2 + abs(26 - 15)/3

So it will be given by the summation of the abs(KeyB - ValueA)/NumberA scores, where abs is the absolute value of the subtraction.

How can I do this?

First of all, based on your formula, `abs(KeyB - ValueA)/NumberA`, you should have

``````    Score = abs(11 - 10)/9 + abs(18 - 30)/2 + abs(26 - 15)/3 = 9.7778
``````

``````    Score = abs(11 - 10)/9 + abs(18 - 30)/2 + abs(26 - 15)/18
``````     awk -F, 'function abs(x){return ((x < 0.0) ? -x : x)}