Murlidhar Fichadia - 1 year ago 124
Bash Question

# Min-Max Normalization using AWK

I don't know why I am unable to loop through all the records. Currently the program only processes the last record and prints the normalization for it.

# Normalization formula:

New_Value = (value - min[i]) / (max[i] - min[i])
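As a quick sanity check of the formula, take the first column of the dataset below: its minimum is 1 and its maximum is 5, so the value 4 should map to (4 - 1) / (5 - 1) = 0.75. A minimal sketch in awk, where the helper name `minmax` and the variable names `v`, `lo`, and `hi` are only illustrative:

```shell
# Hypothetical helper: normalize one value given a known min and max.
minmax() {
    awk -v v="$1" -v lo="$2" -v hi="$3" \
        'BEGIN { printf "%.2f\n", (v - lo) / (hi - lo) }'
}

minmax 4 1 5    # value 4, min 1, max 5
```

This assumes max and min differ; with a constant column the denominator would be zero.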

# Program

```awk
{
    for (i = 1; i <= NF; i++)
    {
        if (min[i] == "") { min[i] = $i; }    # initialise min
        if (max[i] == "") { max[i] = $i; }    # initialise max
        if ($i < min[i])  { min[i] = $i; }    # new min
        if ($i > max[i])  { max[i] = $i; }    # new max
    }
}
END {
    for (j = 1; j <= NF; j++)
    {
        normalized_value[j] = ($j - min[j]) / (max[j] - min[j]);
        print $j, normalized_value[j];
    }
}
```

# Dataset

```
4 14 24 34
3 13 23 33
1 11 21 31
2 12 22 32
5 15 25 35
```

# Current Output

```
5 1
15 1
25 1
35 1
```

# Required Output

```
0.75 0.75 0.75 0.75
0.50 0.50 0.50 0.50
0.00 0.00 0.00 0.00
0.25 0.25 0.25 0.25
1.00 1.00 1.00 1.00
```

I would process the file twice, once to determine the minima/maxima, once to calculate the normalized values:

```shell
awk '
NR==1 {
    # first line of the first pass: initialise min and max per column
    for (i=1; i<=NF; i++) {
        min[i]=$i
        max[i]=$i
    }
    next
}
NR==FNR {
    # rest of the first pass: update min and max
    for (i=1; i<=NF; i++) {
        if      ($i < min[i]) {min[i]=$i}
        else if ($i > max[i]) {max[i]=$i}
    }
    next
}
{
    # second pass: print the normalized values
    for (i=1; i<=NF; i++) printf "%.2f%s", ($i-min[i])/(max[i]-min[i]), FS
    print ""
}
' file file
# ^^^^ ^^^^  same file twice!
```

outputs

```
0.75 0.75 0.75 0.75
0.50 0.50 0.50 0.50
0.00 0.00 0.00 0.00
0.25 0.25 0.25 0.25
1.00 1.00 1.00 1.00
```
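
As for why the original program prints only one record: the `END` block runs once, after all input has been read, and in most awk implementations `$j` there still holds the fields of the last record, so only that line gets normalized. If reading the file twice is not an option (for example when the data comes from a pipe) and the data fits in memory, a single pass that remembers every value should also work. This is only a sketch under those assumptions; the names `normalize`, `val`, and `cols` are my own, and printing 0 for a column whose values are all equal is an arbitrary choice to avoid dividing by zero:

```shell
# Hypothetical helper: min-max normalize each column of whitespace-separated
# numeric input, read from the given file(s) or from standard input.
normalize() {
    awk '
    {
        for (i = 1; i <= NF; i++) {
            v = $i + 0                      # force a numeric value
            val[NR, i] = v                  # remember it by row and column
            if (NR == 1 || v < min[i]) min[i] = v
            if (NR == 1 || v > max[i]) max[i] = v
        }
        if (NF > cols) cols = NF            # widest row seen so far
    }
    END {
        for (r = 1; r <= NR; r++) {
            for (i = 1; i <= cols; i++) {
                range = max[i] - min[i]
                # Constant column: print 0 instead of dividing by zero.
                norm = (range == 0) ? 0 : (val[r, i] - min[i]) / range
                printf "%.2f%s", norm, (i < cols ? " " : "\n")
            }
        }
    }' "$@"
}

# Usage with the dataset from the question:
normalize <<'EOF'
4 14 24 34
3 13 23 33
1 11 21 31
2 12 22 32
5 15 25 35
EOF
```

For the sample dataset this prints the same five lines as the required output. The trade-off against the two-pass version is memory: every value is kept in the `val` array until the end of input.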