AishwaryaKulkarni - 2 years ago 55
Bash Question

Finding zeros and replacing them with another number in a matrix file by awk

I have a matrix file in which I want to replace every 0 with 0.1. Depending on how many zeros are replaced in a line, the maximum value in that line should be reduced by 0.1 for each zero replaced.

No line will contain only zeros, since this is a probability matrix where each line adds up to 1. If the highest number occurs more than once in a line (0.5, for example), any one of them can be changed. The first line will always be the only one with letters in it. The matrix below should go from

``````
>ACTTT  ASB  0.098
0   0      1    0
0.75   0   0.25    0
0   0      0    1
0   1      0    0
1   0      0    0
1   0      0    0
0   1      0    0
0   1      0    0
``````

to

``````
>ACTTT  ASB  0.098
0.1   0.1      0.7    0.1
0.55   0.1   0.25    0.1
0.1  0.1      0.1    0.7
0.1   0.7      0.1    0.1
0.7   0.1      0.1    0.1
0.7   0.1      0.1    0.1
0.1   0.7      0.1    0.1
0.1   0.7      0.1    0.1
``````

I tried to use something like this in a loop from previous answers in here:

``````
while read line ; do
    echo $line | awk 'NR>1{print gsub(/(^|[[:space:]])0([[:space:]]|$)/,"&")}'
    echo $line | awk '{max=$2;for(i=3;i<=NF;i++)if($i>max)max=$i}END{print max}'
done < matrix_file
``````

`awk` to the rescue!

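``````
$ awk -v eps=0.01 '
# return the index of the first largest field on the current line
function maxIx() {mI=1;
  for(i=1;i<=NF;i++)
    if($mI<$i)mI=i;
  return mI}
# skip the header line; move eps from the largest field to each zero
NR>1{mX=maxIx();
  for(i=1;i<=NF;i++)
    if($i==0) {$i=eps;$mX-=eps}}1' file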

>ACTTT  ASB  0.098
0.01 0.01 0.97 0.01
0.73 0.01 0.25 0.01
0.01 0.01 0.01 0.97
0.01 0.97 0.01 0.01
0.97 0.01 0.01 0.01
0.97 0.01 0.01 0.01
0.01 0.97 0.01 0.01
0.01 0.97 0.01 0.01
``````

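Here `eps` is defined as a variable; as long as you give it a sensible value it should work fine, but the script doesn't check for the maximum going below zero. Pass `-v eps=0.1` to get the replacement value asked for in the question.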
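If the missing below-zero check matters for your data, one way to add it is sketched below. This is only an illustration, not part of the answer above: the function name `smooth_matrix` and the warning text are made up for the example, and leaving an offending line unchanged is just one possible policy. It counts the zeros on each line first and applies the replacement only if the largest value stays positive afterwards.

```shell
# Hypothetical helper (name and behaviour are assumptions for this sketch).
# Usage: smooth_matrix EPS [FILE...]   (reads stdin when no FILE is given)
smooth_matrix() {
    eps=$1
    shift
    awk -v eps="$eps" '
    NR == 1 || NF == 0 { print; next }      # header and empty lines pass through
    {
        mx = 1; zeros = 0
        for (i = 1; i <= NF; i++) {
            if ($i + 0 > $mx + 0) mx = i    # index of the first largest field
            if ($i + 0 == 0) zeros++        # count the zeros to be replaced
        }
        # Assumed policy: if the deduction would bring the maximum to zero
        # or below, warn on stderr and print the line unchanged.
        if ($mx - zeros * eps <= 0) {
            print "line " NR ": eps too large, left unchanged" | "cat 1>&2"
            print
            next
        }
        for (i = 1; i <= NF; i++)
            if ($i + 0 == 0) $i = eps
        $mx -= zeros * eps                  # deduct everything that was added
        print
    }' "$@"
}
```

Calling it as `smooth_matrix 0.1 file` should behave like the one-liner with `eps=0.1` on well-formed lines. As in the original script, modified lines are rejoined with single spaces.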
