user3494949 - 4 months ago
Linux Question

How to delete duplicated rows based on a column value?

Given the following table

123456.451 entered-auto_attendant
123456.451 duration:76 real:76
139651.526 entered-auto_attendant
139651.526 duration:62 real:62
139382.537 entered-auto_attendant


Using a bash shell script on Linux, I'd like to delete all duplicate rows based on the value of column 1 (the one with the long number), keeping in mind that this number varies from row to row.

I've tried with

awk '{a[$3]++}!(a[$3]-1)' file


sort -u | uniq
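(As an aside on the second attempt: `sort -u` deduplicates whole lines, so the differing `duration:` lines all survive; also, `uniq` after `sort -u` is redundant. Restricting the uniqueness test to the first field would work, though it sorts the output rather than keeping input order. A sketch, assuming GNU sort and an input file named `file`:)

```shell
# Sample data from the question, saved as 'file'.
cat > file <<'EOF'
123456.451 entered-auto_attendant
123456.451 duration:76 real:76
139651.526 entered-auto_attendant
139651.526 duration:62 real:62
139382.537 entered-auto_attendant
EOF

# -k1,1 limits the comparison to column 1; -u then keeps one line per key.
# Note: the output comes back sorted by column 1, not in input order.
sort -u -k1,1 file
```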


But I am not getting the expected result, which would be something like this: compare all the values in the first column, delete the duplicates, and print what remains:

123456.451 entered-auto_attendant
139651.526 entered-auto_attendant
139382.537 entered-auto_attendant

Answer

Does this work for you?

 awk '!a[$1]++' file

With your data, the output is:

123456.451 entered-auto_attendant
139651.526 entered-auto_attendant
139382.537 entered-auto_attendant
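For what it's worth, the idiom works because `a[$1]++` evaluates to the count *before* incrementing: it is 0 (false) the first time a key is seen, so `!a[$1]++` is true exactly once per key and the default action prints the line. A self-contained check using the question's data (the filename `file` is assumed, as in the answer):

```shell
# Recreate the sample input.
cat > file <<'EOF'
123456.451 entered-auto_attendant
123456.451 duration:76 real:76
139651.526 entered-auto_attendant
139651.526 duration:62 real:62
139382.537 entered-auto_attendant
EOF

# Keep only the first line seen for each column-1 key,
# preserving the original input order.
awk '!a[$1]++' file
```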

And this one prints only the lines whose column 1 value appears exactly once:

 awk '{a[$1]++;b[$1]=$0}END{for(x in a)if(a[x]==1)print b[x]}' file

output:

139382.537 entered-auto_attendant
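One caveat, in case order matters: `for (x in a)` walks awk arrays in an unspecified order, so with several once-only keys the lines may not come out in input order. A sketch of an order-preserving variant (the `line` and `key` arrays are my own naming, not from the answer):

```shell
# Recreate the sample input.
cat > file <<'EOF'
123456.451 entered-auto_attendant
123456.451 duration:76 real:76
139651.526 entered-auto_attendant
139651.526 duration:62 real:62
139382.537 entered-auto_attendant
EOF

# Count occurrences of each key while remembering every record and its key,
# then walk the records in input order (1..NR) and print the once-only ones.
awk '{a[$1]++; line[NR]=$0; key[NR]=$1}
     END{for (i=1; i<=NR; i++) if (a[key[i]]==1) print line[i]}' file
```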