pali pali - 2 months ago
Linux Question

Remove rows that have duplicate values in two columns

I have a file with four comma-separated columns:

3022751,6656,7656,T029957
3022751,6054,7054,T029957
3022751,10400,10400,T029958
3022751,10400,10400,T029958


I want to remove the rows that have duplicates in columns 2 and 3. My expected output is:

3022751,6656,7656,T029957
3022751,6054,7054,T029957


My awk script below runs fine, but it does not delete the duplicated row:

awk '!x[$2,$3]++' FS=","


The current output is:

3022751,6656,7656,T029957
3022751,6054,7054,T029957
3022751,10400,10400,T029958


Thanks.

Answer
awk -F, '$2!=$3' file
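
The one-liner above prints only the rows where column 2 differs from column 3, which matches the sample data. The script in the question behaves differently: !x[$2,$3]++ is true the first time a column 2/column 3 pair is seen, so it keeps the first copy of each pair and drops only the later ones. If "duplicate" is meant that way (the same pair occurring on more than one row, with every copy removed), a two-pass approach may be closer to what is wanted. The following is a minimal sketch under that assumption; the file name sample.csv and the names seen, unique_pairs and differing are made up for illustration.

```shell
# Recreate the sample data from the question (sample.csv is a hypothetical name).
cat > sample.csv <<'EOF'
3022751,6656,7656,T029957
3022751,6054,7054,T029957
3022751,10400,10400,T029958
3022751,10400,10400,T029958
EOF

# Two-pass variant: the file is given twice on the command line.
# Pass 1 (NR==FNR): count how often each column 2/column 3 pair occurs.
# Pass 2: print only the rows whose pair occurred exactly once.
unique_pairs=$(awk -F, 'NR==FNR { seen[$2,$3]++; next } seen[$2,$3] == 1' sample.csv sample.csv)

# The one-liner from the answer: print rows where column 2 differs from column 3.
differing=$(awk -F, '$2 != $3' sample.csv)

printf '%s\n' "$unique_pairs"
```

On this sample both commands print the same two rows. They diverge on other data: a single row such as 1,5,5,X is dropped by the $2!=$3 test but kept by the two-pass variant, while two rows sharing the pair 6656,7656 are kept by the $2!=$3 test but dropped by the two-pass variant.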

Read the book Effective Awk Programming, 4th Edition, by Arnold Robbins.