Alex Raj Kaliamoorthy - 6 months ago
Bash Question

Multiple file comparison in the same directory in unix

I have multiple files in a directory, and I want to know whether there is a utility that compares all of them and outputs the differences. Failing that, can someone help me write a script to do this?
Edit:
I have five files that contain some values. I need to find the values unique to each file and write them to another file.

Sample1.txt
001,20160512
002,20160512
003,20160512

Sample2.txt
001,20160512
004,20160512
006,20160512

Sample3.txt
004,20160512
008,20160512
007,20160512

Sample4.txt
008,20160512
005,20160512
006,20160512


The output should come from comparing two files, say Sample1.txt and Sample2.txt, and listing the values unique to each. For example:

Out1.txt
Unique in Sample1.txt
002,20160512
003,20160512

Out2.txt
Unique in Sample2.txt
004,20160512
006,20160512


The same should be done for Sample2.txt and Sample3.txt, with the values written to another output file, and likewise for Sample3 and Sample4, Sample1 and Sample3, Sample1 and Sample4, and Sample2 and Sample4. Each comparison should produce its own output file with a header.

I do not want to use vimdiff as there may be more than four files.

Answer

My attempt uses a bash array to store the list of files, nested loops to visit every pair of files, and join to extract the lines that are unique to each file in a pair.

#!/bin/bash

# List of files, can be modified as needed, can be any number of files
# The logic will work even if the files have a .txt extension, but 
# the final output file names will look odd

filelist=(file1 file2 file3 file4) 

# 'for' loop logic added to get the unique entries in each of the following combinations and in each of the files

# file1 file2
# file1 file3
# file1 file4
# file2 file3
# file2 file4
# file3 file4

# Outer loop: first file of the pair
for (( i=0; i<${#filelist[@]}; i+=1 )); do
    # Inner loop: second file of the pair, always later in the list than the first
    for (( j=i+1; j<${#filelist[@]}; j+=1 )); do

        # One output file per pair; expansions are quoted so names with spaces still work
        out="unique${filelist[i]}${filelist[j]}.txt"

        echo "Unique between ${filelist[i]} ${filelist[j]}" > "$out"

        echo "Unique in ${filelist[i]}" >> "$out"

        # Lines of 'file i' with no match in 'file j'.
        # join matches on the first whitespace-separated field, which is the whole line for this data.
        join -v 1 <(sort "${filelist[i]}") <(sort "${filelist[j]}") >> "$out"

        echo "Unique in  ${filelist[j]}" >> "$out"

        # Lines of 'file j' with no match in 'file i'
        join -v 2 <(sort "${filelist[i]}") <(sort "${filelist[j]}") >> "$out"

    done

done
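
To see what the two join calls in the script do in isolation, here is a minimal sketch that runs them on Sample1.txt and Sample2.txt from the question. The scratch directory and the variable names (workdir, only_in_1, only_in_2) are made up for illustration, and the inputs are sorted into temporary files rather than inline with <(sort ...), which should be equivalent.

```shell
#!/bin/bash
# Sketch only: recreate two of the question's sample files in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 001,20160512 002,20160512 003,20160512 > "$workdir/Sample1.txt"
printf '%s\n' 001,20160512 004,20160512 006,20160512 > "$workdir/Sample2.txt"

# join expects sorted input; the script does the same sorting inline.
sort "$workdir/Sample1.txt" > "$workdir/s1.sorted"
sort "$workdir/Sample2.txt" > "$workdir/s2.sorted"

# -v 1 prints the lines of the first input that have no match in the second.
only_in_1=$(join -v 1 "$workdir/s1.sorted" "$workdir/s2.sorted")
# -v 2 prints the lines of the second input that have no match in the first.
only_in_2=$(join -v 2 "$workdir/s1.sorted" "$workdir/s2.sorted")

echo "Unique in Sample1.txt"
echo "$only_in_1"
echo "Unique in Sample2.txt"
echo "$only_in_2"
```

Because join compares only the first whitespace-separated field by default, this behaves as a whole-line comparison only when the lines contain no spaces or tabs, as is the case for the sample data.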

The script produces the following output files:

$ ls unique*
uniquefile1file2.txt  uniquefile1file3.txt  uniquefile1file4.txt  uniquefile2file3.txt  uniquefile2file4.txt  uniquefile3file4.txt

The contents of each file look like this:

$ cat uniquefile1file2.txt
Unique between file1 file2
Unique in file1
002,20160512
003,20160512
Unique in  file2
004,20160512
006,20160512
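
Since the question mentions that there may be more than four files, the same pairwise idea can be sketched without a hard-coded file list. This is only a sketch under assumptions, not part of the original answer: the function name compare_all_pairs, the output naming scheme, and the separate output directory are invented for illustration, and it swaps join for comm, which compares whole lines of sorted input.

```shell
#!/bin/bash
# Hypothetical helper: compare every pair of *.txt files in directory $1 and
# write one report per pair into directory $2 (kept separate so that reports
# are never picked up as inputs on a later run).
compare_all_pairs() {
    indir=$1
    outdir=$2
    mkdir -p "$outdir"
    for a in "$indir"/*.txt; do
        seen=0
        for b in "$indir"/*.txt; do
            # Skip everything up to and including $a, so each pair is visited once.
            if [ "$seen" -eq 0 ]; then
                if [ "$a" = "$b" ]; then seen=1; fi
                continue
            fi
            name_a=$(basename "$a" .txt)
            name_b=$(basename "$b" .txt)
            out="$outdir/unique_${name_a}_${name_b}.txt"
            # comm needs sorted input, so sort both files into scratch copies first.
            sort "$a" > "$outdir/.a.sorted"
            sort "$b" > "$outdir/.b.sorted"
            {
                echo "Unique in $name_a"
                comm -23 "$outdir/.a.sorted" "$outdir/.b.sorted"   # lines only in $a
                echo "Unique in $name_b"
                comm -13 "$outdir/.a.sorted" "$outdir/.b.sorted"   # lines only in $b
            } > "$out"
        done
    done
    rm -f "$outdir/.a.sorted" "$outdir/.b.sorted"
}
```

It could be called as compare_all_pairs . ./out from the directory that holds the Sample files. Writing the reports to a separate directory is a deliberate choice here, so that a second run does not treat earlier reports as input files.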