Byyo - 12 days ago 11
C# Question

# Algorithm to compare two images in C#

I'm writing a tool in C# to find duplicate images. Currently I create an MD5 checksum of each file and compare those.

Unfortunately, my images can

• be rotated by 90 degrees

• have different dimensions (a smaller image with the same content)

• have different compression or file types (e.g. JPEG artifacts)

What would be the best approach to solve this problem?

Here is a simple approach using a 256-bit image hash (for comparison, MD5 produces 128 bits):

1. Resize the picture to 16x16 pixels.

2. Reduce the colors to black and white, so that each pixel becomes `true` or `false`.

3. Read the boolean values into a `BitArray`, `bool[]`, `List<bool>` or similar. This is the hash.

Code:

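``````
using System.Collections.Generic;
using System.Drawing;

public static List<bool> GetHash(Bitmap bmpSource)
{
    List<bool> lResult = new List<bool>();
    // create a new 16x16 pixel image from the source
    Bitmap bmpMin = new Bitmap(bmpSource, new Size(16, 16));
    for (int j = 0; j < bmpMin.Height; j++)
    {
        for (int i = 0; i < bmpMin.Width; i++)
        {
            // reduce colors to true / false: dark pixels become true
            // (the 0.5 brightness threshold is an assumed cut-off)
            lResult.Add(bmpMin.GetPixel(i, j).GetBrightness() < 0.5f);
        }
    }
    return lResult;
}
``````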

I know `GetPixel` is not that fast, but on a 16x16 pixel image it should not be the bottleneck.
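If the hashes need to be stored or used as lookup keys, the 256 booleans can be packed into 32 bytes. The following is only a sketch of one way to do that with `BitArray`; the class name `HashPacking` and the methods `ToBytes` and `ToHex` are hypothetical names chosen for illustration and are not part of the code above.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class HashPacking
{
    // Packs the booleans into bytes, 8 flags per byte.
    // BitArray stores index 0 in the least significant bit of byte 0.
    public static byte[] ToBytes(List<bool> hash)
    {
        var bits = new BitArray(hash.ToArray());
        var bytes = new byte[(hash.Count + 7) / 8];
        bits.CopyTo(bytes, 0);
        return bytes;
    }

    // Renders the packed hash as a hex string (64 characters for 256 bits),
    // which can serve as a dictionary key for exact matches.
    public static string ToHex(List<bool> hash)
    {
        return BitConverter.ToString(ToBytes(hash)).Replace("-", "");
    }
}
```

Note that a packed key only helps with exact hash matches. Comparing with a tolerance still requires looking at the individual bits.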

4. Compare this hash to the hash values of other images, allowing a tolerance (the number of pixels that may differ from the other hash).

Code:

``````
// requires: using System.Drawing; using System.Linq;
List<bool> iHash1 = GetHash(new Bitmap(@"C:\mykoala1.jpg"));
List<bool> iHash2 = GetHash(new Bitmap(@"C:\mykoala2.jpg"));

// determine the number of equal pixels (x of 256)
int equalElements = iHash1.Zip(iHash2, (i, j) => i == j).Count(eq => eq);
``````
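To turn the count of equal pixels into a yes/no decision, the tolerance can be expressed as a minimum similarity ratio. This is a minimal sketch, assuming both hashes have the same length; the names `HashComparer`, `Similarity` and `IsProbablyDuplicate` and the default cut-off of 0.95 are illustrative assumptions, and the threshold should be tuned on your own images.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HashComparer
{
    // Fraction of positions (0.0 to 1.0) at which both hashes hold the same value.
    public static double Similarity(IReadOnlyList<bool> hash1, IReadOnlyList<bool> hash2)
    {
        if (hash1.Count != hash2.Count || hash1.Count == 0)
            throw new ArgumentException("Hashes must be non-empty and of equal length.");

        int equal = hash1.Zip(hash2, (a, b) => a == b).Count(eq => eq);
        return (double)equal / hash1.Count;
    }

    // minSimilarity is an assumed cut-off, not a value taken from the answer.
    public static bool IsProbablyDuplicate(
        IReadOnlyList<bool> hash1, IReadOnlyList<bool> hash2, double minSimilarity = 0.95)
    {
        return Similarity(hash1, hash2) >= minSimilarity;
    }
}
```

With 256-bit hashes, a cut-off of 0.95 allows up to 12 differing pixels.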

This code can therefore match images that differ in:

• file format
• rotation (90, 180 or 270 degrees), by changing the iteration order of i and j
• size (the same aspect ratio is required)
• compression (a tolerance is required to absorb quality loss such as JPEG artifacts). For example, you can treat 99% equality as the same image and 50% as a different one.
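Rotation can also be handled on the hash itself rather than by re-reading the bitmap in a different iteration order. The sketch below is a variation on the approach described above, not the original code: it assumes the hash is stored row by row for a square grid, and the names `HashRotation`, `Rotate90` and `BestMatch` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HashRotation
{
    // Rotates a row-major size x size hash by 90 degrees clockwise.
    // The new cell (row, col) takes its value from the old cell (size - 1 - col, row).
    public static List<bool> Rotate90(IReadOnlyList<bool> hash, int size = 16)
    {
        if (hash.Count != size * size)
            throw new ArgumentException("Hash length must equal size * size.");

        var rotated = new List<bool>(hash.Count);
        for (int row = 0; row < size; row++)
            for (int col = 0; col < size; col++)
                rotated.Add(hash[(size - 1 - col) * size + row]);
        return rotated;
    }

    // Returns the highest number of equal pixels over the four
    // rotations (0, 90, 180, 270 degrees) of hash2.
    public static int BestMatch(IReadOnlyList<bool> hash1, IReadOnlyList<bool> hash2, int size = 16)
    {
        int best = 0;
        IReadOnlyList<bool> candidate = hash2;
        for (int turn = 0; turn < 4; turn++)
        {
            int equal = hash1.Zip(candidate, (a, b) => a == b).Count(eq => eq);
            best = Math.Max(best, equal);
            candidate = Rotate90(candidate, size);
        }
        return best;
    }
}
```

The result of `BestMatch` can then be compared against the same tolerance as an unrotated comparison.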
Source (Stackoverflow)