Joel Verhagen - 3 months ago
PHP Question

How do I programmatically check whether an image (PNG, JPEG, or GIF) is corrupted?

Okay. So I have about 250,000 high-resolution images. What I want to do is go through all of them and find the ones that are corrupted. If you know what 4scrape is, then you know the nature of the images I have.

Corrupted, to me, means that when the image is loaded into Firefox, it says:


The image “such and such image” cannot be displayed, because it contains errors.



Now, I could select all of my 250,000 images (~150 GB) and drag and drop them into Firefox. That would be bad though, because I don't think Mozilla designed Firefox to open 250,000 tabs. No, I need a way to programmatically check whether an image is corrupted.

Does anyone know a PHP or Python library which can do something along these lines? Or an existing piece of software for Windows?

I have already removed obviously corrupted images (such as ones that are 0 bytes) but I'm about 99.9% sure that there are more diseased images floating around in my throng of a collection.

Answer

An easy way would be to try loading and verifying the files with PIL (Python Imaging Library).

from PIL import Image

# path is the filename of the image to check
v_image = Image.open(path)
v_image.verify()  # raises an exception if the file looks broken

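Wrap both calls in a try/except block. If either Image.open() or verify() raises an exception, treat the file as corrupted (or as not being an image at all).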
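A minimal sketch of that exception handling, in case it helps. The helper name is_corrupted is hypothetical (it is not part of PIL), and catching every Exception is an assumption made for simplicity, since the exception types raised differ between image formats.

```python
from PIL import Image


def is_corrupted(path):
    """Return True if PIL cannot open or verify the file at `path`.

    `is_corrupted` is a hypothetical helper name, not a PIL function.
    """
    try:
        # Image.open() only reads the header; verify() then checks the
        # file structure without decoding the pixel data.
        with Image.open(path) as img:
            img.verify()
        return False
    except Exception:
        # Assumption: any failure here counts as "corrupted". This also
        # covers files that are not recognized as images at all.
        return True
```

Calling is_corrupted("some_image.png") would then give a plain True/False per file, which is easy to feed into a loop over the whole collection.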

From the documentation:

im.verify()

Attempts to determine if the file is broken, without actually decoding the image data. If this method finds any problems, it raises suitable exceptions. This method only works on a newly opened image; if the image has already been loaded, the result is undefined. Also, if you need to load the image after using this method, you must reopen the image file.
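Since verify() does not decode the image data, a file can pass it and still fail to render. One possible way to be stricter, sketched below, is to reopen the file (as the documentation requires) and call load(), which forces a full decode. The names check_image, find_corrupted, and IMAGE_EXTENSIONS are hypothetical, and the extension list is an assumption based on the formats named in the question.

```python
import os

from PIL import Image

# Assumed set of extensions to scan; adjust to match the collection.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif")


def check_image(path):
    """Return None if the image verifies and decodes, else an error string."""
    try:
        # Pass 1: structural check, no pixel decoding.
        with Image.open(path) as img:
            img.verify()
        # Pass 2: the image must be reopened after verify(); load()
        # then decodes the pixel data in full.
        with Image.open(path) as img:
            img.load()
    except Exception as exc:
        # Assumption: any exception marks the file as suspect.
        return f"{type(exc).__name__}: {exc}"
    return None


def find_corrupted(root):
    """Yield (path, error) for each image under `root` that fails check_image."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(IMAGE_EXTENSIONS):
                path = os.path.join(dirpath, name)
                error = check_image(path)
                if error is not None:
                    yield path, error
```

Looping over find_corrupted(r"D:\images") (a placeholder path) and printing each pair would produce a list of suspect files to review. Decoding every image is much slower than verify() alone, so for 250,000 files it may be worth running the verify-only pass first.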