user1251385 - 9 months ago
Python Question

How to detect if image is present on screen?

Here is the image I need to detect:

Here is the base64 representation:

I'm asking for your help because this is a complex problem and I am not equipped to solve it. It would probably take me a week to do it by myself.

Some pseudo-code that I thought about:

1) Take screenshot of the app and store it as image object.

2) Convert the base64 representation of my image to an image object.

3) Use some sort of algorithm/function to compare both image objects.

By on screen, I mean in an app. I have the app's window name and the PID.

To be 100% clear, I need to essentially detect if image1 is inside image2. image1 is the image I gave in the OP. image2 is a screenshot of a window.

Answer Source

If you break this down into pieces, they're all pretty simple.

First, you need a screenshot of the app's window as a 2D array of pixels. There are a variety of different ways to do this in a platform-specific way, but you didn't mention what platform you're on, so… let's just grab the whole screen, using PIL:

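from PIL import ImageGrab

screenshot = ImageGrab.grab()  # grabs the whole screen
haystack = screenshot.load()   # pixel access object, indexed as haystack[x, y]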

Now, you need to convert your base64 into an image. Taking a quick look at it, it's clearly just an encoded PNG file. So:

import base64
import io
from PIL import Image

decoded = base64.b64decode(data)  # data is the base64 string from the question
f = io.BytesIO(decoded)
image = Image.open(f)
needle = image.load()

Now you've got a 2D array of pixels, and you want to see if it exists in another 2D array. There are faster ways to do this (using numpy is probably best), but there's also a dumb brute-force way, which is a lot simpler to understand: just iterate the rows of haystack; for each one, iterate the columns, and see if you find a run of pixels that matches the first row of needle. If so, keep going through the rest of the rows until you either finish all of needle, in which case you return True, or find a mismatch, in which case you continue scanning from the next position. If you run out of positions without a full match, return False.
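
The brute-force search described above could be sketched roughly like this. It is only a minimal sketch: the function name contains_image and the synthetic screen and patch images are made up for illustration, and it assumes an exact pixel-for-pixel match after converting both images to RGB. Scaling, anti-aliasing, or transparency in the real app would make an exact comparison fail.

```python
from PIL import Image


def contains_image(haystack_img, needle_img):
    """Return True if needle_img appears pixel-for-pixel inside haystack_img.

    Hypothetical helper for illustration; exact matching only.
    """
    # Convert both images to the same mode so pixel tuples are comparable.
    haystack_img = haystack_img.convert("RGB")
    needle_img = needle_img.convert("RGB")
    hay_w, hay_h = haystack_img.size
    ndl_w, ndl_h = needle_img.size
    haystack = haystack_img.load()  # pixel access, indexed as [x, y]
    needle = needle_img.load()

    # Try every top-left position where the needle could still fit.
    for top in range(hay_h - ndl_h + 1):
        for left in range(hay_w - ndl_w + 1):
            # all() stops at the first mismatching pixel for this position.
            if all(
                haystack[left + x, top + y] == needle[x, y]
                for y in range(ndl_h)
                for x in range(ndl_w)
            ):
                return True
    return False


# Synthetic check: paste a small patch into a blank stand-in "screenshot".
screen = Image.new("RGB", (40, 30), "white")
patch = Image.new("RGB", (4, 3), "red")
patch.putpixel((1, 1), (0, 0, 255))
screen.paste(patch, (10, 7))

found = contains_image(screen, patch)
missing = contains_image(Image.new("RGB", (40, 30), "white"), patch)
```

With a real screenshot you would pass the grabbed screenshot and the decoded image instead of the synthetic ones. Pure-Python loops like this get slow on full-screen images, which is why a numpy-based comparison is probably the better choice once the simple version works.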