spencerJANG spencerJANG - 2 months ago 26
Python Question

How to classify blurry numbers with openCV

I would like to capture the number from this kind of picture.

enter image description here

I tried multi-scale matching from the following link.

http://www.pyimagesearch.com/2015/01/26/multi-scale-template-matching-using-python-opencv/

All I want to know is the red number. But the problem is, the red number is blurry for openCV recognize/match template. Would there be other possible way to detect this red number on the black background?

Answer

Classifying Digits

You clarified in comments that you've already isolated the number part of the image pre-detection, so I'll start under that assumption.

Perhaps you can approximate the perspective effects and "blurriness" of the number by treating it as a hand-written number. In this case, there is a famous data-set of handwritten numerals for classification training called mnist.

Yann LeCun has enumerated the state of the art on this dataset here mnist hand-written dataset.

At the far end of the spectrum, convolutional neural networks yield outrageously low error rates (fractions of 1% error). For a simpler solution, k-nearest neighbours using deskewing, noise removal, blurring, and 2 pixel shift, yielded about 1% error, and is significantly faster to implement. Python opencv has an implementation. Neural networks and support vector machines with deskewing also have some pretty impressive performance rates.

Note that convolutional networks don't have you pick your own features, so the important color-differential information here might just be used for narrowing the region-of-interest. Other approaches, where you define your feature space, might incorporate the known color difference more precisely.

Python supports a lot of machine learning techniques in the terrific package sklearn - here are examples of sklearn applied to mnist. If you're looking for an tutorialized explanation of machine learning in python, sklearn's own tutorial is very verbose

From the sklearn link: Classifying mnist

Those are the kinds of items you're trying to classify if you learn using this approach. To emphasize how easy it is to start training some of these machine learning-based classifiers, here is an abridged section from the example code in the linked sklearn package:

digits = datasets.load_digits() # built-in to sklearn!
data = digits.images.reshape((len(digits.images), -1))

# Create a classifier: a support vector classifier
classifier = svm.SVC(gamma=0.001)

# We learn the digits on the first half of the digits
classifier.fit(data[:n_samples / 2], digits.target[:n_samples / 2])

If you're wedded to openCv (possibly because you want to port to a real-time system in the future), opencv3/python has a tutorial on this exact topic too! Their demo uses k-nearest-neighbor (listed in the LeCun page), but they also have svms and the many of the other tools in sklearn. Their ocr page using SVMs uses deskewing, which might be useful with the perspective effect in your problem:

Deskewed digit


UPDATE: I used the out-of-the box skimage approach outlined above on your image, heavily cropped, and it correctly classified it. A lot more testing would be required to see if this is rhobust in practice

enter image description here

^^ That tiny image is the 8x8 crop of the image you embedded in your question. mnist is 8x8 images. That's why it trains in less than a second with default arguments in skimage.

I converted it the correct format by scaling it up to the mnist range using

number = scipy.misc.imread("cropped_image.png")
datum  =  (number[:,:,0]*15).astype(int).reshape((64,))
classifier.predict(datum) # returns 8

I didn't change anything else from the example; here, I'm only using the first channel for classification, and no smart feature computation. 15 looked about right to me; you'll need to tune it to get within the target range or (ideally) provide your own training and testing set


Object Detection

If you haven't isolated the number in the image you'll need an object detector. The literature space on this problem is gigantic and I won't start down that rabbit hole (google Viola and Jones, maybe?) This blog covers the fundamentals of a "sliding window" detector in python. Adrian Rosebrock looks like he's even a contributor on SO, and that page has some good examples of opencv and python-based object detectors fairly tutorialized (you actually linked to that blog in your question, I didn't realize).

In short, classify windows across the image and pick the window of highest confidence. Narrowing down the search space with a region of interest will of course yield huge improvements in all areas of performance