user1397215 user1397215 - 9 months ago 62
Python Question

Make tesseract recognise numbers only

I am trying to refine an OCR prog I made to read the layout of a certain image that I am using. Right now, I would like my OCR prog to recognise only digits 0-9.

I tried to follow the solution from the question:

Limit characters tesseract is looking for

But I got stuck in the part where I have to call tesseract as:

tesseract input.tif output nobatch letters

where does this go?

Answer Source

I posted some things about tesseract some time ago in SO: see Tesseract OCR Library - Learning Font. There is notably a link to tesseract training which will tell you how to restrain your set of characters and describe your ambiguities.