MainstreamDeveloper00 - 1 year ago
Objective-C Question

Why can't the Tesseract OCR library (iOS) recognize text at all?

I'm trying to use the Tesseract OCR library in my iOS application. I downloaded the tesseract-ios library from GitHub, and when I tried to recognize a simple text image I got garbage instead. Here is the image I tried to recognize:

[image]

I got unreadable text:

T0I1101T0W KIR1
H1I1101T0W KIR1
EHH11 133I R1 11335
11I1H1 19 13S SYIL
3B19 M H300H1911
H1113 AIR1 J1 OIII
3I9SH5H133IS 13V9
I1 Q1H211 E015 19
W331 H1 111SW

Why can't Tesseract recognize even a simple image? Here is the code I used to instantiate Tesseract:

Tesseract* tesseractObject = [[Tesseract alloc] initWithDataPath:@"tessdata" language:@"eng"];
[tesseractObject setVariableValue:@"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ" forKey:@"tessedit_char_whitelist"];
[tesseractObject setImage:image];
[tesseractObject recognize];
NSLog(@"RECOGNISED= %@" , [tesseractObject recognizedText]);

Here is my project structure:
[image]
I added the English tessdata folder by reference. So what am I doing wrong? How can I fix this?

Answer Source

Make sure you have the latest tessdata file from Google Code.

This will provide you with a list of tessdata files that you need to download and include in your app, if you haven't already. In your case you will need tesseract-ocr-3.02.eng.tar.gz, as you are looking for the English language files.

The following article will show you where you need to install it. I read through this tutorial when I built my first Tesseract project and found it really useful.
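Before digging further, it may be worth confirming that the language file actually ends up inside the built app. The sketch below is not part of tesseract-ios. `TrainedDataExists` and `CheckBundledTessdata` are hypothetical helper names, and the code assumes that the folder reference is copied into the bundle root as `tessdata` and that the English file is named `eng.traineddata`. It uses only Foundation calls.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not a tesseract-ios API): returns YES when
// <dataDir>/<language>.traineddata exists and is a regular file.
static BOOL TrainedDataExists(NSString *dataDir, NSString *language) {
    NSString *fileName = [language stringByAppendingPathExtension:@"traineddata"];
    NSString *path = [dataDir stringByAppendingPathComponent:fileName];
    BOOL isDirectory = NO;
    BOOL exists = [[NSFileManager defaultManager] fileExistsAtPath:path
                                                       isDirectory:&isDirectory];
    return exists && !isDirectory;
}

// Hypothetical usage: log whether the bundled tessdata folder contains
// the English data. Assumes the folder sits at the bundle root as "tessdata".
static void CheckBundledTessdata(void) {
    NSString *dataDir = [[[NSBundle mainBundle] resourcePath]
                            stringByAppendingPathComponent:@"tessdata"];
    if (TrainedDataExists(dataDir, @"eng")) {
        NSLog(@"Found eng.traineddata in %@", dataDir);
    } else {
        NSLog(@"eng.traineddata is missing from %@", dataDir);
    }
}
```

If the file is reported missing, the folder reference is probably not being copied into the bundle. If it is found and the output is still garbage, the data file itself (for example, an outdated one) is the more likely suspect.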

