yasin.yazici - 1 year ago
Python Question

Pre-processing before digit recognition for NN & CNN trained with MNIST dataset

I'm trying to classify handwritten digits, written by myself and a few friends, using an NN and a CNN. The networks are trained on the MNIST dataset. The problem is that an NN trained on MNIST does not give satisfying test results on my own dataset. I've used some libraries in Python and MATLAB with different settings, as listed below.

In Python I've used this code with these settings:


  • 3-layer NN with # of inputs = 784, # of hidden neurons = 30, # of outputs = 10

  • Cost function = cross entropy

  • Number of Epochs = 30

  • Batch size = 10

  • Learning rate = 0.5
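For reference, those settings correspond roughly to the following self-contained NumPy sketch of one training step. This is a minimal stand-in for the linked code, not the code itself; the initialization scheme and the simplified cross-entropy delta are my assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Settings from the question: 784-30-10 network, cross-entropy cost,
# batch size 10, learning rate 0.5.
n_in, n_hid, n_out = 784, 30, 10
lr, batch_size = 0.5, 10

# Gaussian initialization scaled by fan-in (an assumption, not from the question).
W1 = rng.normal(0.0, 1.0 / np.sqrt(n_in), (n_hid, n_in))
b1 = np.zeros((n_hid, 1))
W2 = rng.normal(0.0, 1.0 / np.sqrt(n_hid), (n_out, n_hid))
b2 = np.zeros((n_out, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sgd_step(x, y):
    """One mini-batch SGD step; x is (784, B), y is one-hot (10, B)."""
    global W1, b1, W2, b2
    a1 = sigmoid(W1 @ x + b1)
    a2 = sigmoid(W2 @ a1 + b2)
    # With the cross-entropy cost, the output-layer delta simplifies to a2 - y.
    d2 = (a2 - y) / x.shape[1]
    d1 = (W2.T @ d2) * a1 * (1.0 - a1)
    W2 -= lr * (d2 @ a1.T); b2 -= lr * d2.sum(axis=1, keepdims=True)
    W1 -= lr * (d1 @ x.T);  b1 -= lr * d1.sum(axis=1, keepdims=True)
    return a2

# Random stand-ins for one MNIST mini-batch.
x = rng.random((n_in, batch_size))
y = np.eye(n_out)[:, rng.integers(0, n_out, batch_size)]
print(sgd_step(x, y).shape)  # (10, 10)
```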



It is trained on the MNIST training set, and the test results are as follows:

test result on MNIST = 96%
test result on my own dataset = 80%

In MATLAB I've used the Deep Learning Toolbox with various settings, normalization included, similar to the above, and the best NN accuracy is around 75%. Both an NN and a CNN were used in MATLAB.

I've tried to make my own dataset resemble MNIST. The results above were collected from the pre-processed dataset. Here is the pre-processing applied to my dataset:


  • Each digit is cropped separately and resized to 28 x 28 using bicubic interpolation

  • Patches are centered to match the mean values in MNIST, using a bounding box in MATLAB

  • Background is 0 and the highest pixel value is 1, as in MNIST
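For reference, a pipeline like the one above can be sketched end to end in NumPy. This is a minimal stand-in, not the actual code: it uses nearest-neighbour resampling instead of bicubic interpolation, and a center-of-mass shift (the original MNIST recipe) instead of the MATLAB bounding-box centering; the 20 x 20 digit box is also an assumption borrowed from MNIST's description.

```python
import numpy as np

def mnist_style(img, digit_size=20, canvas=28):
    """Crop to the ink bounding box, scale the digit into a
    digit_size x digit_size box, then center it by center of mass on a
    canvas x canvas image. Background is 0, ink is in (0, 1]."""
    ys, xs = np.nonzero(img > 0)
    crop = img[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    h, w = crop.shape
    s = digit_size / max(h, w)                       # preserve aspect ratio
    nh, nw = max(1, int(round(h * s))), max(1, int(round(w * s)))
    # Nearest-neighbour resampling (the question uses bicubic instead).
    rows = np.minimum((np.arange(nh) / s).astype(int), h - 1)
    cols = np.minimum((np.arange(nw) / s).astype(int), w - 1)
    out = np.zeros((canvas, canvas))
    top, left = (canvas - nh) // 2, (canvas - nw) // 2
    out[top:top + nh, left:left + nw] = crop[np.ix_(rows, cols)]
    # Integer center-of-mass shift toward the canvas center.
    ys, xs = np.nonzero(out)
    wts = out[ys, xs]
    cy = (ys * wts).sum() / wts.sum()
    cx = (xs * wts).sum() / wts.sum()
    shift = (int(round(canvas / 2 - cy)), int(round(canvas / 2 - cx)))
    return np.roll(out, shift, axis=(0, 1))
```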



I don't know what more to do. There are still some differences, such as contrast, but contrast-enhancement trials couldn't increase the accuracy.

Here are some digits from MNIST and my own dataset, to compare them visually.

MNIST digits

my own dataset

As you can see, there is a clear contrast difference. I think the accuracy problem is due to the lack of similarity between MNIST and my own dataset. How can I handle this issue?
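For reference, one contrast-enhancement trial of the kind mentioned above might look like this percentile-based stretch. The percentile cutoffs are illustrative choices, not the values actually tried.

```python
import numpy as np

def stretch_contrast(img, low_pct=1, high_pct=99):
    """Stretch the ink intensities so the low_pct/high_pct percentiles
    of the nonzero pixels map to 0 and 1; background stays exactly 0."""
    lo, hi = np.percentile(img[img > 0], [low_pct, high_pct])
    out = np.clip((img - lo) / (hi - lo), 0.0, 1.0)
    out[img == 0] = 0.0
    return out
```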

There is a similar question here, but that dataset is a collection of printed digits, unlike mine.

Edit:
I've also tested a binarized version of my own dataset on NNs trained with binarized MNIST and with default MNIST. The binarization threshold is 0.05.
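That binarization step is straightforward; a sketch matching the stated threshold:

```python
import numpy as np

def binarize(img, threshold=0.05):
    """Pixels above the threshold become 1, the rest 0. Applied
    identically to MNIST and to the test digits so the network sees
    the same input distribution at train and test time."""
    return (np.asarray(img, dtype=float) > threshold).astype(float)

patch = np.array([[0.0, 0.04, 0.06], [0.5, 1.0, 0.02]])
print(binarize(patch))
# [[0. 0. 1.]
#  [1. 1. 0.]]
```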

Here is an example image in matrix form from the MNIST dataset and my own dataset, respectively. Both are 5s.

MNIST:

Columns 1 through 10

0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0.1176 0.1412
0 0 0 0 0 0 0 0.1922 0.9333 0.9922
0 0 0 0 0 0 0 0.0706 0.8588 0.9922
0 0 0 0 0 0 0 0 0.3137 0.6118
0 0 0 0 0 0 0 0 0 0.0549
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0.0902 0.2588
0 0 0 0 0 0 0.0706 0.6706 0.8588 0.9922
0 0 0 0 0.2157 0.6745 0.8863 0.9922 0.9922 0.9922
0 0 0 0 0.5333 0.9922 0.9922 0.9922 0.8314 0.5294
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0

Columns 11 through 20

0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0.0118 0.0706 0.0706 0.0706 0.4941 0.5333 0.6863 0.1020
0.3686 0.6039 0.6667 0.9922 0.9922 0.9922 0.9922 0.9922 0.8824 0.6745
0.9922 0.9922 0.9922 0.9922 0.9922 0.9922 0.9922 0.9843 0.3647 0.3216
0.9922 0.9922 0.9922 0.9922 0.7765 0.7137 0.9686 0.9451 0 0
0.4196 0.9922 0.9922 0.8039 0.0431 0 0.1686 0.6039 0 0
0.0039 0.6039 0.9922 0.3529 0 0 0 0 0 0
0 0.5451 0.9922 0.7451 0.0078 0 0 0 0 0
0 0.0431 0.7451 0.9922 0.2745 0 0 0 0 0
0 0 0.1373 0.9451 0.8824 0.6275 0.4235 0.0039 0 0
0 0 0 0.3176 0.9412 0.9922 0.9922 0.4667 0.0980 0
0 0 0 0 0.1765 0.7294 0.9922 0.9922 0.5882 0.1059
0 0 0 0 0 0.0627 0.3647 0.9882 0.9922 0.7333
0 0 0 0 0 0 0 0.9765 0.9922 0.9765
0 0 0 0 0.1804 0.5098 0.7176 0.9922 0.9922 0.8118
0 0 0.1529 0.5804 0.8980 0.9922 0.9922 0.9922 0.9804 0.7137
0.0941 0.4471 0.8667 0.9922 0.9922 0.9922 0.9922 0.7882 0.3059 0
0.8353 0.9922 0.9922 0.9922 0.9922 0.7765 0.3176 0.0078 0 0
0.9922 0.9922 0.9922 0.7647 0.3137 0.0353 0 0 0 0
0.9922 0.9569 0.5216 0.0431 0 0 0 0 0 0
0.5176 0.0627 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0

Columns 21 through 28

0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0.6510 1.0000 0.9686 0.4980 0 0 0 0
0.9922 0.9490 0.7647 0.2510 0 0 0 0
0.3216 0.2196 0.1529 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0.2510 0 0 0 0 0 0 0
0.0078 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0


My own dataset:

Columns 1 through 10

0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0.4000 0.5569
0 0 0 0 0 0 0 0 0.9961 0.9922
0 0 0 0 0 0 0 0 0.6745 0.9882
0 0 0 0 0 0 0 0 0.0824 0.8745
0 0 0 0 0 0 0 0 0 0.4784
0 0 0 0 0 0 0 0 0 0.4824
0 0 0 0 0 0 0 0 0.0824 0.8745
0 0 0 0 0 0 0 0.0824 0.8392 0.9922
0 0 0 0 0 0 0 0.2392 0.9922 0.6706
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0.4431 0.3608
0 0 0 0 0 0 0 0.3216 0.9922 0.5922
0 0 0 0 0 0 0 0.3216 1.0000 0.9922
0 0 0 0 0 0 0 0 0.2784 0.5922
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0

Columns 11 through 20

0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0.2000 0.5176 0.8392 0.9922 0.9961 0.9922 0.7961 0.6353
0.7961 0.7961 0.9922 0.9882 0.9922 0.9882 0.5922 0.2745 0 0
0.9569 0.7961 0.5569 0.4000 0.3216 0 0 0 0 0
0.7961 0 0 0 0 0 0 0 0 0
0.9176 0.1176 0 0 0 0 0 0 0 0
0.9922 0.1961 0 0 0 0 0 0 0 0
0.9961 0.3569 0.2000 0.2000 0.2000 0.0392 0 0 0 0
0.9922 0.9882 0.9922 0.9882 0.9922 0.6745 0.3216 0 0 0
0.7961 0.6353 0.4000 0.4000 0.7961 0.8745 0.9961 0.9922 0.2000 0.0392
0 0 0 0 0 0.0784 0.4392 0.7529 0.9922 0.8314
0 0 0 0 0 0 0 0 0.4000 0.7961
0 0 0 0 0 0 0 0 0 0.0784
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0.0824 0.4000 0.4000 0.7176
0.9176 0.5961 0.6000 0.7569 0.6784 0.9922 0.9961 0.9922 0.9961 0.8353
0.5922 0.9098 0.9922 0.8314 0.7529 0.5922 0.5137 0.1961 0.1961 0.0392
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0

Columns 21 through 28

0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0.1608 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0.1608 0 0 0 0 0 0 0
0.9176 0.2000 0 0 0 0 0 0
0.8353 0.9098 0.3216 0 0 0 0 0
0.2431 0.7961 0.9176 0.4392 0 0 0 0
0 0.0784 0.8353 0.9882 0 0 0 0
0 0 0.6000 0.9922 0 0 0 0
0 0.1608 0.9137 0.8314 0 0 0 0
0.1216 0.6784 0.9569 0.1569 0 0 0 0
0.9137 0.8314 0.3176 0 0 0 0 0
0.5569 0.0784 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0

Answer

So what you are looking for is a generalised way of normalising your test data so that it can be compared against the MNIST training data. Perhaps you could first use a technique to normalise the MNIST training data into a standard format, then train your CNN, then normalise your test data using the same process, and then apply the CNN for recognition.

Have you seen this paper? It uses moment-based image normalisation. It works at the word level, so not quite what you are doing, but it should be easy enough to implement.

Moment-based Image Normalization for Handwritten Text Recognition (Kozielski et al.)
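A minimal sketch of the idea using only first- and second-order moments (the paper's word-level details are omitted; `target_sigma` and the nearest-neighbour resampling are illustrative assumptions, not values from the paper):

```python
import numpy as np

def moment_normalize(img, out_size=28, target_sigma=5.0):
    """Translate the ink's center of mass to the image center and
    rescale so the ink's standard deviation matches a fixed target.
    Applied identically to training and test digits, this removes
    position and size variation before classification."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    m = img.sum()
    # First moments: center of mass of the ink.
    cy, cx = (ys * img).sum() / m, (xs * img).sum() / m
    # Second moment: radial standard deviation about the center of mass.
    sigma = np.sqrt((((ys - cy) ** 2 + (xs - cx) ** 2) * img).sum() / m)
    scale = target_sigma / sigma
    # Inverse mapping with nearest-neighbour sampling.
    c = (out_size - 1) / 2.0
    oi, oj = np.mgrid[0:out_size, 0:out_size]
    sy = np.round(cy + (oi - c) / scale).astype(int)
    sx = np.round(cx + (oj - c) / scale).astype(int)
    valid = (sy >= 0) & (sy < h) & (sx >= 0) & (sx < w)
    out = np.zeros((out_size, out_size))
    out[valid] = img[sy[valid], sx[valid]]
    return out
```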
