Neural Network MNIST Number Recognition

Follow along on this page, or grab the files from my GitHub repo.

Feeding in REAL Data

Considering the model showed less than a 2% error rate on the test data, I'm a bit disappointed with how it handles my own images. You can see it gets tripped up on some numbers.

Why Do This?

Here is a common theme in many Python machine learning tutorials:

    
# the MNIST training data arrives via a Python import
from tensorflow.examples.tutorials.mnist import input_data
    
  

It's a Python import that grabs the MNIST data so we can feed it into the model. While convenient, it also takes the magic out. Maybe at the end the computer reports that our trained model had 98% accuracy, but so what? There's no context. I want to input my own data!
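
To make that concrete, here is a minimal sketch of what that import hands you, using the Keras-bundled copy of MNIST that the training script below relies on (the script name is just a throwaway for illustration):

#!/usr/bin/env python3
#
# peek_mnist.py - illustration only, not part of the tutorial scripts

from keras.datasets import mnist

(X_train, y_train), (X_test, y_test) = mnist.load_data()
print(X_train.shape)   # (60000, 28, 28) -- 60,000 training images, 28x28 pixels each
print(X_test.shape)    # (10000, 28, 28) -- 10,000 test images
print(y_train[:5])     # the first few labels, one digit per image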

Saving Our Model

I'm going to focus specifically on using the Keras library in Python. It rides on top of TensorFlow and makes things pretty easy. The same principles will more or less apply if you choose to use something else, though obviously you won't be able to use the exact same model-saving functions.

First we need to build our network. Building the network itself is out of scope for this tutorial, but the following sample code will do.

Note that I've taken this from machinelearningmastery.com, which describes it in detail. In short, it's a Convolutional Neural Network designed to recognize handwritten numbers. As I described above, the training data is pulled in through a Python import. Once this runs it will save the model to the files keras_mnist_cnn.json and keras_mnist_cnn.h5. The JSON file describes the neural network structure and the H5 file holds the weights that have been learned by the neurons.

    
#!/usr/bin/env python3
#
# keras_mnist_cnn.py

import numpy
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers.convolutional import Convolution2D
from keras.layers.convolutional import MaxPooling2D
from keras.utils import np_utils
from keras import backend as K
K.set_image_dim_ordering('th')

def main(iterations):
  seed = 7
  numpy.random.seed(seed)

  # load data
  (X_train, y_train), (X_test, y_test) = mnist.load_data()
  # reshape to be [samples][pixels][width][height]
  X_train = X_train.reshape(X_train.shape[0], 1, 28, 28).astype('float32')
  X_test = X_test.reshape(X_test.shape[0], 1, 28, 28).astype('float32')

  # normalize inputs from 0-255 to 0-1
  X_train = X_train / 255
  X_test = X_test / 255
  # one hot encode outputs
  y_train = np_utils.to_categorical(y_train)
  y_test = np_utils.to_categorical(y_test)
  num_classes = y_test.shape[1]

  # build the model
  model = larger_model(num_classes)
  # Fit the model
  model.fit(X_train, y_train, validation_data=(X_test, y_test), nb_epoch=iterations, batch_size=200, verbose=2)
  # Final evaluation of the model
  scores = model.evaluate(X_test, y_test, verbose=0)
  print("Baseline Error: %.2f%%" % (100-scores[1]*100))

  # serialize model to JSON
  model_json = model.to_json()
  with open("keras_mnist_cnn.json", "w") as json_file:
      json_file.write(model_json)
  # serialize weights to HDF5
  model.save_weights("keras_mnist_cnn.h5")
  print("Saved model to disk")

  return 0

def larger_model(num_classes):
  # create model
  model = Sequential()
  model.add(Convolution2D(30, 5, 5, border_mode='valid', input_shape=(1, 28, 28), activation='relu'))
  model.add(MaxPooling2D(pool_size=(2, 2)))
  model.add(Convolution2D(15, 3, 3, activation='relu'))
  model.add(MaxPooling2D(pool_size=(2, 2)))
  model.add(Dropout(0.2))
  model.add(Flatten())
  model.add(Dense(128, activation='relu'))
  model.add(Dense(50, activation='relu'))
  model.add(Dense(num_classes, activation='softmax'))
  # Compile model
  model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
  return model

iterations = 4
main(iterations)
    
  

This part here is what actually saves the model for later:

  
# serialize model to JSON
model_json = model.to_json()
with open("keras_mnist_cnn.json", "w") as json_file:
    json_file.write(model_json)
# serialize weights to HDF5
model.save_weights("keras_mnist_cnn.h5")
print("Saved model to disk")
  

Now save and run this once, then check that the JSON and H5 files were created in the local directory.
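
If you'd rather check from Python than by eye, a quick sanity check (not part of the original scripts, just a convenience) could look like this:

import os

# confirm both saved-model files landed in the current directory
for fname in ("keras_mnist_cnn.json", "keras_mnist_cnn.h5"):
    print("%s exists: %s" % (fname, os.path.isfile(fname)))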

Loading Our Model

This will load the model we just saved and then finally we can throw some of our own data at it:

    
#!/usr/bin/env python3
#
# keras_mnist_cnn_load_model_test.py

import numpy
import sys
from keras.models import model_from_json
import scipy.ndimage

def main(img_fname):
  seed = 7
  numpy.random.seed(seed)

  # load json and create model
  json_file = open('keras_mnist_cnn.json', 'r')
  loaded_model_json = json_file.read()
  json_file.close()
  loaded_model = model_from_json(loaded_model_json)
  # load weights into new model
  loaded_model.load_weights("keras_mnist_cnn.h5")
  print("Loaded model from disk")

  # read the image as greyscale and shape it the same way as the MNIST data
  im = scipy.ndimage.imread(img_fname, flatten=True)
  data = im.reshape(1, 1, 28, 28).astype('float32')
  data = data / 255
  np_data = numpy.array(data)
  predict = loaded_model.predict(np_data)
  # one-hot output: whichever of the 10 outputs rounds to 1 is the predicted digit
  for i, x in enumerate(predict[0]):
    if round(x) > 0:
      print("")
      print("num is: %d" % (i))
      print("")

fname = sys.argv[1]
main(fname)
    
  

Now create your own 28x28 JPG image file (GIMP works well) and place it in the same directory as the .py files.
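
If you'd rather script the test image than draw it by hand, here is a rough sketch using Pillow, which isn't used anywhere else in this tutorial; note that the MNIST training images are white digits on a black background, so a test image drawn the same way tends to work best. The filename and digit here are just examples.

#!/usr/bin/env python3
#
# make_test_digit.py - illustration only

from PIL import Image, ImageDraw

# MNIST-style image: black background, white digit, 28x28 pixels
img = Image.new('L', (28, 28), color=0)
draw = ImageDraw.Draw(img)
draw.text((9, 7), "3", fill=255)   # Pillow's default bitmap font, roughly centered
img.save("test.jpg")

Drawing one in GIMP is still the more satisfying way to test it; this is just a fallback.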

Call your program with the name of the file as the only argument and you'll see it try to guess the number you've written.

$ python3 keras_mnist_cnn_load_model_test.py test.jpg

num is: 8

Two things are happening here. First, we load the model just like we saved it, which is pretty straightforward.

Then we read in the image, shape it the same way we did with the MNIST data, and stuff it into a numpy array. Once the data has been shaped we call a new Keras method, predict, which runs the image array through the model. That gives us 10 outputs, and since we are using one-hot encoding we can treat each one as a binary off or on. We do this by rounding the outputs, so whichever index rounds to 1 is the prediction for the number we passed to the model.
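
Equivalently, since the ten softmax outputs sum to one, you could skip the rounding and just take the index of the largest output. A tiny sketch with made-up numbers:

import numpy

# made-up example of what predict() returns for a single image
predict = numpy.array([[0.01, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.98, 0.01]])
print("num is: %d" % numpy.argmax(predict[0]))   # index of the largest output -> 8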