2.16. MLP model using TensorFlow - Keras#

After building a neural network (multilayer perceptron) from scratch using NumPy in Python (link to previous chapter), and after developing an MLP using PyTorch (link to previous chapter), we will finally develop the MLP model using TensorFlow - Keras.

Note: The MLP model we developed from scratch closely mirrors the way models are built in Keras.

Import necessary libraries#

Here we import the Dense, Activation, and Dropout layers, along with the Adam and RMSprop optimizers.

Finally, we import the to_categorical function, which converts integer class labels into one-hot vectors.

import numpy as np
import matplotlib.pyplot as plt # plotting library
%matplotlib inline

from keras.models import Sequential
from keras.layers import Dense , Activation, Dropout
from keras.optimizers import Adam ,RMSprop
from keras.utils import to_categorical

Data Loading and pre-processing#

Next, we import and load the MNIST dataset.

MNIST is a collection of handwritten digits ranging from 0 to 9.

It has a training set of 60,000 images and a test set of 10,000 images, each labeled with the corresponding digit.

# import dataset
from keras.datasets import mnist

# load dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 [==============================] - 0s 0us/step

After loading the MNIST dataset, the number of labels is computed as:

# compute the number of labels
num_labels = len(np.unique(y_train))
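For MNIST this evaluates to 10, one class per digit; a quick check (assuming the cell above has run):

print(num_labels)  # 10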

Now we will perform one-hot encoding (link to previous chapter) on the target data.

# convert to one-hot vector
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
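To make the encoding concrete, here is a quick sanity check (a minimal sketch, assuming the cells above have run). Each integer label becomes a length-10 vector with a single 1 at the label's index:

print(y_train.shape)  # (60000, 10)
print(y_train[0])     # e.g. [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.] for the digit 5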

Let us define our input_size. Each 28 × 28 image will be flattened into a vector of 784 pixel values.

input_size = x_train.shape[1] * x_train.shape[2]  # 28 * 28 = 784
input_size
784

Now we will reshape the images into flat vectors and normalize the pixel values to the range [0, 1].

# reshape into flat vectors and normalize to [0, 1]
x_train = np.reshape(x_train, [-1, input_size])
x_train = x_train.astype('float32') / 255
x_test = np.reshape(x_test, [-1, input_size])
x_test = x_test.astype('float32') / 255
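A quick shape and range check (a minimal sketch) confirms that the reshape and normalization behaved as intended:

print(x_train.shape, x_test.shape)   # (60000, 784) (10000, 784)
print(x_train.min(), x_train.max())  # 0.0 1.0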

Now, we will set the network parameters as follows:

# network parameters
batch_size = 128
hidden_units = 256
dropout = 0.45
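As an aside, dropout = 0.45 means that, during training, each unit's output is zeroed with probability 0.45, and Keras rescales the surviving activations by 1/(1 - 0.45) so their expected sum is unchanged. A minimal sketch demonstrating the layer in isolation:

# apply dropout in training mode to a vector of ones
layer = Dropout(0.45)
x = np.ones((1, 10), dtype='float32')
print(layer(x, training=True))  # roughly 45% zeros; survivors scaled to ~1.818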

Model architecture#

The next step is to design the model architecture. The proposed model is made of three Dense (MLP) layers.

In Keras, a Dense layer is a densely (fully) connected layer.

Our model is a 3-layer MLP with ReLU activation and dropout after each of the two hidden layers, and a softmax activation on the output layer.

model = Sequential()
model.add(Dense(hidden_units, input_dim=input_size))
model.add(Activation('relu'))
model.add(Dropout(dropout))
model.add(Dense(hidden_units))
model.add(Activation('relu'))
model.add(Dropout(dropout))
model.add(Dense(num_labels))
model.add(Activation('softmax'))
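As a side note (an equivalent sketch, not a required change), the same architecture can be written more compactly by passing the activation name directly to each Dense layer:

# equivalent, more compact definition
model = Sequential([
    Dense(hidden_units, activation='relu', input_dim=input_size),
    Dropout(dropout),
    Dense(hidden_units, activation='relu'),
    Dropout(dropout),
    Dense(num_labels, activation='softmax'),
])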

The Keras library provides the summary() method to inspect the model's architecture.

model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 dense (Dense)               (None, 256)               200960    
                                                                 
 activation (Activation)     (None, 256)               0         
                                                                 
 dropout (Dropout)           (None, 256)               0         
                                                                 
 dense_1 (Dense)             (None, 256)               65792     
                                                                 
 activation_1 (Activation)   (None, 256)               0         
                                                                 
 dropout_1 (Dropout)         (None, 256)               0         
                                                                 
 dense_2 (Dense)             (None, 10)                2570      
                                                                 
 activation_2 (Activation)   (None, 10)                0         
                                                                 
=================================================================
Total params: 269,322
Trainable params: 269,322
Non-trainable params: 0
_________________________________________________________________
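The parameter counts in the summary can be verified by hand: a Dense layer has inputs × units weights plus units bias terms.

# Dense parameters = inputs * units + units (biases)
print(784 * 256 + 256)  # 200960 (dense)
print(256 * 256 + 256)  # 65792  (dense_1)
print(256 * 10 + 10)    # 2570   (dense_2); total: 269,322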

Executing the MLP model using Keras#

This section comprises:

  • Compiling the model with the compile() method.

  • Training the model with the fit() method.

  • Evaluating the model's performance with the evaluate() method.

Compiling the model

model.compile(loss='categorical_crossentropy', 
              optimizer='adam',
              metrics=['accuracy'])
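Here categorical_crossentropy matches the one-hot targets we created earlier. As an alternative sketch (applicable only if the labels had been left as integers 0-9 instead of one-hot vectors), Keras also provides sparse_categorical_crossentropy:

# alternative when y_train/y_test are integer labels rather than one-hot vectors
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])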

Training the model

model.fit(x_train, y_train, epochs=20, batch_size=batch_size)
Epoch 1/20
469/469 [==============================] - 6s 10ms/step - loss: 0.4237 - accuracy: 0.8715
Epoch 2/20
469/469 [==============================] - 4s 9ms/step - loss: 0.1963 - accuracy: 0.9409
Epoch 3/20
469/469 [==============================] - 4s 9ms/step - loss: 0.1513 - accuracy: 0.9543
Epoch 4/20
469/469 [==============================] - 4s 9ms/step - loss: 0.1286 - accuracy: 0.9611
Epoch 5/20
469/469 [==============================] - 4s 9ms/step - loss: 0.1150 - accuracy: 0.9651
Epoch 6/20
469/469 [==============================] - 5s 10ms/step - loss: 0.1034 - accuracy: 0.9687
Epoch 7/20
469/469 [==============================] - 5s 12ms/step - loss: 0.0938 - accuracy: 0.9713
Epoch 8/20
469/469 [==============================] - 4s 9ms/step - loss: 0.0879 - accuracy: 0.9726
Epoch 9/20
469/469 [==============================] - 5s 11ms/step - loss: 0.0808 - accuracy: 0.9747
Epoch 10/20
469/469 [==============================] - 4s 9ms/step - loss: 0.0775 - accuracy: 0.9752
Epoch 11/20
469/469 [==============================] - 4s 9ms/step - loss: 0.0719 - accuracy: 0.9773
Epoch 12/20
469/469 [==============================] - 4s 9ms/step - loss: 0.0680 - accuracy: 0.9785
Epoch 13/20
469/469 [==============================] - 4s 9ms/step - loss: 0.0702 - accuracy: 0.9771
Epoch 14/20
469/469 [==============================] - 4s 9ms/step - loss: 0.0642 - accuracy: 0.9797
Epoch 15/20
469/469 [==============================] - 4s 9ms/step - loss: 0.0618 - accuracy: 0.9799
Epoch 16/20
469/469 [==============================] - 5s 10ms/step - loss: 0.0607 - accuracy: 0.9806
Epoch 17/20
469/469 [==============================] - 4s 10ms/step - loss: 0.0583 - accuracy: 0.9816
Epoch 18/20
469/469 [==============================] - 4s 10ms/step - loss: 0.0550 - accuracy: 0.9819
Epoch 19/20
469/469 [==============================] - 4s 9ms/step - loss: 0.0549 - accuracy: 0.9825
Epoch 20/20
469/469 [==============================] - 4s 9ms/step - loss: 0.0548 - accuracy: 0.9820
<keras.callbacks.History at 0x7f7f875dc4f0>
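As a side note, fit() returns the History object shown above, and it can also track held-out data during training. A minimal sketch (reusing the test set here purely for illustration; a separate validation split would be preferable in practice):

history = model.fit(x_train, y_train, epochs=20, batch_size=batch_size,
                    validation_data=(x_test, y_test))
# history.history is a dict with per-epoch 'loss', 'accuracy', 'val_loss', 'val_accuracy'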

Evaluating the model's performance with the evaluate() method

loss, acc = model.evaluate(x_test, y_test, batch_size=batch_size)
print("\nTest accuracy: %.1f%%" % (100.0 * acc))
79/79 [==============================] - 0s 4ms/step - loss: 0.0666 - accuracy: 0.9818

Test accuracy: 98.2%

We get a test accuracy of 98.2%. It is that simple!
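Once trained, the model can make predictions with the predict() method. A minimal sketch (assuming the variables above are still in scope) that classifies the first ten test images and compares them against the ground truth:

# predicted class = index of the largest softmax probability
probs = model.predict(x_test[:10])
preds = np.argmax(probs, axis=1)
print(preds)                           # predicted digits
print(np.argmax(y_test[:10], axis=1)) # true digits (undoing the one-hot encoding)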