2.16. MLP model using TensorFlow - Keras
After building a neural network (multi-layer perceptron model) from scratch using NumPy in Python (link to previous chapter), and after developing an MLP using PyTorch (link to previous chapter), we will finally develop the MLP model using TensorFlow - Keras.
Note: The MLP model we developed from scratch closely mirrors the way models are built in Keras.
Import necessary libraries
Here we import the Dense, Activation, and Dropout layers, along with the Adam and RMSprop optimizers. Finally, we import the to_categorical function, which converts integer labels into one-hot vectors.
import numpy as np
import matplotlib.pyplot as plt  # plotting library
%matplotlib inline

from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout
from keras.optimizers import Adam, RMSprop
from keras.utils import to_categorical
Data Loading and pre-processing
Next we import and load the MNIST dataset.
MNIST is a collection of handwritten digits ranging from 0 to 9.
It has a training set of 60,000 images and a test set of 10,000 images, each classified into one of the ten corresponding categories or labels.
# import dataset
from keras.datasets import mnist
# load dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 [==============================] - 0s 0us/step
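Before any preprocessing, it is worth checking the raw array shapes: each image arrives as a 28 x 28 array of pixel intensities, and each label is a single integer.
print(x_train.shape, y_train.shape)  # (60000, 28, 28) (60000,)
print(x_test.shape, y_test.shape)    # (10000, 28, 28) (10000,)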
After loading the MNIST dataset, the number of labels is computed as:
# compute the number of labels
num_labels = len(np.unique(y_train))
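Since np.unique returns the distinct labels, for MNIST this gives 10, one label per digit:
print(num_labels)  # 10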
Now we will perform one-hot encoding (link to previous chapter) on the target data:
# convert to one-hot vector
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
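To see what to_categorical does, here is a tiny illustration (the labels 0, 2, 1 are just an example, not taken from the dataset): each integer label i becomes a vector with a 1 at index i and 0 everywhere else.
print(to_categorical([0, 2, 1]))
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]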
Let us define our input_size, the number of pixels in each image:
input_size = x_train.shape[1] * x_train.shape[2]
input_size
784
Now we will reshape each image into a flat vector and normalize the pixel values to the range [0, 1].
# reshape and normalize
x_train = np.reshape(x_train, [-1, input_size])
x_train = x_train.astype('float32') / 255
x_test = np.reshape(x_test, [-1, input_size])
x_test = x_test.astype('float32') / 255
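After this step, each image is a flat float32 vector of 784 values in [0, 1]:
print(x_train.shape, x_train.dtype)  # (60000, 784) float32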
Now, we will set the network parameters as follows:
# network parameters
batch_size = 128
hidden_units = 256
dropout = 0.45
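With 60,000 training samples and a batch size of 128, one epoch consists of ceil(60000 / 128) = 469 weight updates, which is exactly the 469/469 count shown in the training progress bars below.
import math
steps_per_epoch = math.ceil(60000 / batch_size)  # 469 batches per epoch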
Model architecture
The next step is to design the model architecture. The proposed model is made of three MLP (fully connected) layers.
In Keras, a Dense layer stands for the densely (fully) connected layer.
Our model is a 3-layer MLP with ReLU activation and dropout after each of the two hidden layers:
model = Sequential()
# first hidden layer: 784 inputs -> 256 units
model.add(Dense(hidden_units, input_dim=input_size))
model.add(Activation('relu'))
model.add(Dropout(dropout))
# second hidden layer: 256 -> 256 units
model.add(Dense(hidden_units))
model.add(Activation('relu'))
model.add(Dropout(dropout))
# output layer: 256 -> 10 class probabilities via softmax
model.add(Dense(num_labels))
model.add(Activation('softmax'))
The Keras library provides the summary() method to check the model description.
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 256) 200960
activation (Activation) (None, 256) 0
dropout (Dropout) (None, 256) 0
dense_1 (Dense) (None, 256) 65792
activation_1 (Activation) (None, 256) 0
dropout_1 (Dropout) (None, 256) 0
dense_2 (Dense) (None, 10) 2570
activation_2 (Activation) (None, 10) 0
=================================================================
Total params: 269,322
Trainable params: 269,322
Non-trainable params: 0
_________________________________________________________________
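The parameter counts in the summary can be verified by hand: a Dense layer with n inputs and m units has n * m weights plus m biases.
print(784 * 256 + 256)        # 200960, first hidden layer
print(256 * 256 + 256)        # 65792, second hidden layer
print(256 * 10 + 10)          # 2570, output layer
print(200960 + 65792 + 2570)  # 269322 in total, matching the summary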
Executing the MLP model using Keras
This section consists of:
Compiling the model with the compile() method.
Training the model with the fit() method.
Evaluating the model performance with the evaluate() method.
Compiling the model
model.compile(loss='categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'])
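We imported RMSprop earlier but did not use it. As a sketch, switching optimizers is a one-line change; instead of the compile call above, one could pass an optimizer instance (the learning rate here is an illustrative value, not one from this chapter):
# alternative: use the RMSprop optimizer imported earlier
model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(learning_rate=0.001),  # illustrative learning rate
              metrics=['accuracy'])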
Training the model
model.fit(x_train, y_train, epochs=20, batch_size=batch_size)
Epoch 1/20
469/469 [==============================] - 6s 10ms/step - loss: 0.4237 - accuracy: 0.8715
Epoch 2/20
469/469 [==============================] - 4s 9ms/step - loss: 0.1963 - accuracy: 0.9409
Epoch 3/20
469/469 [==============================] - 4s 9ms/step - loss: 0.1513 - accuracy: 0.9543
Epoch 4/20
469/469 [==============================] - 4s 9ms/step - loss: 0.1286 - accuracy: 0.9611
Epoch 5/20
469/469 [==============================] - 4s 9ms/step - loss: 0.1150 - accuracy: 0.9651
Epoch 6/20
469/469 [==============================] - 5s 10ms/step - loss: 0.1034 - accuracy: 0.9687
Epoch 7/20
469/469 [==============================] - 5s 12ms/step - loss: 0.0938 - accuracy: 0.9713
Epoch 8/20
469/469 [==============================] - 4s 9ms/step - loss: 0.0879 - accuracy: 0.9726
Epoch 9/20
469/469 [==============================] - 5s 11ms/step - loss: 0.0808 - accuracy: 0.9747
Epoch 10/20
469/469 [==============================] - 4s 9ms/step - loss: 0.0775 - accuracy: 0.9752
Epoch 11/20
469/469 [==============================] - 4s 9ms/step - loss: 0.0719 - accuracy: 0.9773
Epoch 12/20
469/469 [==============================] - 4s 9ms/step - loss: 0.0680 - accuracy: 0.9785
Epoch 13/20
469/469 [==============================] - 4s 9ms/step - loss: 0.0702 - accuracy: 0.9771
Epoch 14/20
469/469 [==============================] - 4s 9ms/step - loss: 0.0642 - accuracy: 0.9797
Epoch 15/20
469/469 [==============================] - 4s 9ms/step - loss: 0.0618 - accuracy: 0.9799
Epoch 16/20
469/469 [==============================] - 5s 10ms/step - loss: 0.0607 - accuracy: 0.9806
Epoch 17/20
469/469 [==============================] - 4s 10ms/step - loss: 0.0583 - accuracy: 0.9816
Epoch 18/20
469/469 [==============================] - 4s 10ms/step - loss: 0.0550 - accuracy: 0.9819
Epoch 19/20
469/469 [==============================] - 4s 9ms/step - loss: 0.0549 - accuracy: 0.9825
Epoch 20/20
469/469 [==============================] - 4s 9ms/step - loss: 0.0548 - accuracy: 0.9820
<keras.callbacks.History at 0x7f7f875dc4f0>
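The fit() method returns the History object shown above. As a minimal sketch, assuming the return value had been captured as history = model.fit(...), history.history maps each metric name to its per-epoch values, which we can plot with the matplotlib import from the start of the chapter:
# assumes: history = model.fit(x_train, y_train, epochs=20, batch_size=batch_size)
plt.plot(history.history['loss'], label='loss')
plt.plot(history.history['accuracy'], label='accuracy')
plt.xlabel('epoch')
plt.legend()
plt.show()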
Evaluating the model performance with the evaluate() method
loss, acc = model.evaluate(x_test, y_test, batch_size=batch_size)
print("\nTest accuracy: %.1f%%" % (100.0 * acc))
79/79 [==============================] - 0s 4ms/step - loss: 0.0666 - accuracy: 0.9818
Test accuracy: 98.2%
We get a test accuracy of 98.2%. It is that simple!
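As a quick sanity check (not part of the original chapter), we can run the trained model on a few test images and decode both the predicted probabilities and the one-hot targets with np.argmax:
# predict class probabilities for the first five test images
probs = model.predict(x_test[:5])
print(np.argmax(probs, axis=1))       # predicted digits
print(np.argmax(y_test[:5], axis=1))  # true digits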