3.4. 4 step process to build a CNN model using PyTorch#
From our previous chapters (including the one where we coded a CNN model from scratch), we now have an idea of how a CNN works. Today, we will build our very first CNN model using PyTorch (it takes only a few lines of code) in just 4 simple steps.
How to build a CNN model using PyTorch#
Step-1#
Importing all dependencies
We first import torch, which gives us the core PyTorch library. Then we import nn, which allows us to define a neural network module. Next we import DataLoader, with the help of which we can feed data into the convolutional neural network (CNN) during training. Finally we import transforms, which allows us to perform the data pre-processing covered in the previous chapter.
import os
import torch
from torch import nn
from torchvision.datasets import CIFAR10
from torch.utils.data import DataLoader
from torchvision import transforms
# show the progress bar while printing
from tqdm import tqdm
Step-2#
Defining the CNN class as an nn.Module
The CNN class extends the nn.Module class. It has two methods: __init__, the constructor, and forward, which implements the forward pass.
We create the convolutional layers using nn.Conv2d and a pooling layer using nn.MaxPool2d.
Note: Here nn.Linear is similar to the Dense class we developed in our scratch model.
class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        '''
        The first parameter 3 indicates that the input image is colored (RGB);
        for a grayscale image we would have used 1.
        32 is the number of output channels of the first convolutional layer.
        '''
        self.network = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),  # output: 64 x 16 x 16
            nn.Flatten(),
            nn.Linear(64 * 16 * 16, 512),
            nn.ReLU(),
            nn.Linear(512, 10)
        )

    def forward(self, x):
        return self.network(x)
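Before moving on, it helps to confirm the shapes noted in the comments above. A minimal sanity-check sketch (assuming the CNN class above has been defined) passes a dummy CIFAR-10-sized batch through an untrained model:
# Sanity check: feed a dummy batch of 4 RGB 32x32 images through the model
model = CNN()
dummy = torch.randn(4, 3, 32, 32)
print(model(dummy).shape)  # expected: torch.Size([4, 10]) - one logit per class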
Step-3#
Preparing the CIFAR-10 dataset and setting up the loss function and optimizer.
Note
CIFAR-10 is a dataset containing images of 10 different classes. It is widely used in research to benchmark machine learning models, especially for computer vision problems.
The next code we add prepares the CIFAR-10 dataset. We will use a batch_size of 100.
train_dataset = CIFAR10(os.getcwd(), download=True, transform=transforms.ToTensor(), train=True)
trainloader = DataLoader(train_dataset, batch_size=100, shuffle=True, num_workers=1)
test_dataset = CIFAR10(os.getcwd(), download=True, transform=transforms.ToTensor(), train=False)
testloader = DataLoader(test_dataset, batch_size=100, shuffle=False, num_workers=1)  # no need to shuffle the test set
Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to /content/cifar-10-python.tar.gz
Extracting /content/cifar-10-python.tar.gz to /content
Files already downloaded and verified
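It is worth peeking at one batch from the trainloader to confirm what the model will receive. A quick inspection sketch (the class names below are the standard CIFAR-10 labels):
# Fetch a single batch from the training DataLoader
images, labels = next(iter(trainloader))
print(images.shape)  # torch.Size([100, 3, 32, 32]) - 100 RGB images of 32x32 pixels
print(labels.shape)  # torch.Size([100]) - one integer class index per image

# The 10 CIFAR-10 classes, indexed 0-9
classes = ('airplane', 'automobile', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck')
print(classes[labels[0]])  # human-readable label of the first image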
Now, we will initialize the CNN model and set it up for training by specifying the loss function (cross-entropy loss) and the Adam optimizer.
# Initialize the CNN
cnn = CNN()
# Define the loss function (CrossEntropyLoss) and optimizer (Adam)
loss_function = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(cnn.parameters(), lr=1e-4)
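Note that nn.CrossEntropyLoss expects raw logits from the model (no softmax layer) together with integer class indices; internally it combines LogSoftmax and NLLLoss. A minimal sketch of how it is called:
# Toy example: a batch of 2 samples over our 10 classes
logits = torch.randn(2, 10)            # raw model outputs, no softmax applied
targets = torch.tensor([3, 7])         # ground-truth class indices
print(loss_function(logits, targets))  # scalar loss, averaged over the batch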
Step-4#
Defining the training loop
The core part of our runtime code is the training loop. In this loop, we perform the epochs, or training iterations: in every epoch, we iterate over the training dataset, perform the forward and backward passes, and update the model parameters.
# Run the training loop
# Train for 2 epochs
epochs = 2
for epoch in range(epochs):

    # Print epoch
    print("Epoch:", epoch+1, '/', end=' ')

    # Set current loss value
    current_loss = 0.0

    # Iterate over the DataLoader for training data
    for i, data in enumerate(tqdm(trainloader)):
        # Get inputs
        inputs, targets = data
        # Zero the gradients
        optimizer.zero_grad()
        # Perform forward pass
        outputs = cnn(inputs)
        # Compute loss
        loss = loss_function(outputs, targets)
        # Perform backward pass
        loss.backward()
        # Perform optimization
        optimizer.step()
        # Accumulate the loss for reporting
        current_loss += loss.item()

    print("Training Loss:", current_loss/len(trainloader))

# Process is complete.
print('Training process has finished.')
Epoch: 1 /
100%|██████████| 500/500 [04:27<00:00, 1.87it/s]
Training Loss: 1.716611654281616
Epoch: 2 /
100%|██████████| 500/500 [05:01<00:00, 1.66it/s]
Training Loss: 1.3922613124847412
Training process has finished.
Testing time!#
cnn.eval()  # switch to evaluation mode
correct = 0
total = 0
running_loss = 0.0
with torch.no_grad():  # no gradients needed during evaluation
    for i, data in enumerate(tqdm(testloader)):
        inputs, targets = data
        outputs = cnn(inputs)
        loss = loss_function(outputs, targets)
        # The predicted class is the index of the largest logit
        _, predicted = torch.max(outputs.data, 1)
        total += targets.size(0)
        correct += (predicted == targets).sum().item()
        running_loss = running_loss + loss.item()

accuracy = correct / total
running_loss = running_loss / len(testloader)
print("\nTest Loss:", running_loss)
print("Test Accuracy:", accuracy)
100%|██████████| 100/100 [00:19<00:00, 5.20it/s]
Test Loss: 1.3139171063899995
Test Accuracy: 0.5323
With a more complex model, we could push the CIFAR-10 accuracy much higher. The main point is that we have learned how to build our very first CNN model using PyTorch in just 4 simple steps.