Week 2, Day 3 (Guided Project with PyTorch on Image Classification)
Welcome to the third day of Week 2 of the McE-51069 course. We will walk you through the basic flow of building a Convolutional Neural Network.
Optional Lecture
For a better understanding of the model, we provide another notebook in which you can inspect the model's convolutional filters and their outputs. The link to the Colab for that notebook is here.
Import Necessary Libraries
PyTorch is an open source machine learning framework that can take us from research to production.
If you would like to study the tutorials from the official PyTorch website, please visit this link. The source code for the entire PyTorch framework can be found here.
Side Note: PyTorch separates different tasks into different packages, e.g., there is a package called torchaudio that focuses only on audio.
Today we will use torchvision, the package inside torch that focuses on vision tasks. For example, for data augmentation, torchvision provides the transforms module, and if you would like to use transfer learning, torchvision also provides some state-of-the-art pretrained models.
The torch.nn module contains the basic building blocks that we need to construct our model. For example, if we need a convolutional layer, we can call torch.nn.Conv2d() to construct it, and if we want a linear layer that performs the equation $y = W^TX + b$, we can call torch.nn.Linear(). A short example follows the imports below.
If you would like to know more about the torch.nn library, please visit this link for more information. Also, this post provides a better understanding of the torch.nn module.
import torch
import torchvision
import torchvision.transforms as transforms
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import matplotlib.pyplot as plt
import numpy as np
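As a quick illustration of these two building blocks (the layer sizes below are arbitrary and only for demonstration):
# A convolutional layer: 3 input channels, 16 output channels, 3x3 kernel
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3)

# A linear layer computing y = W^T x + b: 128 input features, 10 output features
fc = nn.Linear(in_features=128, out_features=10)

# Pass a random image-shaped tensor (B * D * H * W) through the convolution
x = torch.rand(1, 3, 32, 32)
print(conv(x).size())   # torch.Size([1, 16, 30, 30])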
If you would like to change from CPU to GPU, select Runtime --> Change Runtime type and choose GPU.
After selecting, we can check whether we are running on a GPU or a CPU with the following code:
print(torch.cuda.is_available())
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
print(device)
If the output shows cuda:0, then the GPU is in use.
Dataset
In this notebook, we will use the CIFAR10 dataset for classification.
CIFAR10 is a public classification dataset, which consists of 60,000 32 x 32 color images in 10 classes. There are 50,000 training images and 10,000 testing images.
In the following cell, we will set up data augmentation for the dataset. When applying transforms to the test dataset, we only normalize the input, because we do not need to augment (change) the test data to evaluate the result. There are many augmentation methods available; if you would like to know more, please visit this post.
A Tensor is just like NumPy's ndarray, except that it can do its calculations on the GPU and is adapted to the needs of neural network training.
Normalization equation:
$X = \frac{X-\mu}{\sigma}$
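As a worked example, with $\mu = 0.5$ and $\sigma = 0.5$ (the values used below), a pixel value of $0.8$ in the $[0, 1]$ range produced by ToTensor() becomes $\frac{0.8 - 0.5}{0.5} = 0.6$, so every pixel ends up in the range $[-1, 1]$.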
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
'dog', 'frog', 'horse', 'ship', 'truck']
# Data Augmentation and Image Normalization
transform = transforms.Compose([transforms.RandomRotation(30),   # augment the PIL image before converting to a tensor
                                transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,))
                                ])
test_transform = transforms.Compose([
transforms.ToTensor(),
transforms.Normalize((0.5,), (0.5,))
])
train_dataset = torchvision.datasets.CIFAR10(
'./data', train=True,
transform=transform, download=True)
test_dataset = torchvision.datasets.CIFAR10(
'./data', train=False,
transform=test_transform, download=True
)
Each batch of images has the shape batch_size * dim * height * width (B*D*H*W).
batch_size means the number of images used for one training iteration.
dim means the image channel dimension. Conventionally, color images have a dimension of 3 and grayscale images have a dimension of 1.
The label values for the dataset range from 0~9, e.g., if the label is 7, then class_names[7] = horse.
# Check the total number of images in train_dataset
# Plot training Image and print label
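One way to fill in the two cells above, as a sketch (img_ind = 60 is just an arbitrary example index):
print(len(train_dataset))             # total number of training images: 50000

img_ind = 60                          # any index from 0 to 49,999 works
image = train_dataset[img_ind][0].numpy()
label = train_dataset[img_ind][1]
image = image.transpose(1, 2, 0)      # D*H*W -> H*W*D, the layout matplotlib expects
print(class_names[label])
plt.imshow((image / 2) + 0.5)         # undo the normalization for display
plt.show()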
Dataset and DataLoader
The dataset plays a huge role in machine learning and deep learning. For computer vision tasks, we can use data augmentation as a form of regularization that helps the model avoid overfitting. During training, the augmented dataset is fetched by a data loader; the main duty of the DataLoader is to batch and prepare the data before feeding it into the neural network.
The DataLoader object behaves like a generator and can be accessed through iteration.
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=32, shuffle=True,
                                            num_workers=2)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=8, shuffle=True,
                                           num_workers=2)
# Explain Generator
# Access train_loader
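One way to access train_loader, as a sketch; the DataLoader behaves like a generator, so we can iterate over it and stop after the first batch:
for data in train_loader:
    print(data[0].size())   # torch.Size([32, 3, 32, 32]) -> B * D * H * W
    print(data[1].size())   # torch.Size([32])            -> one label per image
    break                   # inspect only the first batch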
class SampleModel(nn.Module):
    def __init__(self):
        super(SampleModel, self).__init__()
        # Four convolutional layers (no padding, so each 3x3 conv shrinks H and W by 2)
        self.conv1 = nn.Conv2d(3, 64, 3)     # 3 x 32 x 32   -> 64 x 30 x 30
        self.conv2 = nn.Conv2d(64, 256, 3)   # 64 x 30 x 30  -> 256 x 28 x 28
        self.conv3 = nn.Conv2d(256, 256, 3)  # 256 x 28 x 28 -> 256 x 26 x 26
        self.conv4 = nn.Conv2d(256, 128, 3)  # 256 x 26 x 26 -> 128 x 24 x 24
        # Two fully connected layers: flatten to 24*24*128 features, output 10 class scores
        self.fc1 = nn.Linear(24*24*128, 512)
        self.fc2 = nn.Linear(512, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = F.relu(self.conv3(x))
        x = F.relu(self.conv4(x))
        # Flatten the feature maps before the fully connected layers
        x = x.view(-1, 24*24*128)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x
model = SampleModel()
# Move model to GPU
model.to(device)
# Define Loss
criterion = nn.CrossEntropyLoss()
# Define Optimizer
optimizer = optim.Adam(model.parameters())
# optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
# Test Model
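Before training, it is worth a quick sanity check: feed a random image-shaped tensor through the model and confirm that we get one score per class back.
# Test the model with a random input of shape B * D * H * W = 1 * 3 * 32 * 32
input_images = torch.rand(1, 3, 32, 32).to(device)
prediction = model(input_images)
print(prediction.size())   # torch.Size([1, 10]) -> one score for each of the 10 classes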
EPOCH = 2
loss_log = []
acc_log = []
x_coor = []
for epoch in range(EPOCH):
loss_epoch = 0
total_imgs = 0
correct_epoch = 0
for i, data in enumerate(train_loader):
# Get Image data and Label data
imgs, labels = data[0].to(device), data[1].to(device)
# Clear out the gradient
optimizer.zero_grad()
# Forward Propagation Start
# Predict from Input : B * 10
predicts = model(imgs)
# Calculate Loss from batch : 1
loss = criterion(predicts, labels)
# Forward Propagation End
# Backward Propagation Start
# Calculate Gradient
loss.backward()
# Update Model parameters with optimizer : Adam or SGD
optimizer.step()
# Backward Propagation End
# Add to Epoch Loss
loss_epoch += loss.item()
        # Accumulate the total number of images processed so far in this epoch
total_imgs += len(imgs)
# Count the total number of correct prediction
correct_batch = (torch.argmax(predicts, 1)==labels).sum().item()
correct_epoch += correct_batch
acc_batch = correct_batch/ len(imgs)
        # Log the batch loss and accuracy for plotting with matplotlib
x_coor.append((i*len(imgs))+(epoch*len(train_dataset)))
loss_log.append(loss.item())
acc_log.append(acc_batch)
acc_epoch = (correct_epoch/total_imgs)*100
loss_epoch = loss_epoch/total_imgs
print(f"EPOCH : {epoch+1}, Acc : {acc_epoch:.2f}, Loss : {loss_epoch:.2f}")
plt.plot(x_coor, loss_log, label = 'training_loss')
plt.legend()
plt.show()
plt.plot(x_coor, acc_log, label = 'training_acc')
plt.legend()
plt.show()
TensorBoard
TensorBoard provides the visualization and tooling needed for machine learning experimentation.
During research, or when putting a model into production, we want graphs showing the performance of our model. TensorBoard offers logging of the training loss and accuracy, visualization of the model graph, and much more. You can also visit this link for more information.
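As a minimal sketch of the logging pattern used in the cells below (the folder name log_demo, the tag, and the scalar values here are only placeholders):
from torch.utils.tensorboard import SummaryWriter

demo_writer = SummaryWriter('log_demo')        # events are written into the ./log_demo folder
demo_writer.add_scalar('Loss/Train', 0.7, 0)   # add_scalar(tag, scalar_value, global_step)
demo_writer.add_scalar('Loss/Train', 0.5, 1)
demo_writer.close()                            # flush the pending events to disk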
model = SampleModel().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters())
# Import tensorboard library
from torch.utils.tensorboard import SummaryWriter
EPOCH = 2
# This creates the folder where the logs will be stored.
writer = SummaryWriter('log')
for epoch in range(EPOCH):
loss_epoch = 0
total_imgs = 0
correct_epoch = 0
for i, data in enumerate(train_loader):
# Get Image data and Label data
imgs, labels = data[0].to(device), data[1].to(device)
# Clear out the gradient
optimizer.zero_grad()
# Forward Propagation Start
# Predict from Input : B * 10
predicts = model(imgs)
# Calculate Loss from batch : 1
loss = criterion(predicts, labels)
# Forward Propagation End
# Backward Propagation Start
# Calculate Gradient
loss.backward()
# Update Model parameters with optimizer : Adam or SGD
optimizer.step()
# Backward Propagation End
# Add to Epoch Loss
loss_epoch += loss.item()
        # Accumulate the total number of images processed so far in this epoch
total_imgs += len(imgs)
# Count the total number of correct prediction
correct_batch = (torch.argmax(predicts, 1)==labels).sum().item()
correct_epoch += correct_batch
acc_batch = correct_batch/ len(imgs)
# Adding to tensorboard
writer.add_scalar('Loss/Train', loss.item(), (i*len(imgs))+(epoch*len(train_dataset)))
writer.add_scalar('Accuracy/Train', acc_batch, (i*len(imgs))+(epoch*len(train_dataset)))
acc_epoch = (correct_epoch/total_imgs)*100
loss_epoch = loss_epoch/total_imgs
print(f"EPOCH : {epoch+1}, Acc : {acc_epoch:.2f}, Loss : {loss_epoch:.2f}")
writer.add_graph(model, imgs)
%load_ext tensorboard
%tensorboard --logdir log
from torch.utils.tensorboard import SummaryWriter
EPOCH = 2
learning_rate = [0.1, 0.01, 0.001]
for lr in learning_rate:
    # Re-initialize the model, loss, and optimizer for each learning rate
model = SampleModel().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=lr)
writer = SummaryWriter(comment = f"lr_{lr}")
for epoch in range(EPOCH):
loss_epoch = 0
total_imgs = 0
correct_epoch = 0
for i, data in enumerate(train_loader):
# Get Image data and Label data
imgs, labels = data[0].to(device), data[1].to(device)
# Clear out the gradient
optimizer.zero_grad()
# Forward Propagation Start
# Predict from Input : B * 10
predicts = model(imgs)
# Calculate Loss from batch : 1
loss = criterion(predicts, labels)
# Forward Propagation End
# Backward Propagation Start
# Calculate Gradient
loss.backward()
# Update Model parameters with optimizer : Adam or SGD
optimizer.step()
# Backward Propagation End
# Add to Epoch Loss
loss_epoch += loss.item()
        # Accumulate the total number of images processed so far in this epoch
total_imgs += len(imgs)
# Count the total number of correct prediction
correct_batch = (torch.argmax(predicts, 1)==labels).sum().item()
correct_epoch += correct_batch
acc_batch = correct_batch/ len(imgs)
# Adding to tensorboard
writer.add_scalar('Loss/Train', loss.item(), (i*len(imgs))+(epoch*len(train_dataset)))
writer.add_scalar('Accuracy/Train', acc_batch, (i*len(imgs))+(epoch*len(train_dataset)))
acc_epoch = (correct_epoch/total_imgs)*100
loss_epoch = loss_epoch/total_imgs
print(f"LR : {lr}, EPOCH : {epoch+1}, Acc : {acc_epoch:.2f}, Loss : {loss_epoch:.2f}")
%reload_ext tensorboard
%tensorboard --logdir runs
!zip -r /content/runs.zip /content/runs
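If you are running this notebook on Colab, one way to download the zipped logs to your machine (this assumes the google.colab environment):
# Download the zipped logs to the local machine (Colab only)
from google.colab import files
files.download('/content/runs.zip')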
ckpt_path = 'checkpoint.pt'
# model.parameters() gives an iterator over the parameter tensors only, while model.state_dict() maps each parameter (and buffer) name to its tensor; the state_dict is what we save.
torch.save(model.state_dict(), ckpt_path)
model = SampleModel().to(device)
checkpoint = torch.load(ckpt_path)
model.load_state_dict(checkpoint)
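To see the difference mentioned in the comment above, as a quick sketch, you can compare the two directly:
# state_dict(): an ordered dict mapping each parameter (and buffer) name to its tensor
for name, tensor in model.state_dict().items():
    print(name, tuple(tensor.size()))

# parameters(): an iterator over the parameter tensors only (no names), used by the optimizer
print(sum(p.numel() for p in model.parameters()), "trainable values in total")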
num_total = 0
num_correct = 0
# Switch the model to evaluation mode (affects Dropout and Batch-Normalization layers).
model.eval()
# Do not store gradient info in forward propagation.
with torch.no_grad():
for i, data in enumerate(test_loader):
image, label = data
image = image.to(device)
label = label.to(device)
predict = model(image)
predict_class = torch.argmax(predict, dim=1)
correct = (predict_class == label)
num_correct += correct.sum().item()
num_total += len(image)
print(num_correct, num_total)
print("Accuracy : ", num_correct/num_total)
def test_model(model):
    # Fetch one batch of test images (the test loader uses a batch size of 8)
    data = next(iter(test_loader))
    imgs, labels = data[0].to(device), data[1]
    predicts = model(imgs)
    print(predicts.size())
    # Predicted class index for each image in the batch
    index = torch.argmax(predicts, dim=1)
    titles = [class_names[i] for i in index]
    plt.figure(figsize=(20, 10))
    for i in range(len(titles)):
        title = f"Predict : {titles[i]}, Actual : {class_names[labels[i]]}"
        # Blue title for a correct prediction, red for a wrong one
        color = 'blue' if titles[i] == class_names[labels[i]] else 'red'
        # Undo the normalization so the image displays correctly
        img = (imgs[i].cpu().numpy() / 2) + 0.5
        plt.subplot(2, 4, i + 1)
        plt.imshow(img.transpose(1, 2, 0))
        plt.title(title, fontdict={'fontsize': 17, 'color': color})
test_model(model)