Week 2, Day 3 (Guided Project with PyTorch on Image Classification)
Welcome to the third day of Week 2 of the McE-51069 course. We will walk you through the basic flow of building a Convolutional Neural Network.
Optional Lecture
For a better understanding of the model, we provide another notebook in which you can inspect the model's convolutional filters and their outputs. The link to the Colab for that notebook is here.
Import Necessary Libraries
PyTorch is an open source machine learning framework that can take us from research to production.
If you would like to study the tutorials from the official PyTorch website, please visit this link. The source code for the entire PyTorch framework can be found here.
Side Note: PyTorch separates different tasks into different packages, e.g., there is a package called torchaudio that focuses only on audio.
Today we will use torchvision, the package inside torch that focuses on vision tasks. For example, for data augmentation, torchvision provides the transforms module, and if you would like to use transfer learning, torchvision also provides some state-of-the-art pretrained models.
The torch.nn module contains the basic building blocks that we need to construct our model. For example, if we need a convolutional layer, we can call torch.nn.Conv2d() to construct it, and if we want a linear layer that performs the equation $y = W^TX + b$, we can call torch.nn.Linear(). A short example follows the imports below.
If you would like to know more about the torch.nn library, please visit this link for more information. Also, this post provides a better understanding of the torch.nn module.
import torch
import torchvision
import torchvision.transforms as transforms
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import matplotlib.pyplot as plt
import numpy as np
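As a quick illustration of these two building blocks (the layer sizes below are arbitrary and only for demonstration):
# A convolutional layer: 3 input channels, 16 output channels, 3x3 kernel
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3)

# A linear layer computing y = W^T x + b: 128 input features, 10 output features
fc = nn.Linear(in_features=128, out_features=10)

# Pass a random image-shaped tensor (B * D * H * W) through the convolution
x = torch.rand(1, 3, 32, 32)
print(conv(x).size())   # torch.Size([1, 16, 30, 30])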
If you would like to change from CPU to GPU, select Runtime --> Change Runtime type and choose GPU.
After selecting, we can check whether we are running on a GPU or a CPU with the following code:
print(torch.cuda.is_available())
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
print(device)
If the output shows cuda:0, then the GPU is in use.
Dataset
In this notebook, we will use the CIFAR10 dataset for classification.
CIFAR10 is a public classification dataset, which consists of 60,000 32 x 32 color images in 10 classes. There are 50,000 training images and 10,000 testing images.
In the following cell, we will set up data augmentation for the dataset. When applying transforms to the test dataset, we only normalize the input, because we do not need to augment (change) the test data to evaluate the result. There are many augmentation methods available; if you would like to know more, please visit this post.
A Tensor is just like NumPy's ndarray, except that it can do its calculations on the GPU and is adapted to the needs of neural network training.
Normalization equation:
$X = \frac{X-\mu}{\sigma}$
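As a worked example, with $\mu = 0.5$ and $\sigma = 0.5$ (the values used below), a pixel value of $0.8$ in the $[0, 1]$ range produced by ToTensor() becomes $\frac{0.8 - 0.5}{0.5} = 0.6$, so every pixel ends up in the range $[-1, 1]$.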
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
'dog', 'frog', 'horse', 'ship', 'truck']
# Data Augmentation and Image Normalization
transform = transforms.Compose([transforms.RandomRotation(30),   # augment the PIL image before converting to a tensor
                                transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,))
                                ])
test_transform = transforms.Compose([
transforms.ToTensor(),
transforms.Normalize((0.5,), (0.5,))
])
train_dataset = torchvision.datasets.CIFAR10(
'./data', train=True,
transform=transform, download=True)
test_dataset = torchvision.datasets.CIFAR10(
'./data', train=False,
transform=test_transform, download=True
)
Each batch of images has the shape batch_size * dim * height * width (B*D*H*W).
batch_size means the number of images used for one training iteration.
dim means the image channel dimension. Conventionally, color images have a dimension of 3 and grayscale images have a dimension of 1.
The label values for the dataset range from 0~9, e.g., if the label is 7, then class_names[7] = horse.
# Check the total number of images in train_dataset
# Plot training Image and print label
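One way to fill in the two cells above, as a sketch (img_ind = 60 is just an arbitrary example index):
print(len(train_dataset))             # total number of training images: 50000

img_ind = 60                          # any index from 0 to 49,999 works
image = train_dataset[img_ind][0].numpy()
label = train_dataset[img_ind][1]
image = image.transpose(1, 2, 0)      # D*H*W -> H*W*D, the layout matplotlib expects
print(class_names[label])
plt.imshow((image / 2) + 0.5)         # undo the normalization for display
plt.show()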
Dataset and DataLoader
The dataset plays a huge role in machine learning and deep learning. For computer vision tasks, we can use data augmentation as a form of regularization that helps the model avoid overfitting. During training, the augmented dataset is fetched by a data loader; the main duty of the DataLoader is to batch and prepare the data before feeding it into the neural network.
The DataLoader object behaves like a generator and can be accessed through iteration.
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=32, shuffle=True,
                                            num_workers=2)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=8, shuffle=True,
                                           num_workers=2)
# Explain Generator
# Access train_loader
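One way to access train_loader, as a sketch; the DataLoader behaves like a generator, so we can iterate over it and stop after the first batch:
for data in train_loader:
    print(data[0].size())   # torch.Size([32, 3, 32, 32]) -> B * D * H * W
    print(data[1].size())   # torch.Size([32])            -> one label per image
    break                   # inspect only the first batch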
class SampleModel(nn.Module):
    def __init__(self):
        super(SampleModel, self).__init__()
        # Four convolutional layers (no padding, so each 3x3 conv shrinks H and W by 2)
        self.conv1 = nn.Conv2d(3, 64, 3)     # 3 x 32 x 32   -> 64 x 30 x 30
        self.conv2 = nn.Conv2d(64, 256, 3)   # 64 x 30 x 30  -> 256 x 28 x 28
        self.conv3 = nn.Conv2d(256, 256, 3)  # 256 x 28 x 28 -> 256 x 26 x 26
        self.conv4 = nn.Conv2d(256, 128, 3)  # 256 x 26 x 26 -> 128 x 24 x 24
        # Two fully connected layers: flatten to 24*24*128 features, output 10 class scores
        self.fc1 = nn.Linear(24*24*128, 512)
        self.fc2 = nn.Linear(512, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = F.relu(self.conv3(x))
        x = F.relu(self.conv4(x))
        # Flatten the feature maps before the fully connected layers
        x = x.view(-1, 24*24*128)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x
model = SampleModel()
# Move model to GPU
model.to(device)
# Define Loss
criterion = nn.CrossEntropyLoss()
# Define Optimizer
optimizer = optim.Adam(model.parameters())
# optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
# Test Model
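Before training, it is worth a quick sanity check: feed a random image-shaped tensor through the model and confirm that we get one score per class back.
# Test the model with a random input of shape B * D * H * W = 1 * 3 * 32 * 32
input_images = torch.rand(1, 3, 32, 32).to(device)
prediction = model(input_images)
print(prediction.size())   # torch.Size([1, 10]) -> one score for each of the 10 classes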
EPOCH = 2
loss_log = []
acc_log = []
x_coor = []
for epoch in range(EPOCH):
loss_epoch = 0
total_imgs = 0
correct_epoch = 0
for i, data in enumerate(train_loader):
# Get Image data and Label data
imgs, labels = data[0].to(device), data[1].to(device)
# Clear out the gradient
optimizer.zero_grad()
# Forward Propagation Start
# Predict from Input : B * 10
predicts = model(imgs)
# Calculate Loss from batch : 1
loss = criterion(predicts, labels)
# Forward Propagation End
# Backward Propagation Start
# Calculate Gradient
loss.backward()
# Update Model parameters with optimizer : Adam or SGD
optimizer.step()
# Backward Propagation End
# Add to Epoch Loss
loss_epoch += loss.item()
        # Accumulate the total number of images processed so far in this epoch
total_imgs += len(imgs)
# Count the total number of correct prediction
correct_batch = (torch.argmax(predicts, 1)==labels).sum().item()
correct_epoch += correct_batch
acc_batch = correct_batch/ len(imgs)
        # Log the batch loss and accuracy for plotting with matplotlib
x_coor.append((i*len(imgs))+(epoch*len(train_dataset)))
loss_log.append(loss.item())
acc_log.append(acc_batch)
acc_epoch = (correct_epoch/total_imgs)*100
loss_epoch = loss_epoch/total_imgs
print(f"EPOCH : {epoch+1}, Acc : {acc_epoch:.2f}, Loss : {loss_epoch:.2f}")
plt.plot(x_coor, loss_log, label = 'training_loss')
plt.legend()
plt.show()
plt.plot(x_coor, acc_log, label = 'training_acc')
plt.legend()
plt.show()
TensorBoard
TensorBoard provides the visualization and tooling needed for machine learning experimentation.
During research, or when putting a model into production, we want graphs showing the performance of our model. TensorBoard offers logging of the training loss and accuracy, visualization of the model graph, and much more. You can also visit this link for more information.
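As a minimal sketch of the logging pattern used in the cells below (the folder name log_demo, the tag, and the scalar values here are only placeholders):
from torch.utils.tensorboard import SummaryWriter

demo_writer = SummaryWriter('log_demo')        # events are written into the ./log_demo folder
demo_writer.add_scalar('Loss/Train', 0.7, 0)   # add_scalar(tag, scalar_value, global_step)
demo_writer.add_scalar('Loss/Train', 0.5, 1)
demo_writer.close()                            # flush the pending events to disk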
model = SampleModel().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters())
# Import tensorboard library
from torch.utils.tensorboard import SummaryWriter
EPOCH = 2
# This creates the folder where the logs will be stored.
writer = SummaryWriter('log')
for epoch in range(EPOCH):
loss_epoch = 0
total_imgs = 0
correct_epoch = 0
for i, data in enumerate(train_loader):
# Get Image data and Label data
imgs, labels = data[0].to(device), data[1].to(device)
# Clear out the gradient
optimizer.zero_grad()
# Forward Propagation Start
# Predict from Input : B * 10
predicts = model(imgs)
# Calculate Loss from batch : 1
loss = criterion(predicts, labels)
# Forward Propagation End
# Backward Propagation Start
# Calculate Gradient
loss.backward()
# Update Model parameters with optimizer : Adam or SGD
optimizer.step()
# Backward Propagation End
# Add to Epoch Loss
loss_epoch += loss.item()
        # Accumulate the total number of images processed so far in this epoch
total_imgs += len(imgs)
# Count the total number of correct prediction
correct_batch = (torch.argmax(predicts, 1)==labels).sum().item()
correct_epoch += correct_batch
acc_batch = correct_batch/ len(imgs)
# Adding to tensorboard
writer.add_scalar('Loss/Train', loss.item(), (i*len(imgs))+(epoch*len(train_dataset)))
writer.add_scalar('Accuracy/Train', acc_batch, (i*len(imgs))+(epoch*len(train_dataset)))
acc_epoch = (correct_epoch/total_imgs)*100
loss_epoch = loss_epoch/total_imgs
print(f"EPOCH : {epoch+1}, Acc : {acc_epoch:.2f}, Loss : {loss_epoch:.2f}")
writer.add_graph(model, imgs)
%load_ext tensorboard
%tensorboard --logdir log
from torch.utils.tensorboard import SummaryWriter
EPOCH = 2
learning_rate = [0.1, 0.01, 0.001]
for lr in learning_rate:
    # Re-initialize the model, loss, and optimizer for each learning rate
model = SampleModel().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=lr)
writer = SummaryWriter(comment = f"lr_{lr}")
for epoch in range(EPOCH):
loss_epoch = 0
total_imgs = 0
correct_epoch = 0
for i, data in enumerate(train_loader):
# Get Image data and Label data
imgs, labels = data[0].to(device), data[1].to(device)
# Clear out the gradient
optimizer.zero_grad()
# Forward Propagation Start
# Predict from Input : B * 10
predicts = model(imgs)
# Calculate Loss from batch : 1
loss = criterion(predicts, labels)
# Forward Propagation End
# Backward Propagation Start
# Calculate Gradient
loss.backward()
# Update Model parameters with optimizer : Adam or SGD
optimizer.step()
# Backward Propagation End
# Add to Epoch Loss
loss_epoch += loss.item()
        # Accumulate the total number of images processed so far in this epoch
total_imgs += len(imgs)
# Count the total number of correct prediction
correct_batch = (torch.argmax(predicts, 1)==labels).sum().item()
correct_epoch += correct_batch
acc_batch = correct_batch/ len(imgs)
# Adding to tensorboard
writer.add_scalar('Loss/Train', loss.item(), (i*len(imgs))+(epoch*len(train_dataset)))
writer.add_scalar('Accuracy/Train', acc_batch, (i*len(imgs))+(epoch*len(train_dataset)))
acc_epoch = (correct_epoch/total_imgs)*100
loss_epoch = loss_epoch/total_imgs
print(f"LR : {lr}, EPOCH : {epoch+1}, Acc : {acc_epoch:.2f}, Loss : {loss_epoch:.2f}")
%reload_ext tensorboard
%tensorboard --logdir runs
!zip -r /content/runs.zip /content/runs
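If you are running this notebook on Colab, one way to download the zipped logs to your machine (this assumes the google.colab environment):
# Download the zipped logs to the local machine (Colab only)
from google.colab import files
files.download('/content/runs.zip')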
ckpt_path = 'checkpoint.pt'
# model.parameters() gives an iterator over the parameter tensors only, while model.state_dict() maps each parameter (and buffer) name to its tensor; the state_dict is what we save.
torch.save(model.state_dict(), ckpt_path)
model = SampleModel().to(device)
checkpoint = torch.load(ckpt_path)
model.load_state_dict(checkpoint)
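To see the difference mentioned in the comment above, as a quick sketch, you can compare the two directly:
# state_dict(): an ordered dict mapping each parameter (and buffer) name to its tensor
for name, tensor in model.state_dict().items():
    print(name, tuple(tensor.size()))

# parameters(): an iterator over the parameter tensors only (no names), used by the optimizer
print(sum(p.numel() for p in model.parameters()), "trainable values in total")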
num_total = 0
num_correct = 0
# Switch the model to evaluation mode (affects Dropout and Batch-Normalization layers).
model.eval()
# Do not store gradient info in forward propagation.
with torch.no_grad():
for i, data in enumerate(test_loader):
image, label = data
image = image.to(device)
label = label.to(device)
predict = model(image)
predict_class = torch.argmax(predict, dim=1)
correct = (predict_class == label)
num_correct += correct.sum().item()
num_total += len(image)
print(num_correct, num_total)
print("Accuracy : ", num_correct/num_total)
def test_model(model):
    # Fetch one batch of test images (the test loader uses a batch size of 8)
    data = next(iter(test_loader))
    imgs, labels = data[0].to(device), data[1]
    predicts = model(imgs)
    print(predicts.size())
    # Predicted class index for each image in the batch
    index = torch.argmax(predicts, dim=1)
    titles = [class_names[i] for i in index]
    plt.figure(figsize=(20, 10))
    for i in range(len(titles)):
        title = f"Predict : {titles[i]}, Actual : {class_names[labels[i]]}"
        # Blue title for a correct prediction, red for a wrong one
        color = 'blue' if titles[i] == class_names[labels[i]] else 'red'
        # Undo the normalization so the image displays correctly
        img = (imgs[i].cpu().numpy() / 2) + 0.5
        plt.subplot(2, 4, i + 1)
        plt.imshow(img.transpose(1, 2, 0))
        plt.title(title, fontdict={'fontsize': 17, 'color': color})
test_model(model)