This notebook distinguishes between two handwritten digits (3 and 8) based on training on the MNIST database.

Importing all the necessary libraries
%load_ext autoreload
%autoreload 2
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
!pip install mnist
Requirement already satisfied: mnist in c:\anaconda3\lib\site-packages (0.2.2)
Requirement already satisfied: numpy in c:\anaconda3\lib\site-packages (from mnist) (1.18.1)
Preparing the Data
import mnist
train_images = mnist.train_images()
train_labels = mnist.train_labels()
train_images.shape, train_labels.shape
((60000, 28, 28), (60000,))
test_images = mnist.test_images()
test_labels = mnist.test_labels()
test_images.shape, test_labels.shape
((10000, 28, 28), (10000,))
image_index = 7776 # You may select any index from 0 to 59,999
print(train_labels[image_index]) 
plt.imshow(train_images[image_index], cmap='Greys')
2
<matplotlib.image.AxesImage at 0x1dd4b0c9c88>
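To eyeball a few more samples at once, a small helper (a minimal sketch using only matplotlib, not anything from the notebook's later code) tiles the first few training images:

fig, axes = plt.subplots(1, 5, figsize=(10, 2))
for i, ax in enumerate(axes):
    ax.imshow(train_images[i], cmap='Greys')  # one digit per panel
    ax.set_title(train_labels[i])
    ax.axis('off')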

Filter the data to keep only the digits 3 and 8

train_filter = np.where((train_labels == 3 ) | (train_labels == 8))
test_filter = np.where((test_labels == 3) | (test_labels == 8))
X_train, y_train = train_images[train_filter], train_labels[train_filter]
X_test, y_test = test_images[test_filter], test_labels[test_filter]

We normalize the pixel values to the 0 to 1 range

X_train = X_train/255.
X_test = X_test/255.

And set up the labels as 1 (when the digit is 3) and 0 (when the digit is 8)

y_train = 1*(y_train==3)
y_test = 1*(y_test==3)
X_train.shape, X_test.shape
((11982, 28, 28), (1984, 28, 28))
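As a quick sanity check (a minimal sketch using only standard NumPy), we can confirm the two classes are roughly balanced after filtering:

# Counts of each binary label; expect a roughly even 0/1 (i.e., 8/3) split
print(np.unique(y_train, return_counts=True))
print(np.unique(y_test, return_counts=True))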

We reshape the data to flatten the image pixels into a set of features or covariates:

X_train = X_train.reshape(X_train.shape[0], -1)
X_test = X_test.reshape(X_test.shape[0], -1)
X_train.shape, X_test.shape
((11982, 784), (1984, 784))
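To confirm the flattening is lossless, we can reshape a row back to 28x28 and display it (a quick sketch; the index chosen is arbitrary):

# Round-trip check: a flattened row restored to a 28x28 image
plt.imshow(X_train[0].reshape(28, 28), cmap='Greys');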
#libraries
from kudzu.data import Data, Dataloader, Sampler
from kudzu.callbacks import ClfCallback
from kudzu.loss import MSE
from kudzu.layer import Affine, Sigmoid, Relu
from kudzu.model import Model
from kudzu.optim import GD
from kudzu.train import Learner
class Config:
    pass
config = Config()
config.lr = 0.001        # learning rate
config.num_epochs = 250  # training epochs
config.bs = 50           # minibatch size
#data initialization
data = Data(X_train, y_train.reshape(-1,1))
loss = MSE()
opt = GD(config.lr)
sampler = Sampler(data, config.bs, shuffle=True)
dl = Dataloader(data, sampler)
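As a hedged sanity check (this assumes the kudzu Dataloader supports Python iteration, which train_loop below suggests but the source does not show), we can peek at one minibatch:

# Peek at one minibatch; shapes should be (50, 784) and (50, 1) given config.bs = 50
xb, yb = next(iter(dl))
print(xb.shape, yb.shape)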
#containers
training_xdata = X_train
testing_xdata = X_test
training_ydata = y_train.reshape(-1,1)
testing_ydata = y_test.reshape(-1,1)
#NN model initialization: 784 -> 100 -> 100 -> 2 -> 1 architecture
layers = [Affine("first", 784, 100), Relu("first"),
          Affine("second", 100, 100), Relu("second"),
          Affine("third", 100, 2),
          Affine("final", 2, 1), Sigmoid("final")]

model_neural = Model(layers)
model_logistic = Model([Affine("logits", 784, 1), Sigmoid("sigmoid")])
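A quick worked comparison of model sizes, hand-computed from the layer shapes above (weights plus biases per Affine layer):

# Neural network: 784->100->100->2->1
nn_params = (784*100 + 100) + (100*100 + 100) + (100*2 + 2) + (2*1 + 1)
# Logistic regression: 784->1
lr_params = 784*1 + 1
print(nn_params, lr_params)  # 88805 vs 785 parameters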

learner1 = Learner(loss, model_neural, opt, config.num_epochs)
acc1 = ClfCallback(learner1, config.bs, training_xdata , testing_xdata, training_ydata, testing_ydata)
learner1.set_callbacks([acc1])
#learner call
learner1.train_loop(dl)
Epoch 0 Loss 0.2736260500679626
train accuracy is: 0.4356534802203305, test accuracy is 0.4596774193548387
Epoch 10 Loss 0.10898970213675427
train accuracy is: 0.9128693039559339, test accuracy is 0.9269153225806451
Epoch 20 Loss 0.061438693976068555
train accuracy is: 0.9376564847270906, test accuracy is 0.9465725806451613
Epoch 30 Loss 0.046799196765135075
train accuracy is: 0.9505090969788015, test accuracy is 0.9561491935483871
Epoch 40 Loss 0.039874034691351735
train accuracy is: 0.9554331497245869, test accuracy is 0.9642137096774194
Epoch 50 Loss 0.03583555451240939
train accuracy is: 0.9591887831747622, test accuracy is 0.9657258064516129
Epoch 60 Loss 0.033149081283470806
train accuracy is: 0.9621932899349024, test accuracy is 0.9657258064516129
Epoch 70 Loss 0.031171271947087378
train accuracy is: 0.9640293773994325, test accuracy is 0.9672379032258065
Epoch 80 Loss 0.029622195845065855
train accuracy is: 0.9657820063428476, test accuracy is 0.9672379032258065
Epoch 90 Loss 0.028363506333614463
train accuracy is: 0.9678684693707228, test accuracy is 0.9682459677419355
Epoch 100 Loss 0.02730636679153363
train accuracy is: 0.9690368886663329, test accuracy is 0.9702620967741935
Epoch 110 Loss 0.02639704377322129
train accuracy is: 0.9694541812719079, test accuracy is 0.9722782258064516
Epoch 120 Loss 0.025596392894720783
train accuracy is: 0.970288766483058, test accuracy is 0.9717741935483871
Epoch 130 Loss 0.024883179582617758
train accuracy is: 0.971457185778668, test accuracy is 0.9712701612903226
Epoch 140 Loss 0.02423449780813126
train accuracy is: 0.972291770989818, test accuracy is 0.9712701612903226
Epoch 150 Loss 0.023642400132849118
train accuracy is: 0.9727090635953931, test accuracy is 0.9712701612903226
Epoch 160 Loss 0.02309271487127649
train accuracy is: 0.9737940243698882, test accuracy is 0.9707661290322581
Epoch 170 Loss 0.022580064438461795
train accuracy is: 0.9738774828910032, test accuracy is 0.9717741935483871
Epoch 180 Loss 0.022087778374958667
train accuracy is: 0.9746286095810383, test accuracy is 0.9727822580645161
Epoch 190 Loss 0.021614784251382343
train accuracy is: 0.9756301118344183, test accuracy is 0.9732862903225806
Epoch 200 Loss 0.02118366949591829
train accuracy is: 0.9761308629611083, test accuracy is 0.9737903225806451
Epoch 210 Loss 0.020754582769080612
train accuracy is: 0.9772158237356035, test accuracy is 0.9737903225806451
Epoch 220 Loss 0.020348892651367842
train accuracy is: 0.9777165748622935, test accuracy is 0.9737903225806451
Epoch 230 Loss 0.01996731932006799
train accuracy is: 0.9782173259889835, test accuracy is 0.9737903225806451
Epoch 240 Loss 0.01959853968485756
train accuracy is: 0.9781338674678685, test accuracy is 0.9737903225806451
0.028509718966256783
#LR model initialization
learner2 = Learner(loss, model_logistic, opt, config.num_epochs)
acc2 = ClfCallback(learner2, config.bs, training_xdata , testing_xdata, training_ydata, testing_ydata)
learner2.set_callbacks([acc2])
#learner call
learner2.train_loop(dl)
Epoch 0 Loss 0.2635071287509881
train accuracy is: 0.5810382240026707, test accuracy is 0.5372983870967742
Epoch 10 Loss 0.10514060051205106
train accuracy is: 0.9100317142380236, test accuracy is 0.9133064516129032
Epoch 20 Loss 0.07995434244529664
train accuracy is: 0.9270572525454849, test accuracy is 0.9349798387096774
Epoch 30 Loss 0.0686117715836812
train accuracy is: 0.9368218995159405, test accuracy is 0.9445564516129032
Epoch 40 Loss 0.061855314486321346
train accuracy is: 0.9406609914872308, test accuracy is 0.9506048387096774
Epoch 50 Loss 0.05728031864902006
train accuracy is: 0.9444166249374061, test accuracy is 0.953125
Epoch 60 Loss 0.05393361606509065
train accuracy is: 0.9465865464863963, test accuracy is 0.9536290322580645
Epoch 70 Loss 0.05135813240139899
train accuracy is: 0.9479218828242364, test accuracy is 0.9566532258064516
Epoch 80 Loss 0.04929724135261386
train accuracy is: 0.9503421799365716, test accuracy is 0.9586693548387096
Epoch 90 Loss 0.04760339542371465
train accuracy is: 0.9514271407110666, test accuracy is 0.9606854838709677
Epoch 100 Loss 0.04617781687185664
train accuracy is: 0.9525121014855616, test accuracy is 0.9621975806451613
Epoch 110 Loss 0.04495648773827633
train accuracy is: 0.9532632281755967, test accuracy is 0.9627016129032258
Epoch 120 Loss 0.04389470594328971
train accuracy is: 0.9536805207811717, test accuracy is 0.9637096774193549
Epoch 130 Loss 0.04296016991172947
train accuracy is: 0.9545151059923218, test accuracy is 0.9647177419354839
Epoch 140 Loss 0.04212959799505519
train accuracy is: 0.9553496912034719, test accuracy is 0.9642137096774194
Epoch 150 Loss 0.04138366377287864
train accuracy is: 0.9559339008512769, test accuracy is 0.9642137096774194
Epoch 160 Loss 0.0407100558707365
train accuracy is: 0.957018861625772, test accuracy is 0.9642137096774194
Epoch 170 Loss 0.04009730731392845
train accuracy is: 0.957352695710232, test accuracy is 0.9652217741935484
Epoch 180 Loss 0.03953559048300741
train accuracy is: 0.957603071273577, test accuracy is 0.9657258064516129
Epoch 190 Loss 0.0390201983233393
train accuracy is: 0.958103822400267, test accuracy is 0.9662298387096774
Epoch 200 Loss 0.03854244600549152
train accuracy is: 0.9586880320480721, test accuracy is 0.9657258064516129
Epoch 210 Loss 0.03810000382484748
train accuracy is: 0.9590218661325322, test accuracy is 0.9667338709677419
Epoch 220 Loss 0.03768685027812864
train accuracy is: 0.9592722416958771, test accuracy is 0.9667338709677419
Epoch 230 Loss 0.037302405439647114
train accuracy is: 0.9596060757803372, test accuracy is 0.9667338709677419
Epoch 240 Loss 0.03694092276161223
train accuracy is: 0.9598564513436821, test accuracy is 0.9667338709677419
0.02761287046103677
#Comparative stats
plt.figure(figsize = (8,5))
plt.plot(acc1.val_accuracies, 'g-', label = "Neural network - test accuracy")
plt.plot(acc1.accuracies, 'b-', label = "Neural network - train accuracy")
plt.plot(acc2.val_accuracies, 'r-', label = "Logistic regression - test accuracy")
plt.plot(acc2.accuracies, 'y-', label = "Logistic regression - train accuracy")
plt.ylim(0.85, 1)  # zoom in on the range where the curves separate
plt.legend()
<matplotlib.legend.Legend at 0x1dd67022148>
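To make the comparison sharper, we can also plot the gap between the two models' test accuracies over epochs (a minimal sketch reusing the callback attributes already plotted above):

plt.figure(figsize = (8,5))
gap = np.array(acc1.val_accuracies) - np.array(acc2.val_accuracies)
plt.plot(gap, 'k-', label = "NN test accuracy - LR test accuracy")
plt.axhline(0, color='gray', linestyle='--')  # above the line, the NN is ahead
plt.legend()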
#2D output: drop the final Affine and Sigmoid to expose the 2-unit layer
model_new = Model(layers[:-2])
plot_testing = model_new(testing_xdata)
plt.figure(figsize=(8,7))
plt.scatter(plot_testing[:,0], plot_testing[:,1], alpha = 0.1, c = y_test.ravel());
#isolating the final affine -> sigmoid head
model_prob = Model(layers[-2:])
xgrid = np.linspace(-4.5, 2, 100)
ygrid = np.linspace(-8, 8, 100)

xg, yg = np.meshgrid(xgrid, ygrid)

xg_interim = np.ravel(xg)
yg_interim = np.ravel(yg)

X_interim = np.vstack((xg_interim, yg_interim))  # note: vstack takes a tuple
X = X_interim.T

probability_contour = model_prob(X).reshape(100,100)
plt.figure(figsize=(8,7))
plt.scatter(plot_testing[:,0], plot_testing[:,1], alpha = 0.1, c = y_test.ravel())
contours = plt.contour(xg,yg,probability_contour)
plt.clabel(contours, inline=True);
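As a final hedged check (the Model object is callable on a full batch, as the 2D projection above shows), we can recompute the neural network's test accuracy directly:

# Threshold the sigmoid outputs at 0.5 and compare against the binary labels
final_preds = 1*(model_neural(testing_xdata) >= 0.5)
print("final test accuracy:", (final_preds == testing_ydata).mean())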

Inference:

From the graph we see that the neural network achieves better accuracy than the logistic regression model. We can also see hints of overfitting in the neural network, as its training and test accuracy curves cross each other and then diverge.