In this blog post I would like to give you a clear blueprint for building your own CNN from scratch. Doesn't that sound interesting? Okay then, what are we waiting for? Let's get right into it.

Convolutional neural networks (CNNs), a.k.a. ConvNets, are a widely used deep neural network (DNN) architecture for image classification. They have become the gold standard for the task ever since Alex Krizhevsky, Geoff Hinton, and Ilya Sutskever won ImageNet in 2012.

I personally love Keras, a deep learning library for Python created by François Chollet, for building CNN models.

What do I need to implement a classification model?

  • input dataset
  • a classification model

That's it. If you have a dataset with a sufficient number of samples and their corresponding labels, you can create a classification model (a CNN here) and train it. You can then test the model on unseen input samples to see how it performs.

Okay, but where do I get a dataset?

As always, you can create one on your own or download one from various sources. But I promised to keep it simple; my aim is to let you try it yourself and get a feel for it. So you can get the dataset from your Python code itself, without having to search for it, download it by hand, or do heavy pre-processing before feeding the model. Keras provides some well-known datasets to get started, such as MNIST, CIFAR-10, the IMDB dataset, and the Boston housing dataset.

Here we use MNIST, a database of handwritten digits from 0 to 9. It has a training set of 60,000 examples and a test set of 10,000 examples, spread over 10 classes. We chose it because it is a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal effort on preprocessing and formatting.

Loading the dataset is very simple. The call also splits the dataset into training and test sets: the training set is used to train our model, and the test set is used to measure its performance.

The following line of Python code does the job:

(x_train, y_train), (x_test, y_test) = mnist.load_data()

figure_1
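The model further down refers to an `input_shape`, and categorical cross-entropy expects one-hot labels, so the loaded arrays need a little preparation first. Here is a minimal sketch of that step, assuming channels-last image ordering (you can check yours with `K.image_data_format()`). The helper names `prepare_images` and `one_hot` are my own, not part of Keras, and the arrays are random stand-ins shaped like a small MNIST batch rather than real data. In Keras itself, `keras.utils.to_categorical` does the one-hot step.

```python
import numpy as np

NUM_CLASSES = 10  # digits 0 to 9

def prepare_images(x):
    """Scale uint8 pixels to [0, 1] and add a trailing channel axis."""
    x = x.astype("float32") / 255.0
    return x.reshape(x.shape[0], x.shape[1], x.shape[2], 1)

def one_hot(labels, num_classes=NUM_CLASSES):
    """Turn integer class labels into one-hot rows."""
    return np.eye(num_classes, dtype="float32")[labels]

# Random stand-in arrays shaped like a tiny MNIST batch (NOT real data).
rng = np.random.default_rng(0)
x_demo = rng.integers(0, 256, size=(5, 28, 28)).astype(np.uint8)
y_demo = np.array([3, 0, 9, 1, 7])

x_ready = prepare_images(x_demo)
y_ready = one_hot(y_demo)
input_shape = x_ready.shape[1:]  # shape of one sample: (28, 28, 1)

print(x_ready.shape, y_ready.shape, input_shape)
```

Applying the same two helpers to `x_train`, `x_test`, `y_train`, and `y_test` would give arrays in the form the rest of the post assumes.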

Okay, I have the data. How about a model?

In order to create a model you need a few Python packages installed on your machine.

Here are the requirements (see how to install packages):

numpy==1.11.1
pandas==0.18.1
h5py==2.6.0
matplotlib==1.5.1
Pillow==4.1.1
cairocffi==0.8.0
editdistance==0.3.1
keras==2.0.4
scipy==0.19.0
six==1.10.0
scikit_learn==0.18.1
theano==0.9.0
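If you save the list above to a file (say `requirements.txt`, a file name I am assuming here), `pip install -r requirements.txt` installs everything in one go. Afterwards, a quick check like the sketch below tells you what actually ended up importable. The function name `installed_version` is hypothetical, and note that a few import names differ from the package names: Pillow is imported as `PIL` and scikit_learn as `sklearn`.

```python
import importlib

def installed_version(module_name):
    """Return the module's __version__, 'unknown' if it has none,
    or None if the module cannot be imported at all."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # Any failure to import counts as "not usable" for our purposes.
        return None
    return getattr(module, "__version__", "unknown")

def report(module_names):
    """Print one line per module with its version (or None if missing)."""
    for name in module_names:
        print(name, installed_version(name))

if __name__ == "__main__":
    # Import names for the packages listed above.
    report(["numpy", "pandas", "h5py", "matplotlib", "PIL", "cairocffi",
            "editdistance", "keras", "scipy", "six", "sklearn", "theano"])
```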

Once you have all the packages installed, you can start creating the model by importing the required modules and functions.

import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K

It’s time to create a model now.

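# input_shape is the shape of one sample, e.g. (28, 28, 1) for MNIST
# with channels-last image ordering.
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu',
                 input_shape=input_shape))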
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))

The network topology can be summarized as follows.

  1. The first hidden layer is a convolutional layer, Conv2D. It has 32 feature maps, a filter size of 3×3, and a rectifier (ReLU) activation function. Because it is the first layer, it also declares the input shape, expecting images structured as [img_rows], [img_cols], [channels].
  2. Next is another convolutional layer with 64 feature maps, a filter size of 3×3, and a rectifier activation function.
  3. Next we define a pooling layer that takes the maximum, called MaxPooling2D. It is configured with a pool size of 2×2.
  4. The next layer is a regularization layer using dropout, called Dropout. It is configured to randomly exclude 25% of neurons in the layer in order to reduce overfitting.
  5. Next is a layer called Flatten, which converts the 2D matrix data to a vector. It allows the output to be processed by standard fully connected layers.
  6. Next is a fully connected layer with 128 neurons and a rectifier activation function.
  7. Next is another Dropout layer, this time configured to randomly exclude 50% of neurons in the layer in order to reduce overfitting.
  8. Finally, the output layer has 10 neurons for the 10 classes and a softmax activation function to output probability-like predictions for each class.
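To see how the layers listed above fit together, it helps to work out the output shape and the number of trainable parameters at each step. The sketch below does that arithmetic in plain Python, assuming a 28×28×1 input and the Keras defaults for this model (no padding, stride 1 for the convolutions, and a pooling stride equal to the pool size). The helper names are mine, for illustration only.

```python
def window_output(size, window, stride=1):
    """Output length along one axis for an unpadded ('valid') sliding window."""
    return (size - window) // stride + 1

def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units

side, channels = 28, 1  # assumed MNIST input: 28x28 pixels, 1 channel
rows = []

side = window_output(side, 3)
rows.append(("Conv2D(32, 3x3)", (side, side, 32), conv_params(3, channels, 32)))
channels = 32

side = window_output(side, 3)
rows.append(("Conv2D(64, 3x3)", (side, side, 64), conv_params(3, channels, 64)))
channels = 64

side = window_output(side, 2, stride=2)
rows.append(("MaxPooling2D(2x2)", (side, side, channels), 0))
rows.append(("Dropout(0.25)", (side, side, channels), 0))

flat = side * side * channels
rows.append(("Flatten", (flat,), 0))
rows.append(("Dense(128)", (128,), dense_params(flat, 128)))
rows.append(("Dropout(0.5)", (128,), 0))
rows.append(("Dense(10)", (10,), dense_params(128, 10)))

total_params = sum(params for _, _, params in rows)
for name, shape, params in rows:
    print(name.ljust(20), str(shape).ljust(16), params)
print("Total parameters:", total_params)
```

Almost all of the parameters sit in the first fully connected layer, because it connects every one of the 9,216 flattened values to each of its 128 neurons.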

The figure below is not an exact replica of the model we created, but it is a simple example to illustrate what the network architecture looks like.

[Figure: example CNN architecture]
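To make the convolution and pooling steps concrete, here is a small NumPy sketch of what a single 3×3 filter and a 2×2 max pool do to one single-channel image. It is a simplified illustration, not how Keras implements it: Conv2D applies the same kind of sliding-window sum (without flipping the kernel) across many filters and channels at once, and adds a bias. The function names are my own.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Rectifier activation: negative values become zero."""
    return np.maximum(x, 0)

def max_pool_2x2(x):
    """Keep the largest value in each non-overlapping 2x2 block."""
    h, w = (x.shape[0] // 2) * 2, (x.shape[1] // 2) * 2
    blocks = x[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)  # toy 6x6 "image"
kernel = np.ones((3, 3))                          # toy 3x3 filter

features = relu(conv2d_valid(image, kernel))  # shape (4, 4)
pooled = max_pool_2x2(features)               # shape (2, 2)
print(features.shape, pooled.shape)
```

On a 28×28 digit the same functions would give a 26×26 feature map after one convolution, which matches the shrinking sizes you see in the real model.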

The model now has to be compiled and trained on the training data. Since we are dealing with a multi-class problem, we use categorical cross-entropy as the loss, which expects one-hot encoded labels. The optimizer here is Adadelta, but you can try another optimizer of your choice, and we track accuracy as our metric.

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=128,
          epochs=12,
          verbose=1,
          validation_data=(x_test, y_test))

score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
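If you are wondering what the two printed numbers actually measure, here is a rough NumPy sketch of categorical cross-entropy and accuracy computed by hand on two made-up predictions. The values in `y_pred` are invented for illustration, and the function names are mine; Keras computes the same quantities internally, averaged over the whole test set.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(true * log(predicted))."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

def accuracy(y_true, y_pred):
    """Fraction of samples whose most probable class is the true class."""
    hits = np.argmax(y_true, axis=1) == np.argmax(y_pred, axis=1)
    return float(np.mean(hits))

# Two samples, three classes: one-hot labels and made-up softmax outputs.
y_true = np.array([[1, 0, 0],
                   [0, 0, 1]], dtype="float32")
y_pred = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.3, 0.6]], dtype="float32")

loss = categorical_crossentropy(y_true, y_pred)
acc = accuracy(y_true, y_pred)
print("loss:", loss, "accuracy:", acc)
```

The loss only looks at the probability the model gave to the correct class, so a confident correct answer is rewarded more than a hesitant one, even though both count the same for accuracy.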

Below is a screenshot of my results. I got a test accuracy of 98%, which is pretty good, yet we can improve it further.

mnist_cnn_output

Exercise: Try modifying the model and share your findings back here. I'm curious to see what you have tried.

Download Code here

 

Credits: François Chollet, creator of Keras.

2 thoughts on “Your First Convolutional Neural Network (CNN)”

  1. Good information, Shravan, but I am confused by the different layers you are using.
    What do neural networks provide us, and how do convolutional neural networks differ from the normal ones?

  2. Hey Kushal, first of all, thank you. In general, a neural network with only one hidden layer can in principle approximate a very wide range of patterns; if you increase the number of hidden layers between the input layer and the output layer, it is called a deep network. In this blog post we employed 2D convolutions, which convolve over a two-dimensional input (in our case, MNIST digit images of size 28×28). I hope that answers part of your question. I didn't include any content about the individual layers, to avoid verbosity, but I will consider it for my next blog post.

    Thanks
