In this tutorial, we will discuss the basics of CNN and by the end of this tutorial, you will be able to design it for your project. and for this tutorial, we will also build a project on cat and dog classifier, so let’s dive in.
What is Convolutional Neural Network and Its need?
CNN is a deep learning architecture, uses the concept of convolutions for feature learning. Although basic neural nets used for feature learning. there are some problems with basic neural nets for Computer Vision tasks.
In Computer vision tasks, we deal with Images. suppose any Image with 500x500x3 pixels. then we give pretty large input. so our learnable parameters will be significantly increased.
it reduces learning. But it is not a single problem alone. there are some other problems also.
suppose we use the basic neural nets but how can we learn some feature which has some importance of location like the node for cats. and paws for dogs but we can’t do it in basic neural nets because all the neurons in any layer are independent.
How CNN works?
Convolution is the 1st layer to extract features from an input image. The connection between pixels preserved – learning image features using small squares of input data. image matrix and a filter or kernel – taken as two inputs.
Consider 5×5 Image whose pixel values are 0, 1 and filter matrix 3×3 as shown in below image
Then the convolution of five x five image matrix multiplies with three x three filter matrix.
“Feature Map” as output is shown below:
Convolution of a picture with totally different filters will perform operations like edge detection, blur and sharpen by applying filters. The below example shows numerous convolution images once applying different types of filters (Kernels).
Basic Operations with CNN –
Strides
Stride is that the variety of pixels shifts over the input matrix. Once the stride is one then we tend to move the filters to one component at a time. Once the stride is two then we tend to move the filters to two pixels at a time so on. The figure below represents that convolution would work with a stride of two.
Padding
Sometimes filters don’t utterly match the input image. we’ve got 2 options:
- Pad the picture with zeros (0-padding) so that it fits
- Drop the part of the image where the filter did not fit.
- valid padding which keeps only the valid part of the image.
Non Linearity (ReLU)
ReLU stands for corrected linear measure for a non-linear operation. The output is
ƒ(x) = max(0,x).
Why ReLU is vital: ReLU’s purpose is to introduce non-linearity in our ConvNet. Since the important world information needs our ConvNet told would be non-negative linear values.
Pooling Layer
The pooling layers section would scale back the number of parameters once the pictures square measure large. spatial pooling knew as subsampling or downsampling that reduces the spatial property of every map however retains vital data.
spatial pooling is often of various types:
- Max Pooling
- Average Pooling
- Sum Pooling
The largest element from the rectified feature map – max pooling. Taking the largest element also takes the average pooling.
sum pooling – the Sum of all elements in the feature map call.
Cat-Dog Classifier using CNN
Till now, we have covered all the basics of convolutional neural nets and some basic operations to perform on it. Now we will use TensorFlow to implement all of these for the cat-dog classifier, so let’s dive in
Step 1) Import All Libraries.
Step 2) Define Model Architecture, we will use a Sequential API from Keras and then add simply Conv2D and MaxPooling Layer as we discussed earlier.
Point 3) Now Let’s write some codes to plot Accuracy and Loss.
Step 4) In this step we will train our model and we will use ImageDataGenerator from tensorflow to generate batches of Images.
Step 5) Let’s make Prediction on your Inputs.
To make a prediction on new images We can use our saved model .
The assumption of the model is that new images are color and they have been segmented in such a way that one image contains at least one dog or cat.
The figure Below shows an image being extracted from the test dataset for the dogs and cats competition.We can clearly tell it is a photo of a dog without a label. .
You can save it in your current working directory with the filename ‘sample_image.jpg‘.
We will faux this is often a completely new and unseen image, ready within the needed approach, and see however we’d use our saved model to predict the whole number that the image represents. For this instance, we have a tendency to expect category “1” for “Dog“.
Note:
the subdirectories of pictures, one for every category, are loaded by the flow_from_directory() operate in alphabetical order associated with an appointed whole number for every category. The directory “cat” comes before “dog“, so the category labels are appointed the integers: cat=0, dog=1. this may be modified via the “classes” argument in occupation flow_from_directory() once coaching the model.
First, we will load the image and force it to the scale to be 224×224 pixels. The loaded image will then be resized to possess one sample in an exceedingly dataset. The constituent values should even be targeted to match the approach that the information was ready throughout the coaching of the model. The load_image() operator implements this and can come the loaded image prepared for classification.
Next, we can load the model as in the previous section and call the predict() function to predict the content in the image as a number between “0” and “1” for “cat” and “dog” respectively.
Running the example first loads and prepares the image, loads the model, and then correctly predicts that the loaded image represents a ‘dog‘ or class ‘1‘.
Conclusion
That’s it, You have learned the CNN and how it works and also you have completed the cat-dog classifier, Now finally you are able to implement this CNN in some other project also. Keep Learning!
Article by: Gajesh Ladhar
If you are Interested In Machine Learning You Can Check Machine Learning Internship Program
Also Check Other Technical And Non Technical Internship Programs