Types of Pre-Trained Models

There are three well-known and widely used pre-trained models:

  • VGG-16
  • Inception V3
  • ResNet

1. VGG-16 | Pre-Trained Model

VGG16 is a convolutional neural network model proposed by K. Simonyan and A. Zisserman of the University of Oxford in the paper “Very Deep Convolutional Networks for Large-Scale Image Recognition.” The pre-trained model achieves 92.7% top-5 test accuracy on ImageNet, a dataset of over fourteen million images belonging to one thousand classes. It was one of the famous models submitted to ILSVRC-2014.

It improves on AlexNet by replacing the large kernel-sized filters (11×11 and 5×5 in the first and second convolutional layers, respectively) with multiple 3×3 kernel-sized filters stacked one after another. VGG16 was trained for weeks on NVIDIA Titan Black GPUs.
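
To see why this helps, here is a quick back-of-the-envelope sketch (assuming, for illustration, C input and C output channels and ignoring biases): two stacked 3×3 convolutions cover the same 5×5 receptive field as a single 5×5 convolution, but with fewer parameters and an extra non-linearity in between.

```python
# Rough parameter comparison (assumption: C input channels, C output channels, no biases).
C = 64

params_5x5 = 5 * 5 * C * C            # one 5x5 convolution
params_two_3x3 = 2 * (3 * 3 * C * C)  # two stacked 3x3 convolutions, same 5x5 receptive field

print(params_5x5)      # 102400
print(params_two_3x3)  # 73728 -> roughly 28% fewer parameters
```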

Structure

The input to the network is an image of dimensions (224, 224, 3). The first two layers have 64 channels, each using 3×3 convolution filters with same padding. After a max-pooling layer of stride (2, 2), there are two convolution layers with 128 filters of size (3, 3), followed by another max-pooling layer of stride (2, 2), identical to the previous one. Then come three convolution layers with 256 filters of size (3, 3). After that, there are two sets of three convolution layers, each followed by a max-pooling layer.

Each of these has 512 filters of size (3, 3) with same padding. The resulting feature map is then passed to a stack of two convolution layers. In these convolution and max-pooling layers, the filters used are 3×3 instead of 11×11 as in AlexNet and 7×7 as in ZF-Net. In some layers, 1×1 convolutions are also used to manipulate the number of input channels. A padding of one pixel (same padding) is applied after every convolution layer to preserve the spatial features of the image. (Rezaee, M. et al., 2018)
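
As a concrete illustration, the pre-trained VGG16 described above can be loaded through Keras. This is only a sketch, assuming TensorFlow 2.x is installed; the ImageNet weights are downloaded on first use.

```python
# Sketch: load VGG16 with ImageNet weights and inspect the layer stack described above.
from tensorflow.keras.applications import VGG16

model = VGG16(weights="imagenet", include_top=True)  # expects (224, 224, 3) inputs
model.summary()  # shows the 3x3 conv blocks (64, 128, 256, 512, 512) and the max-pooling layers
```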

2. Inceptionv3 | Pre-Trained Model

Inception v3 is a sophisticated neural network that supports image recognition and object detection, and was introduced as a module of GoogLeNet. It is the third version of Google’s convolutional neural network, originally presented during the ImageNet Recognition Challenge. Just as ImageNet can be thought of as a catalog of classified visual objects, Inception helps identify objects in the field of computer vision.

Structure:

One such use is in the life sciences, where it assists in the study of cancer. The original name (Inception) was chosen as a codename after the famous “we need to go deeper” internet meme went viral, referencing Christopher Nolan’s film Inception. (Liu, Z. et al., 2020)

There are several typos in the Inception-v3 paper that have contributed to inaccurate explanations of the Inception models. These may have been due to the intense ILSVRC competition at the time. As a consequence, there are many reviews on the internet that mix up v2 and v3. Some reviewers also assume that v2 and v3 are identical apart from a few minor differences in settings.
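
Since the block structure itself is not spelled out above, here is a minimal sketch of the general Inception idea rather than the exact Inception-v3 module: several convolution branches with different filter sizes run in parallel and their outputs are concatenated along the channel axis (assuming TensorFlow 2.x Keras).

```python
# Minimal Inception-style block (illustrative sketch, not the exact Inception-v3 module).
from tensorflow.keras import layers, Input, Model

def naive_inception_block(x, filters=32):
    # Parallel branches with different receptive fields.
    b1 = layers.Conv2D(filters, (1, 1), padding="same", activation="relu")(x)
    b3 = layers.Conv2D(filters, (3, 3), padding="same", activation="relu")(x)
    b5 = layers.Conv2D(filters, (5, 5), padding="same", activation="relu")(x)
    bp = layers.MaxPooling2D((3, 3), strides=1, padding="same")(x)
    # Concatenate the branch outputs along the channel axis.
    return layers.Concatenate(axis=-1)([b1, b3, b5, bp])

inputs = Input(shape=(299, 299, 3))   # Inception-v3 itself works on 299x299 inputs
outputs = naive_inception_block(inputs)
Model(inputs, outputs).summary()
```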

3. ResNet:

ResNet took the deep learning world by storm in 2015 as the first neural network that could train hundreds or even thousands of layers without succumbing to the “vanishing gradient” problem. Keras makes it straightforward to build ResNet models: you can load pre-trained ResNet variants on ImageNet with only one line of code, or build your own custom ResNet implementation. You can speed up the process with MissingLink’s deep learning platform, which automates the preparation, delivery, and monitoring of ResNet.
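
For example, loading a pre-trained ResNet variant really does come down to a single call in Keras. This is a sketch assuming TensorFlow 2.x, with ResNet50 chosen as the variant.

```python
# Sketch: one-line pre-trained ResNet in Keras (TensorFlow 2.x assumed; ResNet50 as the variant).
from tensorflow.keras.applications import ResNet50

model = ResNet50(weights="imagenet")  # downloads the ImageNet weights on first use
```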

Usually, to solve a complex problem, we stack additional layers in a deep neural network, which improves accuracy and efficiency. The intuition behind adding more layers is that they progressively learn more complex features. For example, in image recognition, the first layer may learn to detect edges, the second layer may learn to recognize textures, the third layer may learn to detect objects, and so on.
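
One way to see this layer-by-layer progression is to read out intermediate feature maps from a pre-trained network. The sketch below assumes TensorFlow 2.x Keras and uses the layer names of its bundled VGG16; a random array stands in for a real preprocessed image.

```python
# Sketch: inspect early vs. late feature maps of a pre-trained VGG16.
import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras import Model

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
# 'block1_conv2' responds mostly to edges and colours; 'block5_conv3' to object-like parts.
probe = Model(base.input, [base.get_layer("block1_conv2").output,
                           base.get_layer("block5_conv3").output])

image = np.random.rand(1, 224, 224, 3).astype("float32")  # stand-in for a real preprocessed image
early, late = probe.predict(image)
print(early.shape, late.shape)  # (1, 224, 224, 64) and (1, 14, 14, 512)
```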

However, it has been found that there is a maximum depth threshold for the traditional convolutional neural network model: beyond a certain depth, adding more layers no longer helps and can even hurt performance.

Structure:

ResNet’s skip connections solve the problem of vanishing gradients in deep neural networks by allowing the gradient to flow through an alternative shortcut path. These connections also help by encouraging the pre-trained model to learn identity functions, which guarantees that a higher layer performs at least as well as a lower layer, and not worse. (Chen, Z. et al., 2017)

Say we have a shallow network and a deep network that map an input ‘x’ to an output ‘y’ using H(x). We want the deep network to perform at least as well as the shallow network and not degrade performance, as happens in plain neural networks (without residual blocks). One way of achieving this is for the extra layers in the deep network to learn the identity function: their output then equals their input, so the additional layers cannot degrade performance.

It has been shown that residual blocks make it remarkably easy for layers to learn identity functions, as the following formulas show. In a plain network, the output is

H(x) = f(x)

so to learn an identity function, f(x) must equal x, which is hard to achieve. In the case of ResNet, the block’s output is

H(x) = f(x) + x

Setting f(x) = 0 gives

H(x) = 0 + x = x

We only need to make f(x) = 0, which is much simpler, and we get x as the output, which is also our input.
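
The same idea reads almost literally in code: a residual block computes f(x) and adds the input back, so if the convolutions learn to output zeros, the block reduces to the identity. The following is a minimal sketch assuming TensorFlow 2.x Keras, not the exact ResNet block.

```python
# Minimal residual block: H(x) = f(x) + x (illustrative sketch).
from tensorflow.keras import layers, Input, Model

def residual_block(x, filters=64):
    # f(x): two 3x3 convolutions
    f = layers.Conv2D(filters, (3, 3), padding="same", activation="relu")(x)
    f = layers.Conv2D(filters, (3, 3), padding="same")(f)
    # Skip connection: add the input back, then apply the non-linearity.
    out = layers.Add()([f, x])
    return layers.ReLU()(out)

inputs = Input(shape=(56, 56, 64))  # channel count must match 'filters' for the addition to work
outputs = residual_block(inputs)
Model(inputs, outputs).summary()
```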

In the best-case scenario, the extra layers of the deep network approximate the mapping from input ‘x’ to output ‘y’ better than their shallower counterpart and reduce the error by a considerable amount. We therefore expect ResNet to perform as well as or better than plain deep neural networks. Using ResNet has dramatically increased the performance of networks with many layers, as error-rate comparisons against shallower plain networks show.

References:

Rezaee, M., Zhang, Y., Mishra, R., Tong, F. and Tong, H., 2018, August. Using a vgg-16 network for individual tree species detection with an object-based approach. In 2018 10th IAPR Workshop on Pattern Recognition in Remote Sensing (PRRS) (pp. 1-7). IEEE.

Chen, Z., Xie, Z., Zhang, W. and Xu, X., 2017, August. ResNet and Model Fusion for Automatic Spoofing Detection. In INTERSPEECH (pp. 102-106). 

Liu, Z., Yang, C., Huang, J., Liu, S., Zhuo, Y. and Lu, X., 2020. Deep learning framework based on integration of S-Mask R-CNN and Inception-v3 for ultrasound image-aided diagnosis of prostate cancer. Future Generation Computer Systems, 114, pp.3

Written By: Saarika R Nair

Reviewed By: Rushikesh Lavate
