Transfer Learning is a machine learning technique in which part of a model that has been trained to perform a particular task is reused to perform a similar task. This technique is typically used when there is not enough data to train a model for the new task from scratch. Using pre-trained models can dramatically reduce the training time for the new task.
A neural network is first trained on a base dataset for a given task; the weights from this model are then repurposed in a new neural network, which is trained for the new task. This works best when the features relevant to the two tasks are general. For example, consider training a CNN model to classify images of lions and tigers. The initial layers of this model can then be reused to build a model that classifies images of cats and dogs, as in the sketch below.
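As a rough illustration, here is a minimal TensorFlow/Keras sketch of this idea. The layer sizes, image dimensions, and the lion/tiger and cat/dog datasets are illustrative assumptions, not a prescribed architecture:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Model for Task A: classify lions vs. tigers (illustrative architecture).
task_a_model = models.Sequential([
    layers.Conv2D(32, 3, activation="relu", input_shape=(128, 128, 3)),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # lion vs. tiger
])
# ... assume task_a_model is compiled and trained on the Task A data here ...

# Model for Task B: classify cats vs. dogs, reusing Task A's convolutional base.
task_b_model = models.Sequential(
    task_a_model.layers[:4]  # reuse the trained convolutional layers
    + [
        layers.Flatten(),
        layers.Dense(64, activation="relu"),   # new head, trained from scratch
        layers.Dense(1, activation="sigmoid"), # cat vs. dog
    ]
)
for layer in task_b_model.layers[:4]:
    layer.trainable = False  # freeze the transferred layers

task_b_model.compile(optimizer="adam", loss="binary_crossentropy",
                     metrics=["accuracy"])
```

Only the new dense head is updated during training on Task B; the transferred layers keep the feature detectors learned on Task A.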
APPLICATIONS | Transfer Learning
- TEXT
In Natural Language Processing, text data is used for prediction and analysis tasks. The text is first preprocessed using various techniques, and then an ML or deep learning model is trained to perform a specific task such as sentiment analysis, named entity recognition, or document classification. Transfer learning comes into play when a model trained on historical texts is reused for data like social media messages and news articles (a minimal sketch follows this list).
- IMAGE
Computer Vision is widely used for tasks like object detection and image classification. Images are typically normalized and grayscaled before CNN models are trained to achieve the desired results. Transfer learning has immense application in Computer Vision projects. The ImageNet database is commonly mentioned in discussions of transfer learning for images, and we will discuss it further in this post.
- AUDIO
Audio analysis works with audio signals to perform tasks like speech recognition and sentiment analysis. Transfer learning is used extensively for audio-based tasks as well.
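As a concrete illustration of the text use case above, the snippet below assumes the Hugging Face transformers library is installed. Its pipeline helper downloads a model pre-trained on large text corpora and reuses it directly for sentiment analysis, which is transfer learning with no extra training at all:

```python
from transformers import pipeline

# Reuse a pre-trained language model for sentiment analysis.
classifier = pipeline("sentiment-analysis")
print(classifier("Transfer learning makes NLP projects much faster to build."))
```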
Transfer Learning
Transfer learning means using a section of a pre-trained model for a new, specific task. Solving complex problems is strenuous in the absence of sufficient data, and accumulating large amounts of data delays production time. ImageNet is a great example with which to explain transfer learning: it is an openly available image database that consists of approximately 1.2 million labelled images across various categories.
Consider Task A, for which a Convolutional Neural Network is trained on the ImageNet database. The model for Task A begins with convolutional layers whose kernels act as feature detectors; we then flatten the resulting feature maps and pass them on to dense layers. These dense layers are specific to Task A.
Now, say we want to build a new model for Task B. This is where transfer learning comes in: we lift the initial convolutional layers from the Task A model and use them in the Task B model. We then flatten the feature maps and pass them on to dense layers specific to Task B.
In this process, we transfer the knowledge gained by the model on Task A to Task B, which saves computational time and gives excellent results. The reason we usually transfer the initial layers is that these layers extract low-level, general features such as edges and textures, while the final layers work towards task-specific discrimination. A sketch of this workflow follows.
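Here is a hedged Keras sketch of this workflow, using the VGG16 convolutional base pre-trained on ImageNet as "Task A" and an illustrative two-class problem standing in for "Task B":

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# Load convolutional layers pre-trained on ImageNet (Task A), dropping the
# dense layers that were specific to the original 1000-class task.
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # keep the transferred feature detectors fixed

# Attach a new dense head for Task B (a hypothetical binary problem).
model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```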
STRATEGIES
- Instance transfer
Usually, there is a direct transfer of knowledge from Task A to Task B. Sometimes, however, the two tasks are not similar enough for direct transfer to be effective. In such cases, only selected data instances from Task A are reused to train the model for Task B.
- Feature-representation transfer
This approach reduces domain divergence and error rates by identifying good feature representations that can be transferred from the Task A domain to the Task B domain. Depending on the availability of labelled data, supervised or unsupervised methods can be applied for feature-representation-based transfer (see the sketch after this list).
- Parameter transfer
This approach works on the assumption that models for related tasks share some parameters. Additional weight can be applied to the loss of the target domain to boost overall performance.
- Relational-knowledge transfer
Relational-knowledge transfer handles data that is not independent and identically distributed, i.e. data in which each point has a relationship with other data points.
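One common supervised instance of feature-representation transfer is to use a pre-trained network purely as a feature extractor and train a simple classifier on the extracted representations. The sketch below assumes TensorFlow/Keras and scikit-learn, with random placeholder arrays standing in for a real target-domain dataset:

```python
import numpy as np
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.applications.resnet50 import preprocess_input
from sklearn.linear_model import LogisticRegression

# Use a pre-trained network as a fixed feature extractor (source domain).
extractor = ResNet50(weights="imagenet", include_top=False, pooling="avg")

images = np.random.rand(16, 224, 224, 3) * 255.0  # placeholder target data
labels = np.random.randint(0, 2, size=16)          # placeholder labels

# Each image becomes a 2048-dimensional feature vector; a lightweight
# classifier is then trained on these features (target domain).
features = extractor.predict(preprocess_input(images))
clf = LogisticRegression(max_iter=1000).fit(features, labels)
```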
Source: towardsdatascience.com/transferlearning
DRAWBACKS
Even though transfer learning is a high-performance method that can give exceptional results, it is not without flaws. Certain pitfalls need to be recognized before adopting transfer learning for a problem, or it will make things worse instead of better. First, transfer learning is difficult to use when there is a large similarity gap between Task A and Task B.
The two tasks need to be similar enough that the model for A can be adapted for B, or performance drops. This is because, when the tasks are dissimilar, the initial layers trained for Task A and transferred to model B were trained on data that may be irrelevant to Task B. This limitation of transfer learning is known as negative transfer.
Another drawback, quite common across machine learning, is overfitting. If the target dataset is too small, the new model can overfit. The transferred layers are difficult to adapt because their weights are frozen from the source task, so we have to rely on the newly added dense layers, which does not always work. A sketch of one possible mitigation follows.
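Continuing the earlier Keras sketch (the `base` and `model` objects from the ImageNet example), one common remedy when the frozen layers fit the target data poorly is to unfreeze only the top of the transferred base and fine-tune it with a small learning rate, watching validation loss closely. The number of layers to unfreeze and the learning rate here are illustrative assumptions:

```python
import tensorflow as tf

# Unfreeze only the last few layers of the transferred base for fine-tuning.
base.trainable = True
for layer in base.layers[:-4]:
    layer.trainable = False  # keep the earlier, more general layers frozen

# Recompile with a small learning rate so the unfrozen weights change gently.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
```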
CONCLUSION
The potential of transfer learning is vast, and its scope wide. In 2016, Andrew Ng stated that, after supervised learning, transfer learning would be the next driver of machine learning's success. With applications in robotics, NLP, and Computer Vision, transfer learning will most definitely take centre stage in the near future.
In conclusion, transfer learning offers novel solutions to complex problems through multiple approaches. Hopefully, this post provides a short introduction to transfer learning. To delve deeper into its foundations, Sebastian Ruder and Dipanjan Sarkar have each written a comprehensive treatment of the subject.
Written By: Vishva Desai
Reviewed By: Vikas Bhardwaj