What Is Transfer Learning? Definition, Methods, and Applications

Transfer learning is a machine learning approach that reuses a previously trained model to tackle a new, related problem.

December 21, 2022

Transfer learning is defined as a machine learning (ML) method that reuses a trained model designed for a particular task to accomplish a different but related task. This article explains transfer learning, its key methods, and applications.

What Is Transfer Learning?

Transfer learning is a machine learning (ML) method that reuses a model trained for a particular task to accomplish a different yet related task. The knowledge acquired from the first task is thereby transferred to a second model that focuses on the new task.

The term ‘transfer learning’ is borrowed from human psychology. For example, consider an individual who is an expert guitarist. It is far easier for them to learn to play other stringed instruments, such as the sitar or mandolin, than it is for someone with no experience playing any musical instrument.

Similarly, consider a simple classifier trained to predict whether an image contains a hand purse. In this case, you can use the knowledge gained by that model to recognize other objects, such as wallets, sunglasses, etc., in subsequent tasks.

Transfer learning captures the lessons learned in one task and applies them when fine-tuning a model for another. Technically, the weights an ML model acquires as it accomplishes ‘problem X’ are transferred to a new ‘problem Y’.

The idea here is to reuse the information gained from task 1, which has ample labeled training data, to complete task 2, which has less data or fewer labels. With transfer learning, the learning process can begin from patterns captured while addressing similar tasks rather than starting from scratch.
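To make the weight hand-off concrete, here is a minimal PyTorch sketch of the idea, assuming a toy architecture and class counts invented purely for illustration: the feature-extractor weights learned on task 1 initialize the model for task 2, while the task-specific head starts fresh.

```python
import torch.nn as nn

# Toy network: a shared feature extractor plus a task-specific head.
class Net(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(16, num_classes)

    def forward(self, x):
        return self.head(self.features(x))

task1_model = Net(num_classes=10)   # assume this gets trained on task 1

# Task 2 has different labels, so only the shared feature-extractor
# weights are transferred; the new head is initialized from scratch.
task2_model = Net(num_classes=3)
task2_model.features.load_state_dict(task1_model.features.state_dict())
```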

Transfer learning is typically employed in computer vision (CV) and natural language processing (NLP) tasks, both of which require large datasets and high computational power. Consider a CV task where you train a machine learning model to classify MRI images. You can retrain the same model to recognize other conditions in such images, such as traumatic brain injuries or brain tumors. Thus, transfer learning helps achieve faster yet still accurate results.

Similarly, for NLP tasks, if a model has been trained to perform sentiment analysis on Russian texts, you can use it as a starting point for a model that performs the same task on Korean or Japanese texts.

Significance of transfer learning

Transfer learning speeds up the overall process of training a new model and consequently improves its performance. It is primarily used when a model would otherwise require large amounts of resources and time for training. For these reasons, transfer learning is employed in several deep learning projects, such as neural networks that accomplish NLP or CV tasks like sentiment analysis.

Here, it is worth noting that transfer learning is not a specific machine learning algorithm. It is closer to a design methodology, much like active learning, and serves as a technique for tackling problems such as concept drift or multi-task learning. In ML, concept drift refers to changes over time in the statistical properties of the target that the model tries to predict. As a consequence, the model's prediction accuracy takes a hit. At this point, transfer learning can prove handy, as it lets the model draw on the vast amounts of data and information behind the original model to make appropriate predictions.

Moreover, when the available training data is insufficient, transfer learning plays a vital role, as the weights captured from the first model are used to initialize the weights of the second model. Transfer learning relies on features generalizing well as they are transferred from one task to another, which makes the datasets crucial: transfer learning yields the best results when the dataset used in the second training is similar to the one used in the first.

Additionally, transfer learning is widely viewed as a step toward the artificial general intelligence (AGI) paradigm, since reusing knowledge across tasks is a capability AGI would require.

See More: What Is Parallel Processing? Definition, Types, and Examples

Transfer Learning Methods

Transfer learning can be implemented through several different approaches. Let's look at each approach in detail.

1. Train ’similar domain’ models

This transfer learning technique trains models that belong to similar domains. Let's say you need to complete task X but lack sufficient data, while similar task Y has data to spare. In such a scenario, you can train a model on task Y and then use that model as the starting point for a new model that works on task X.

2. Extract features

Feature extraction is another key approach to transfer learning, wherein a deep neural network trained on one task serves as an automatic feature extractor: inputs are run through the pre-trained model, and the resulting internal representations are exported as features for newer models. This method is extensively used by data science professionals.
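As a rough illustration, the sketch below (assuming PyTorch and torchvision 0.13+ are installed) freezes an ImageNet-trained ResNet-18, strips its classification head, and exports the 512-dimensional penultimate-layer features that a new, lightweight model could train on; the random tensors stand in for real images.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a ResNet-18 pre-trained on ImageNet and freeze all of its weights.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = nn.Identity()            # drop the head, keep 512-d features
backbone.eval()
for p in backbone.parameters():
    p.requires_grad = False

# Run new-task inputs (random stand-ins here) through the frozen network
# and export the representations for any downstream classifier.
images = torch.randn(8, 3, 224, 224)
with torch.no_grad():
    features = backbone(images)        # shape: (8, 512)
print(features.shape)
```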

3. Use pre-trained models

This approach relies on models that have already been trained, often on large datasets. Enterprises experienced in model development tend to have access to a library of such models that can be used to design future ones. This implies that while addressing a newer problem, a pre-trained model can be chosen, fine-tuned on the problem at hand, and then reused to train another model.

Some examples of pre-trained machine learning models include Google's Inception model and those shipped with Microsoft's MicrosoftML R package and microsoftml Python package. The Inception model is trained on the ImageNet dataset and can classify images into 1,000 classes. Other popular models include AlexNet, Oxford's VGG model, and Microsoft's ResNet.

Moreover, some of the known pre-trained models that address NLP-related data problems include Google’s word2vec Model and Stanford’s GloVe Model.
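As a brief illustration of reusing such a model, the sketch below loads pre-trained GloVe vectors through the gensim library (an assumed dependency; the first call downloads the vectors) and queries the word similarities those vectors already encode.

```python
import gensim.downloader as api

# Load 100-dimensional GloVe vectors pre-trained on Wikipedia + Gigaword.
glove = api.load("glove-wiki-gigaword-100")

# The transferred embeddings already capture word similarity learned
# elsewhere, so a downstream model need not learn it from scratch.
print(glove.most_similar("guitar", topn=3))
```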

Transfer learning process

Irrespective of the method chosen, the transfer learning process follows certain key steps. Let's go through them in detail.

  • Access pre-trained models: Organizations can obtain pre-trained models from their own collection of model libraries or other open-source repositories. For example, PyTorch Hub is an open-source pre-trained model repository designed to speed up the research path, from prototyping to product deployment. Similarly, TensorFlow Hub is an open repository and reusable ML library with several pre-trained models that can be used for tasks such as text embeddings, image classification, and much more.
  • Freeze layers: A typical neural network comprises three groups of layers: early (inner), middle, and latter layers. In transfer learning, the early and middle layers are retained as they are, since they hold the general knowledge from the task the model was previously trained on, and only the latter layers are retrained on the labeled data of the new task. As discussed in one of the previous examples, the classifier trained to recognize a hand purse can be reused to detect wallets or sunglasses. Since the earlier layers have already learned to recognize generic objects, we only need to retrain the latter layers to identify wallets or other objects.

Freezing layers is essential because it avoids re-initializing the model's weights; re-initialization would cause the model to lose everything it has previously learned.

  • Train new layers: Upon freezing the requisite layers, new layers must be added to the model so it can make predictions on the new dataset.
  • Fine-tune the model: Fine-tuning the base model isn't strictly necessary, but it can improve overall performance. The process involves unfreezing some layers of the model and retraining it at a low learning rate on the new dataset, as shown in the sketch below.
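Putting the four steps together, here is a minimal PyTorch sketch, assuming torchvision 0.13+ and an invented five-class target task: it loads a pre-trained ResNet-18, freezes the existing layers, attaches a new head, and then optionally unfreezes the last block for low-learning-rate fine-tuning.

```python
import torch
import torch.nn as nn
from torchvision import models

# Step 1: access a pre-trained model (ResNet-18 trained on ImageNet).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Step 2: freeze the existing layers so their weights are not updated.
for param in model.parameters():
    param.requires_grad = False

# Step 3: replace the final layer with a new head for the new task
# (five classes here, purely for illustration); only it trains at first.
model.fc = nn.Linear(model.fc.in_features, 5)
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
# ... train the new head on the new dataset ...

# Step 4 (optional fine-tuning): unfreeze the last residual block and
# retrain at a low learning rate so earlier knowledge is not destroyed.
for param in model.layer4.parameters():
    param.requires_grad = True
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-5)
```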

See More: What Is Spatial Analysis? Definition, Working, and Examples

Applications of Transfer Learning

Transfer learning is an emerging technique that finds applications in varied fields of machine learning and is already seeing real-world use.

Let’s dive into the application areas of transfer learning:

1. Natural language processing (NLP)

Natural language processing refers to a system capable of comprehending and analyzing human language in audio or text files. The primary objective of NLP is to improve the quality of interaction between humans and machines. Day-to-day services such as voice assistants, speech recognition software, translations, and so on rely on NLP.

Transfer learning strengthens ML models that handle NLP tasks. For example, it can be employed to train models that simultaneously detect various language elements, such as specific dialects, phrases, or vocabularies.

Moreover, transfer learning enables models to adapt across languages. This implies that models trained for the English language can be retrained and adapted to similar languages or tasks. The knowledge of pre-trained models that can recognize linguistic syntax can be transferred to models that predict the next word or phrase based on the structure of previous sentences.

For example, Google's Neural Machine Translation (GNMT) system is capable of cross-lingual translation. The model can use a pivot, or common, language between two discrete languages to accomplish the translation task. Say you intend to translate Russian to Korean: the text is first translated from Russian to English and then from English to Korean. At its core, the technique transfers what it has learned about well-resourced language pairs to better translate between pairs with little direct training data.
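The pivot routing itself is easy to express in code. In the sketch below, translate() is a hypothetical placeholder for any machine translation model or API, not a real GNMT interface; only the two-hop routing logic is illustrated.

```python
def translate(text: str, source: str, target: str) -> str:
    # Hypothetical wrapper around any machine translation model or API;
    # the name and signature are illustrative, not a real library call.
    raise NotImplementedError

def pivot_translate(text: str, source: str, target: str,
                    pivot: str = "en") -> str:
    # No direct source-to-target model is assumed, so the text is routed
    # through a pivot language (English here).
    intermediate = translate(text, source=source, target=pivot)
    return translate(intermediate, source=pivot, target=target)

# Russian -> English -> Korean, as in the example above:
# pivot_translate("Привет, мир", source="ru", target="ko")
```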

2. Computer vision (CV)

Computer vision enables systems to derive meaning from visual data such as images or videos. ML algorithms train on large image datasets and refine themselves to recognize images or classify objects within them. Here, transfer learning comes to the fore, as the reusable parts of an existing CV model can be carried over to a newer model.

Transfer learning can take models produced from large training datasets and apply them to smaller image sets. One example is detecting the sharp edges of objects in a given collection of images: the layers that specifically identify edges can be located and kept, while later layers are retrained as needed.

3. Neural networks

Neural networks are key to deep learning, as they are designed to simulate aspects of human brain function. Training neural networks requires a heavy load of resources due to the complexity of these models. Transfer learning can be used here to reduce the resource demand and, at the same time, make the entire process more efficient.

Several transferable features are moved from one network to another to streamline the model development process. The ability to apply knowledge across tasks is of paramount importance in building neural nets.

Let’s look at the different industries where these ML-based transfer learning techniques are employed.

1. Autonomous driving industry

With transfer learning, a model trained for autonomous bus driving can be retrained and reused for autonomous car driving. Additionally, transfer learning can be implemented in autonomous vehicles to detect road signs, objects, cars, trucks, or buses on roads.

2. Gaming industry

In the gaming arena, a model developed for one game, such as DeepMind's AlphaGo, can be applied to other games, such as chess. Knowledge transfer between gaming models is valuable because it cuts down on the time needed to build new models for other games.

Another example is MadRTS, a commercial real-time strategy game used for military simulations. MadRTS relies on CARL (case-based reinforcement learning), which bridges the gap between case-based reasoning and reinforcement learning (RL). Case-based reasoning handles unseen but related problems using past experiences, while RL algorithms let the model build approximations of different situations based on the agent's experience.

The two modules, in this case, use transfer learning to enhance the overall gaming experience for MadRTS players.

3. Healthcare sector

Electromyographic (EMG) signals, which measure muscle response, bear some similarity to electroencephalographic (EEG) brainwaves. As a result, transfer learning can be applied between EMG and EEG models to accomplish tasks such as gesture recognition.

Moreover, medical imaging is another area where transfer learning can be effectively exploited. For example, models trained on general MRI scans can be adapted to specifically detect brain tumors in brain-scan images.

4. Emails

An AI model trained to categorize emails can be adapted via transfer learning to filter out spam (spam filtering).

5. Ecommerce industry

In the ecommerce space, monitoring and tracking customer behavior is crucial for generating more sales. With transfer learning, organizations now have the capability to zero in on subjective customer experiences based on sentiment analysis. This covers the likes, dislikes, interests, views, and preferences of customers for particular products or services. It allows businesses to dig deeper into feedback and reviews to better understand users’ emotions toward the offered services.

Today, ecommerce companies rely on automated tools that help with sentiment classification. The process captures user opinions as text, which is then classified into the broad categories of positive, negative, or neutral. Upon analysis, enterprises can craft customized plans to enhance customers' overall shopping experience. Sentiment classification is also used extensively on social media, where businesses mine, process, and extract conversations and opinions to better comprehend user sentiment.
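As a small illustration of how little code such sentiment classification takes when it piggybacks on a pre-trained language model, the sketch below uses the Hugging Face transformers pipeline (an assumed dependency; the first call downloads a default English checkpoint).

```python
from transformers import pipeline

# Build a sentiment classifier on top of a pre-trained language model;
# with no model specified, the library selects a default checkpoint.
classifier = pipeline("sentiment-analysis")

reviews = [
    "Fast delivery and the product works perfectly.",
    "The item arrived broken and support never replied.",
]
for review, result in zip(reviews, classifier(reviews)):
    # Each result carries a label (POSITIVE/NEGATIVE) and a confidence score.
    print(result["label"], round(result["score"], 3), "-", review)
```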

See More: What Is Data Analytics? Definition, Types, and Applications

Takeaway

Transfer learning can effectively reduce data processing requirements and model development time. However, for it to succeed, the algorithms, pre-trained models, and datasets should be easy to access and well-documented so that model-driven organizations can benefit from them.

In summary, transfer learning has streamlined and enhanced ML capabilities. It has made model development faster, reducing both the capital investment required to start processes from scratch and the time spent on model building. Due to these advantages, businesses across industry verticals are keen on applying transfer learning, which boosts their performance while saving them time and money. It will be exciting to watch how businesses embrace transfer learning as an ML capability to stay relevant in the market.

Did this article help you understand how transfer learning works? Comment below or let us know on Facebook, Twitter, or LinkedIn. We'd love to hear from you!


Vijay Kanade
Vijay A. Kanade is a computer science graduate with 7+ years of corporate experience in Intellectual Property Research. He is an academician with research interest in multiple research domains. His research work spans from Computer Science, AI, Bio-inspired Algorithms to Neuroscience, Biophysics, Biology, Biochemistry, Theoretical Physics, Electronics, Telecommunication, Bioacoustics, Wireless Technology, Biomedicine, etc. He has published about 30+ research papers in Springer, ACM, IEEE & many other Scopus indexed International Journals & Conferences. Through his research work, he has represented India at top Universities like Massachusetts Institute of Technology (Cambridge, USA), University of California (Santa Barbara, California), National University of Singapore (Singapore), Cambridge University (Cambridge, UK). In addition to this, he is currently serving as an 'IEEE Reviewer' for the IEEE Internet of Things (IoT) Journal.