
Transfer learning (TL) is a technique in machine learning (ML) in which knowledge learned from a task is re-used in order to boost performance on a related task. For example, for image classification, knowledge gained while learning to recognize cars could be applied when trying to recognize trucks. This topic is related to the psychological literature on transfer of learning, although practical ties between the two fields are limited. Reusing/transferring information from previously learned tasks to new tasks has the potential to significantly improve learning efficiency.
Since transfer learning makes use of training with multiple objective functions, it is related to cost-sensitive machine learning and multi-objective optimization.
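The following minimal sketch illustrates this reuse pattern in PyTorch (an assumption; no particular framework is prescribed above): a small network is pre-trained on a synthetic source task, its feature layers are then frozen, and only a new output layer is trained on a related target task. All layer sizes, class counts, and data here are hypothetical.

import torch
import torch.nn as nn

torch.manual_seed(0)

# Source task: 20-dimensional inputs, 5 classes (e.g. car types).
# Target task: same input space, 3 new classes (e.g. truck types).
X_src, y_src = torch.randn(512, 20), torch.randint(0, 5, (512,))
X_tgt, y_tgt = torch.randn(64, 20), torch.randint(0, 3, (64,))

# Shared feature extractor plus a task-specific head.
features = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 32), nn.ReLU())
src_head = nn.Linear(32, 5)
loss_fn = nn.CrossEntropyLoss()

# 1) Pre-train the feature extractor and source head on the source task.
opt = torch.optim.Adam(list(features.parameters()) + list(src_head.parameters()), lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    loss = loss_fn(src_head(features(X_src)), y_src)
    loss.backward()
    opt.step()

# 2) Transfer: freeze the learned features, train only a new target head.
for p in features.parameters():
    p.requires_grad = False
tgt_head = nn.Linear(32, 3)
opt_t = torch.optim.Adam(tgt_head.parameters(), lr=1e-2)
for _ in range(200):
    opt_t.zero_grad()
    loss = loss_fn(tgt_head(features(X_tgt)), y_tgt)
    loss.backward()
    opt_t.step()

print("target-task training loss:", loss.item())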
History
In 1976, Bozinovski and Fulgosi published a paper addressing transfer learning in neural network training. The paper gives a mathematical and geometrical model of the topic. In 1981, a report considered the application of transfer learning to a dataset of images representing letters of computer terminals, experimentally demonstrating positive and negative transfer learning.
In 1992, Lorien Pratt formulated the discriminability-based transfer (DBT) algorithm. By 1998, the field had advanced to include multi-task learning, along with more formal theoretical foundations. Influential publications on transfer learning include the book ''Learning to Learn'' in 1998, a 2009 survey, and a 2019 survey.
Andrew Ng said in his NIPS 2016 tutorial that TL would become the next driver of machine learning commercial success after supervised learning.
In the 2020 paper "Rethinking Pre-training and Self-training", Zoph et al. reported that pre-training can hurt accuracy and advocated self-training instead.
Definition
The definition of transfer learning is given in terms of domains and tasks. A domain \(\mathcal{D}\) consists of a feature space \(\mathcal{X}\) and a marginal probability distribution \(P(X)\), where \(X = \{x_1, \dots, x_n\} \in \mathcal{X}\). Given a specific domain \(\mathcal{D} = \{\mathcal{X}, P(X)\}\), a task consists of two components: a label space \(\mathcal{Y}\) and an objective predictive function \(f : \mathcal{X} \rightarrow \mathcal{Y}\). The function \(f\) is used to predict the corresponding label \(f(x)\) of a new instance \(x\). This task, denoted by \(\mathcal{T} = \{\mathcal{Y}, f(x)\}\), is learned from the training data consisting of pairs \(\{x_i, y_i\}\), where \(x_i \in \mathcal{X}\) and \(y_i \in \mathcal{Y}\).
[Material was copied from this source, which is available under a Creative Commons Attribution 4.0 International License.]
Given a source domain \(\mathcal{D}_S\) and learning task \(\mathcal{T}_S\), a target domain \(\mathcal{D}_T\) and learning task \(\mathcal{T}_T\), where \(\mathcal{D}_S \neq \mathcal{D}_T\) or \(\mathcal{T}_S \neq \mathcal{T}_T\), transfer learning aims to help improve the learning of the target predictive function \(f_T(\cdot)\) in \(\mathcal{D}_T\) using the knowledge in \(\mathcal{D}_S\) and \(\mathcal{T}_S\).
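For instance, the introduction's car-to-truck example can be written in this notation (a hypothetical instantiation for illustration, not taken from the source):
\[
\mathcal{D}_S = \{\mathcal{X},\, P_S(X)\}, \qquad
\mathcal{T}_S = \{\{\text{car},\ \text{not car}\},\, f_S(\cdot)\},
\]
\[
\mathcal{D}_T = \{\mathcal{X},\, P_T(X)\}, \qquad
\mathcal{T}_T = \{\{\text{truck},\ \text{not truck}\},\, f_T(\cdot)\}.
\]
Here the feature space \(\mathcal{X}\) (e.g. raw image pixels) is shared, but the marginal distributions and label spaces differ, so \(\mathcal{D}_S \neq \mathcal{D}_T\) and \(\mathcal{T}_S \neq \mathcal{T}_T\); transfer learning uses \(\mathcal{D}_S\) and \(\mathcal{T}_S\) to improve the estimate of \(f_T(\cdot)\).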
Applications
Algorithms are available for transfer learning in Markov logic networks and Bayesian networks. Transfer learning has been applied to cancer subtype discovery,[Hajiramezanali, E., Dadaneh, S. Z., Karbalayghareh, A., Zhou, Z. & Qian, X. Bayesian multi-domain learning for cancer subtype discovery from next-generation sequencing count data. 32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montréal, Canada.] building utilization, general game playing, text classification, digit recognition, medical imaging and spam filtering.
In 2020, it was discovered that, due to their similar physical natures, transfer learning is possible between electromyographic (EMG) signals from the muscles and electroencephalographic (EEG) brainwaves, from the gesture recognition domain to the mental state recognition domain. It was noted that this relationship worked in both directions, showing that EEG can likewise be used to classify EMG. The experiments noted that the accuracy of neural networks and convolutional neural networks (CNNs) was improved through transfer learning both prior to any learning (compared to standard random weight initialization) and at the end of the learning process (asymptote). That is, results are improved by exposure to another domain. Moreover, the end-user of a pre-trained model can change the structure of its fully-connected layers to improve performance.
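A minimal sketch of that last point, assuming the torchvision library and an ImageNet-pretrained ResNet-18 (neither specified by the studies above): the convolutional backbone is frozen and the single fully-connected output layer is replaced with a differently structured head for a hypothetical four-class target task.

import torch
import torch.nn as nn
from torchvision import models

# Load a CNN pre-trained on ImageNet (assumption: torchvision >= 0.13 weights API).
model = models.resnet18(weights="IMAGENET1K_V1")

# Freeze the convolutional backbone so only the new head is trained.
for p in model.parameters():
    p.requires_grad = False

# Change the structure of the fully-connected part: a deeper head with dropout,
# sized for a hypothetical 4-class target task.
num_target_classes = 4
model.fc = nn.Sequential(
    nn.Linear(model.fc.in_features, 256),
    nn.ReLU(),
    nn.Dropout(0.5),
    nn.Linear(256, num_target_classes),
)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)

# One illustrative training step on dummy data (real use would loop over a target dataset).
x = torch.randn(8, 3, 224, 224)
y = torch.randint(0, num_target_classes, (8,))
loss = nn.CrossEntropyLoss()(model(x), y)
loss.backward()
optimizer.step()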
See also
* Crossover (genetic algorithm)
* Domain adaptation
* General game playing
* Multi-task learning
* Multitask optimization
* Transfer of learning in educational psychology
* Zero-shot learning
* Feature learning
* External validity
References
Sources
* {{cite book |url={{google books|plainurl=y|id=X_jpBwAAQBAJ}} |title=Learning to Learn |last1=Thrun |first1=Sebastian |last2=Pratt |first2=Lorien |date=6 December 2012 |publisher=Springer Science & Business Media |isbn=978-1-4615-5529-2}}