
Transfer learning (TL) is a technique in machine learning (ML) in which knowledge learned from a task is re-used in order to boost performance on a related task. For example, for image classification, knowledge gained while learning to recognize cars could be applied when trying to recognize trucks. This topic is related to the psychological literature on transfer of learning, although practical ties between the two fields are limited. Reusing/transferring information from previously learned tasks to new tasks has the potential to significantly improve learning efficiency.
Since transfer learning makes use of training with multiple objective functions, it is related to cost-sensitive machine learning and multi-objective optimization.
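The following minimal sketch illustrates this reuse pattern in PyTorch (an assumption; no particular framework is prescribed above): a small network is pre-trained on a synthetic source task, its feature layers are then frozen, and only a new output layer is trained on a related target task. All layer sizes, class counts, and data here are hypothetical.

import torch
import torch.nn as nn

torch.manual_seed(0)

# Source task: 20-dimensional inputs, 5 classes (e.g. car types).
# Target task: same input space, 3 new classes (e.g. truck types).
X_src, y_src = torch.randn(512, 20), torch.randint(0, 5, (512,))
X_tgt, y_tgt = torch.randn(64, 20), torch.randint(0, 3, (64,))

# Shared feature extractor plus a task-specific head.
features = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 32), nn.ReLU())
src_head = nn.Linear(32, 5)
loss_fn = nn.CrossEntropyLoss()

# 1) Pre-train the feature extractor and source head on the source task.
opt = torch.optim.Adam(list(features.parameters()) + list(src_head.parameters()), lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    loss = loss_fn(src_head(features(X_src)), y_src)
    loss.backward()
    opt.step()

# 2) Transfer: freeze the learned features, train only a new target head.
for p in features.parameters():
    p.requires_grad = False
tgt_head = nn.Linear(32, 3)
opt_t = torch.optim.Adam(tgt_head.parameters(), lr=1e-2)
for _ in range(200):
    opt_t.zero_grad()
    loss = loss_fn(tgt_head(features(X_tgt)), y_tgt)
    loss.backward()
    opt_t.step()

print("target-task training loss:", loss.item())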
History
In 1976, Bozinovski and Fulgosi published a paper addressing transfer learning in neural network training. The paper gives a mathematical and geometrical model of the topic. In 1981, a report considered the application of transfer learning to a dataset of images representing letters of computer terminals, experimentally demonstrating positive and negative transfer learning.
In 1992, Lorien Pratt formulated the discriminability-based transfer (DBT) algorithm. By 1998, the field had advanced to include multi-task learning, along with more formal theoretical foundations. Influential publications on transfer learning include the book ''Learning to Learn'' in 1998, a 2009 survey, and a 2019 survey.
Andrew Ng said in his NIPS 2016 tutorial that TL would become the next driver of machine learning commercial success after supervised learning.
In the 2020 paper "Rethinking Pre-training and Self-training", Zoph et al. reported that pre-training can hurt accuracy and advocated self-training instead.
Definition
The definition of transfer learning is given in terms of domains and tasks. A domain \(\mathcal{D}\) consists of a feature space \(\mathcal{X}\) and a marginal probability distribution \(P(X)\), where \(X = \{x_1, \dots, x_n\} \in \mathcal{X}\). Given a specific domain \(\mathcal{D} = \{\mathcal{X}, P(X)\}\), a task consists of two components: a label space \(\mathcal{Y}\) and an objective predictive function \(f : \mathcal{X} \rightarrow \mathcal{Y}\). The function \(f\) is used to predict the corresponding label \(f(x)\) of a new instance \(x\). This task, denoted by \(\mathcal{T} = \{\mathcal{Y}, f(x)\}\), is learned from the training data consisting of pairs \(\{x_i, y_i\}\), where \(x_i \in \mathcal{X}\) and \(y_i \in \mathcal{Y}\).
[Material was copied from this source, which is available under a Creative Commons Attribution 4.0 International License.]
Given a source domain \(\mathcal{D}_S\) and learning task \(\mathcal{T}_S\), a target domain \(\mathcal{D}_T\) and learning task \(\mathcal{T}_T\), where \(\mathcal{D}_S \neq \mathcal{D}_T\) or \(\mathcal{T}_S \neq \mathcal{T}_T\), transfer learning aims to help improve the learning of the target predictive function \(f_T(\cdot)\) in \(\mathcal{D}_T\) using the knowledge in \(\mathcal{D}_S\) and \(\mathcal{T}_S\).
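For instance, the introduction's car-to-truck example can be written in this notation (a hypothetical instantiation for illustration, not taken from the source):
\[
\mathcal{D}_S = \{\mathcal{X},\, P_S(X)\}, \qquad
\mathcal{T}_S = \{\{\text{car},\ \text{not car}\},\, f_S(\cdot)\},
\]
\[
\mathcal{D}_T = \{\mathcal{X},\, P_T(X)\}, \qquad
\mathcal{T}_T = \{\{\text{truck},\ \text{not truck}\},\, f_T(\cdot)\}.
\]
Here the feature space \(\mathcal{X}\) (e.g. raw image pixels) is shared, but the marginal distributions and label spaces differ, so \(\mathcal{D}_S \neq \mathcal{D}_T\) and \(\mathcal{T}_S \neq \mathcal{T}_T\); transfer learning uses \(\mathcal{D}_S\) and \(\mathcal{T}_S\) to improve the estimate of \(f_T(\cdot)\).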
Applications
Algorithms are available for transfer learning in Markov logic networks and Bayesian networks. Transfer learning has been applied to cancer subtype discovery,[Hajiramezanali, E., Dadaneh, S. Z., Karbalayghareh, A., Zhou, Z. & Qian, X. Bayesian multi-domain learning for cancer subtype discovery from next-generation sequencing count data. 32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montréal, Canada.] building utilization, general game playing, text classification, digit recognition, medical imaging and spam filtering.
In 2020, it was discovered that, due to their similar physical natures, transfer learning is possible between electromyographic (EMG) signals from the muscles and electroencephalographic (EEG) brainwaves, from the gesture recognition domain to the mental state recognition domain. It was noted that this relationship worked in both directions, showing that EEG can likewise be used to classify EMG. The experiments noted that the accuracy of neural networks and convolutional neural networks (CNNs) was improved through transfer learning both prior to any learning (compared to standard random weight initialization) and at the end of the learning process (asymptote). That is, results are improved by exposure to another domain. Moreover, the end-user of a pre-trained model can change the structure of its fully-connected layers to improve performance.
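A minimal sketch of that last point, assuming the torchvision library and an ImageNet-pretrained ResNet-18 (neither specified by the studies above): the convolutional backbone is frozen and the single fully-connected output layer is replaced with a differently structured head for a hypothetical four-class target task.

import torch
import torch.nn as nn
from torchvision import models

# Load a CNN pre-trained on ImageNet (assumption: torchvision >= 0.13 weights API).
model = models.resnet18(weights="IMAGENET1K_V1")

# Freeze the convolutional backbone so only the new head is trained.
for p in model.parameters():
    p.requires_grad = False

# Change the structure of the fully-connected part: a deeper head with dropout,
# sized for a hypothetical 4-class target task.
num_target_classes = 4
model.fc = nn.Sequential(
    nn.Linear(model.fc.in_features, 256),
    nn.ReLU(),
    nn.Dropout(0.5),
    nn.Linear(256, num_target_classes),
)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)

# One illustrative training step on dummy data (real use would loop over a target dataset).
x = torch.randn(8, 3, 224, 224)
y = torch.randint(0, num_target_classes, (8,))
loss = nn.CrossEntropyLoss()(model(x), y)
loss.backward()
optimizer.step()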
See also
* Crossover (genetic algorithm)
* Domain adaptation
* General game playing
* Multi-task learning
* Multitask optimization
* Transfer of learning in educational psychology
* Zero-shot learning
* Feature learning
* External validity
References
Sources
* {{cite book |url={{google books|plainurl=y|id=X_jpBwAAQBAJ}} |title=Learning to Learn |last1=Thrun |first1=Sebastian |last2=Pratt |first2=Lorien |date=6 December 2012 |publisher=Springer Science & Business Media |isbn=978-1-4615-5529-2}}