Jürgen Schmidhuber (born 17 January 1963) is a German computer scientist most noted for his work in the fields of artificial intelligence, deep learning and artificial neural networks. He is a co-director of the Dalle Molle Institute for Artificial Intelligence Research in Lugano, in the canton of Ticino in southern Switzerland. According to Google Scholar, he received more than 100,000 scientific citations from 2016 to 2021. He has been referred to as "father of modern AI," "father of AI," "dad of mature AI," "Papa" of famous AI products, "Godfather," and "father of deep learning." (Schmidhuber himself, however, has called Alexey Grigorevich Ivakhnenko the "father of deep learning.")

Schmidhuber completed his undergraduate (1987) and PhD (1991) studies at the Technical University of Munich in Munich, Germany. His PhD advisors were Wilfried Brauer and Klaus Schulten. He taught there from 2004 until 2009, when he became a professor of artificial intelligence at the Università della Svizzera Italiana in Lugano, Switzerland.


Work

With his students Sepp Hochreiter, Felix Gers, Fred Cummins, Alex Graves, and others, Schmidhuber published increasingly sophisticated versions of a type of recurrent neural network called long short-term memory (LSTM). First results were reported in Hochreiter's diploma thesis (1991), which analyzed and overcame the famous vanishing gradient problem. The name LSTM was introduced in a tech report (1995) that led to the most cited LSTM publication (1997). The standard LSTM architecture, which is used in almost all current applications, was introduced in 2000. Today's "vanilla LSTM" using backpropagation through time was published in 2005, and its connectionist temporal classification (CTC) training algorithm in 2006. CTC enabled end-to-end speech recognition with LSTM. In 2015, LSTM trained by CTC was used in a new implementation of speech recognition in Google's software for smartphones. Google also used LSTM for the smart assistant Allo and for Google Translate. Apple used LSTM for the "Quicktype" function on the iPhone and for Siri. Amazon used LSTM for Amazon Alexa. In 2017, Facebook performed some 4.5 billion automatic translations every day using LSTM networks. Bloomberg Businessweek wrote: "These powers make LSTM arguably the most commercial AI achievement, used for everything from predicting diseases to composing music."

In 2011, Schmidhuber's team at IDSIA, with his postdoc Dan Ciresan, also achieved dramatic speedups of convolutional neural networks (CNNs) on fast parallel processors called GPUs. An earlier CNN on GPU by Chellapilla et al. (2006) was 4 times faster than an equivalent implementation on CPU. The deep CNN of Dan Ciresan et al. (2011) at IDSIA was already 60 times faster and achieved the first superhuman performance in a computer vision contest in August 2011. Between 15 May 2011 and 10 September 2012, their fast and deep CNNs won no fewer than four image competitions. They also significantly improved on the best performance in the literature for multiple image databases. The approach has become central to the field of computer vision. It is based on CNN designs introduced much earlier by Yann LeCun et al. (1989; Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, L. D. Jackel, "Backpropagation Applied to Handwritten Zip Code Recognition", AT&T Bell Laboratories), who applied the backpropagation algorithm to a variant of Kunihiko Fukushima's original CNN architecture called neocognitron, later modified by J. Weng's method called max-pooling.

In 2014, Schmidhuber formed a company, Nnaisense, to work on commercial applications of artificial intelligence in fields such as finance, heavy industry and self-driving cars. Sepp Hochreiter, Jaan Tallinn, and Marcus Hutter are advisers to the company. Sales were under US$11 million in 2016; however, Schmidhuber states that the current emphasis is on research and not revenue. Nnaisense raised its first round of capital funding in January 2017. Schmidhuber's overall goal is to create an all-purpose AI by training a single AI in sequence on a variety of narrow tasks.


Views

According to The Guardian, Schmidhuber complained in a "scathing 2015 article" that fellow deep learning researchers Geoffrey Hinton, Yann LeCun and Yoshua Bengio "heavily cite each other," but "fail to credit the pioneers of the field", allegedly understating the contributions of Schmidhuber and other early machine learning pioneers, including Alexey Grigorevich Ivakhnenko, who published the first deep learning networks as early as 1965. LeCun denied the charge, stating instead that Schmidhuber "keeps claiming credit he doesn't deserve". Schmidhuber replied that LeCun did not provide a single example to support his statement, and listed several priority disputes.


Recognition

Schmidhuber received the Helmholtz Award of the International Neural Network Society in 2013, and the Neural Networks Pioneer Award of the IEEE Computational Intelligence Society in 2016 for "pioneering contributions to deep learning and neural networks." He is a member of the European Academy of Sciences and Arts.

