A discovery system is an artificial intelligence system that attempts to discover new scientific concepts or laws. The aim of discovery systems is to automate scientific data analysis and the scientific discovery process. Ideally, an artificial intelligence system should be able to search systematically through the space of all possible hypotheses and yield the hypothesis, or set of equally likely hypotheses, that best describes the complex patterns in data.
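The idea of searching a hypothesis space and returning the best hypothesis, or a set of equally good ones, can be illustrated with a minimal sketch. The candidate laws, the squared-error score, and the function names below are assumptions chosen for illustration; they are not taken from any of the systems described in this article, and a real hypothesis space would be far larger than five fixed formulas.

```python
import math

# Hypothetical, hand-picked hypothesis space: each candidate law maps x to y.
CANDIDATES = {
    "y = x": lambda x: x,
    "y = 2*x": lambda x: 2 * x,
    "y = x**2": lambda x: x ** 2,
    "y = x**3": lambda x: x ** 3,
    "y = sqrt(x)": lambda x: math.sqrt(x),
}


def squared_error(law, data):
    """Sum of squared differences between a law's predictions and the data."""
    return sum((law(x) - y) ** 2 for x, y in data)


def best_hypotheses(data, candidates=CANDIDATES, tol=1e-9):
    """Score every candidate and return all that tie for the lowest error."""
    errors = {name: squared_error(law, data) for name, law in candidates.items()}
    best = min(errors.values())
    return sorted(name for name, err in errors.items() if err - best <= tol)


# Three observations single out one law ...
print(best_hypotheses([(1.0, 1.0), (2.0, 4.0), (3.0, 9.0)]))
# ... while a single observation leaves several laws equally well supported.
print(best_hypotheses([(1.0, 1.0)]))
```

With three observations only one candidate fits exactly, whereas a single observation at x = 1 cannot distinguish between four of the five candidates, which is why the function returns a set rather than a single answer.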
During the era known as the second AI summer (approximately 1978–1987), various systems akin to the era's dominant expert systems were developed to tackle the problem of extracting scientific hypotheses from data, with or without interacting with a human scientist. These systems included Autoclass, Automated Mathematician, and Eurisko, which aimed at general-purpose hypothesis discovery, as well as more specific systems such as Dalton, which uncovers molecular properties from data.
The dream of building systems that discover scientific hypotheses was pushed to the background with the second AI winter and the subsequent resurgence of subsymbolic methods such as neural networks. Subsymbolic methods emphasize prediction over explanation, and yield models that work well but are difficult or impossible to explain, which has earned them the name black box AI. A black-box model cannot be considered a scientific hypothesis, and this development has even led some researchers to suggest that the traditional aim of science, to uncover hypotheses and theories about the structure of reality, is obsolete. Other researchers disagree and argue that subsymbolic methods are useful in many cases, just not for generating scientific theories.
Discovery systems from the 1970s and 1980s
* Autoclass was a Bayesian classification system written in 1986
* Automated Mathematician was one of the earliest successful discovery systems. It was written in 1977 and worked by generating and modifying small Lisp programs
* Eurisko was a sequel to Automated Mathematician written in 1984
* Dalton is a still-maintained program capable of calculating various molecular properties. It was initially launched in 1983 and has been available as open source since 2017
* Glauber is a scientific discovery method, written in the context of computational philosophy of science and launched in 1983
Modern discovery systems (2009–present)
After a couple of decades with little interest in discovery systems, interest in using AI to uncover natural laws and scientific explanations was renewed by the work of Michael Schmidt, then a PhD student in Computational Biology at Cornell University. Schmidt and his advisor, Hod Lipson, invented Eureqa, which they described as a symbolic regression approach to "distilling free-form natural laws from experimental data". This work effectively demonstrated that symbolic regression was a promising way forward for AI-driven scientific discovery.
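Symbolic regression searches over mathematical expressions themselves rather than over the parameters of a fixed model. The toy sketch below illustrates the idea with brute-force enumeration of small expressions; this is a simplification swapped in for illustration and is not the search procedure of Eureqa or any other system named here. The terminal set, operator set, depth limit, and function names are all assumptions.

```python
import itertools

# Assumed building blocks for candidate formulas (illustrative only).
TERMINALS = ["x", "1", "2"]
OPERATORS = ["+", "*", "-"]


def expressions(depth):
    """Yield expression strings with at most `depth` levels of nesting."""
    if depth == 0:
        yield from TERMINALS
        return
    smaller = list(expressions(depth - 1))
    yield from smaller
    for op in OPERATORS:
        for left, right in itertools.product(smaller, repeat=2):
            yield f"({left} {op} {right})"


def error(expr, data):
    """Sum of squared residuals of an expression on (x, y) pairs."""
    # eval is acceptable here only because every string is generated above.
    f = eval(f"lambda x: {expr}")
    return sum((f(x) - y) ** 2 for x, y in data)


def symbolic_regression(data, max_depth=2):
    """Return the expression with the lowest error, preferring shorter ones."""
    return min(expressions(max_depth), key=lambda e: (error(e, data), len(e)))


# Observations generated from y = x*x + 1, which the search should recover.
data = [(0, 1), (1, 2), (2, 5), (3, 10)]
best = symbolic_regression(data)
print(best, error(best, data))
```

Ranking candidates by error first and length second is one simple way to express a preference for compact formulas. Exhaustive enumeration grows very quickly with depth, which is one reason practical systems rely on heuristic search instead.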
Since 2009, symbolic regression has matured further, and today various commercial and open source systems are actively used in scientific research. Notable examples include Eureqa (now part of the DataRobot AI Cloud Platform), AI Feynman, and QLattice.
External links
The AI revolution in scientific research