HOME

TheInfoList



OR:

The following outline is provided as an overview of and topical guide to machine learning.
Machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...
is a subfield of
soft computing Soft computing is a set of algorithms, including neural networks, fuzzy logic, and evolutionary algorithms. These algorithms are tolerant of imprecision, uncertainty, partial truth and approximation. It is contrasted with hard computing: al ...
within
computer science Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to Applied science, practical discipli ...
that evolved from the study of
pattern recognition Pattern recognition is the automated recognition of patterns and regularities in data. It has applications in statistical data analysis, signal processing, image analysis, information retrieval, bioinformatics, data compression, computer graphics ...
and computational learning theory in
artificial intelligence Artificial intelligence (AI) is intelligence—perceiving, synthesizing, and inferring information—demonstrated by machines, as opposed to intelligence displayed by animals and humans. Example tasks in which this is done include speech ...
.http://www.britannica.com/EBchecked/topic/1116194/machine-learning In 1959, Arthur Samuel defined machine learning as a "field of study that gives computers the ability to learn without being explicitly programmed". Machine learning explores the study and construction of
algorithm In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for performing ...
s that can
learn Learning is the process of acquiring new understanding, knowledge, behaviors, skills, values, attitudes, and preferences. The ability to learn is possessed by humans, animals, and some machines; there is also evidence for some kind of l ...
from and make predictions on
data In the pursuit of knowledge, data (; ) is a collection of discrete values that convey information, describing quantity, quality, fact, statistics, other basic units of meaning, or simply sequences of symbols that may be further interpret ...
. Such algorithms operate by building a model from an example
training set In machine learning, a common task is the study and construction of algorithms that can learn from and make predictions on data. Such algorithms function by making data-driven predictions or decisions, through building a mathematical model from ...
of input observations in order to make data-driven predictions or decisions expressed as outputs, rather than following strictly static program instructions.


What ''type'' of thing is machine learning?

* An
academic discipline An academy (Attic Greek: Ἀκαδήμεια; Koine Greek Ἀκαδημία) is an institution of secondary education, secondary or tertiary education, tertiary higher education, higher learning (and generally also research or honorary membershi ...
* A branch of
science Science is a systematic endeavor that builds and organizes knowledge in the form of testable explanations and predictions about the universe. Science may be as old as the human species, and some of the earliest archeological evidence ...
** An
applied science Applied science is the use of the scientific method and knowledge obtained via conclusions from the method to attain practical goals. It includes a broad range of disciplines such as engineering and medicine. Applied science is often contrasted ...
*** A subfield of
computer science Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to Applied science, practical discipli ...
**** A branch of
artificial intelligence Artificial intelligence (AI) is intelligence—perceiving, synthesizing, and inferring information—demonstrated by machines, as opposed to intelligence displayed by animals and humans. Example tasks in which this is done include speech ...
**** A subfield of soft computing *** Application of
statistics Statistics (from German: '' Statistik'', "description of a state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a scientific, indust ...


Branches of machine learning


Subfields of machine learning

* Computational learning theory – studying the design and analysis of
machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...
algorithms. * Grammar induction * Meta-learning


Cross-disciplinary fields involving machine learning

* Adversarial machine learning *
Predictive analytics Predictive analytics encompasses a variety of statistical techniques from data mining, predictive modeling, and machine learning that analyze current and historical facts to make predictions about future or otherwise unknown events. In busine ...
*
Quantum machine learning Quantum machine learning is the integration of quantum algorithms within machine learning programs. The most common use of the term refers to machine learning algorithms for the analysis of classical data executed on a quantum computer, i.e. quan ...
* Robot learning ** Developmental robotics


Applications of machine learning

* Applications of machine learning *
Bioinformatics Bioinformatics () is an interdisciplinary field that develops methods and software tools for understanding biological data, in particular when the data sets are large and complex. As an interdisciplinary field of science, bioinformatics combi ...
*
Biomedical informatics Health informatics is the field of science and engineering that aims at developing methods and technologies for the acquisition, processing, and study of patient data, which can come from different sources and modalities, such as electronic hea ...
*
Computer vision Computer vision is an interdisciplinary scientific field that deals with how computers can gain high-level understanding from digital images or videos. From the perspective of engineering, it seeks to understand and automate tasks that the human ...
*
Customer relationship management Customer relationship management (CRM) is a process in which a business or other organization administers its interactions with customers, typically using data analysis to study large amounts of information. CRM systems compile data from a r ...
– * Data mining *
Earth sciences Earth science or geoscience includes all fields of natural science related to the planet Earth. This is a branch of science dealing with the physical, chemical, and biological complex constitutions and synergistic linkages of Earth's four spheres ...
* Email filtering * Inverted pendulum – balance and equilibrium system. *
Natural language processing Natural language processing (NLP) is an interdisciplinary subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to proc ...
(NLP) **
Named Entity Recognition Named-entity recognition (NER) (also known as (named) entity identification, entity chunking, and entity extraction) is a subtask of information extraction that seeks to locate and classify named entities mentioned in unstructured text into pre ...
**
Automatic summarization Automatic summarization is the process of shortening a set of data computationally, to create a subset (a summary) that represents the most important or relevant information within the original content. Artificial intelligence algorithms are comm ...
** Automatic taxonomy construction ** Dialog system ** Grammar checker ** Language recognition ***
Handwriting recognition Handwriting recognition (HWR), also known as handwritten text recognition (HTR), is the ability of a computer to receive and interpret intelligible handwritten input from sources such as paper documents, photographs, touch-screens and other de ...
***
Optical character recognition Optical character recognition or optical character reader (OCR) is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a sc ...
***
Speech recognition Speech recognition is an interdisciplinary subfield of computer science and computational linguistics that develops methodologies and technologies that enable the recognition and translation of spoken language into text by computers with the ...
**** Text to Speech Synthesis (TTS) **** Speech Emotion Recognition (SER) **
Machine translation Machine translation, sometimes referred to by the abbreviation MT (not to be confused with computer-aided translation, machine-aided human translation or interactive translation), is a sub-field of computational linguistics that investigates ...
**
Question answering Question answering (QA) is a computer science discipline within the fields of information retrieval and natural language processing (NLP), which is concerned with building systems that automatically answer questions posed by humans in a natural ...
**
Speech synthesis Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware products. A text-to-speech (TTS) system converts normal langua ...
**
Text mining Text mining, also referred to as ''text data mining'', similar to text analytics, is the process of deriving high-quality information from text. It involves "the discovery by computer of new, previously unknown information, by automatically extract ...
***
Term frequency–inverse document frequency Term may refer to: *Terminology, or term, a noun or compound word used in a specific context, in particular: **Technical term, part of the specialized vocabulary of a particular field, specifically: ***Scientific terminology, terms used by scienti ...
(tf–idf) **
Text simplification Text simplification is an operation used in natural language processing to change, enhance, classify, or otherwise process an existing body of human-readable text so its grammar and structure is greatly simplified while the underlying meaning and ...
*
Pattern recognition Pattern recognition is the automated recognition of patterns and regularities in data. It has applications in statistical data analysis, signal processing, image analysis, information retrieval, bioinformatics, data compression, computer graphics ...
**
Facial recognition system A facial recognition system is a technology capable of matching a human face from a digital image or a video frame against a database of faces. Such a system is typically employed to authenticate users through ID verification services, and ...
**
Handwriting recognition Handwriting recognition (HWR), also known as handwritten text recognition (HTR), is the ability of a computer to receive and interpret intelligible handwritten input from sources such as paper documents, photographs, touch-screens and other de ...
** Image recognition **
Optical character recognition Optical character recognition or optical character reader (OCR) is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a sc ...
**
Speech recognition Speech recognition is an interdisciplinary subfield of computer science and computational linguistics that develops methodologies and technologies that enable the recognition and translation of spoken language into text by computers with the ...
* Recommendation system **
Collaborative filtering Collaborative filtering (CF) is a technique used by recommender systems.Francesco Ricci and Lior Rokach and Bracha ShapiraIntroduction to Recommender Systems Handbook Recommender Systems Handbook, Springer, 2011, pp. 1-35 Collaborative filtering ...
** Content-based filtering ** Hybrid recommender systems (Collaborative and content-based filtering) *
Search engine A search engine is a software system designed to carry out web searches. They search the World Wide Web in a systematic way for particular information specified in a textual web search query. The search results are generally presented in a ...
**
Search engine optimization Search engine optimization (SEO) is the process of improving the quality and quantity of website traffic to a website or a web page from search engines. SEO targets unpaid traffic (known as "natural" or "organic" results) rather than dire ...
* Social Engineering


Machine learning hardware

*
Graphics processing unit A graphics processing unit (GPU) is a specialized electronic circuit designed to manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display device. GPUs are used in embedded systems, m ...
*
Tensor processing unit Tensor Processing Unit (TPU) is an AI accelerator application-specific integrated circuit (ASIC) developed by Google for neural network machine learning, using Google's own TensorFlow software. Google began using TPUs internally in 2015, and ...
*
Vision processing unit A vision processing unit (VPU) is (as of 2018) an emerging class of microprocessor; it is a specific type of AI accelerator, designed to accelerate machine vision tasks. Overview Vision processing units are distinct from video processing uni ...


Machine learning tools

*
Comparison of deep learning software The following table compares notable software frameworks, libraries and computer programs for deep learning. Deep-learning software by name Comparison of compatibility of machine learning models See also *Comparison of numerical-analy ...


Machine learning frameworks


Proprietary machine learning frameworks

* Amazon Machine Learning * Microsoft Azure Machine Learning Studio *
DistBelief TensorFlow is a Free and open-source software, free and open-source Library (computing), software library for machine learning and artificial intelligence. It can be used across a range of tasks but has a particular focus on Types of artificial ...
– replaced by TensorFlow


Open source machine learning frameworks

* Apache Singa * Apache MXNet * Caffe * PyTorch * mlpack *
TensorFlow TensorFlow is a free and open-source software library for machine learning and artificial intelligence. It can be used across a range of tasks but has a particular focus on training and inference of deep neural networks. "It is machine learnin ...
*
Torch A torch is a stick with combustible material at one end, which is ignited and used as a light source. Torches have been used throughout history, and are still used in processions, symbolic and religious events, and in juggling entertainment. I ...
* CNTK * Accord.Net * Jax


Machine learning libraries

*
Deeplearning4j Eclipse Deeplearning4j is a programming library written in Java for the Java virtual machine (JVM). It is a framework with wide support for deep learning algorithms. Deeplearning4j includes implementations of the restricted Boltzmann machine, ...
*
Theano In Greek mythology, Theano (; Ancient Greek: Θεανώ) may refer to the following personages: *Theano, wife of Metapontus, king of Icaria. Metapontus demanded that she bear him children, or leave the kingdom. She presented the children of Mel ...
* scikit-learn * Keras


Machine learning algorithms

* Almeida–Pineda recurrent backpropagation * ALOPEX * Backpropagation * Bootstrap aggregating *
CN2 algorithm The CN2 induction algorithm is a learning algorithm for rule induction.Clark, P. and Niblett, T (1989) The CN2 induction algorithm. Machine Learning 3(4):261-283. It is designed to work even when the training data is imperfect. It is based on ide ...
* Constructing skill trees * Dehaene–Changeux model *
Diffusion map Diffusion maps is a dimensionality reduction or feature extraction algorithm introduced by Ronald Coifman, Coifman and Lafon which computes a family of embeddings of a data set into Euclidean space (often low-dimensional) whose coordinates can be ...
*
Dominance-based rough set approach The dominance-based rough set approach (DRSA) is an extension of rough set theory for multi-criteria decision analysis (MCDA), introduced by Greco, Matarazzo and Słowiński. Greco, S., Matarazzo, B., Słowiński, R.: Rough sets theory for multi- ...
*
Dynamic time warping In time series analysis, dynamic time warping (DTW) is an algorithm for measuring similarity between two temporal sequences, which may vary in speed. For instance, similarities in walking could be detected using DTW, even if one person was walk ...
* Error-driven learning * Evolutionary multimodal optimization * Expectation–maximization algorithm *
FastICA FastICA is an efficient and popular algorithm for independent component analysis invented by Aapo Hyvärinen at Helsinki University of Technology. Like most ICA algorithms, FastICA seeks an orthogonal rotation of prewhitened data, through a fixed- ...
*
Forward–backward algorithm The forward–backward algorithm is an inference algorithm for hidden Markov models which computes the posterior marginals of all hidden state variables given a sequence of observations/emissions o_:= o_1,\dots,o_T, i.e. it computes, for all hi ...
* GeneRec * Genetic Algorithm for Rule Set Production * Growing self-organizing map * Hyper basis function network * IDistance * K-nearest neighbors algorithm * Kernel methods for vector output * Kernel principal component analysis * Leabra *
Linde–Buzo–Gray algorithm The Linde–Buzo–Gray algorithm (introduced by Yoseph Linde, Andrés Buzo and Robert M. Gray in 1980) is a vector quantization algorithm to derive a good codebook A codebook is a type of document used for gathering and storing cryptography ...
*
Local outlier factor In anomaly detection, the local outlier factor (LOF) is an algorithm proposed by Markus M. Breunig, Hans-Peter Kriegel, Raymond T. Ng and Jörg Sander in 2000 for finding anomalous data points by measuring the local deviation of a given data point ...
* Logic learning machine * LogitBoost *
Manifold alignment Manifold alignment is a class of machine learning algorithms that produce projections between sets of data, given that the original data sets lie on a common manifold. The concept was first introduced as such by Ham, Lee, and Saul in 2003, adding ...
* Markov chain Monte Carlo (MCMC) * Minimum redundancy feature selection *
Mixture of experts Mixture of experts (MoE) refers to a machine learning technique where multiple expert networks (learners) are used to divide a problem space into homogeneous regions. It differs from ensemble techniques in that typically only a few, or 1, expert mo ...
*
Multiple kernel learning Multiple kernel learning refers to a set of machine learning methods that use a predefined set of kernels and learn an optimal linear or non-linear combination of kernels as part of the algorithm. Reasons to use multiple kernel learning include ...
* Non-negative matrix factorization *
Online machine learning In computer science, online machine learning is a method of machine learning in which data becomes available in a sequential order and is used to update the best predictor for future data at each step, as opposed to batch learning techniques whi ...
* Out-of-bag error * Prefrontal cortex basal ganglia working memory * PVLV * Q-learning * Quadratic unconstrained binary optimization * Query-level feature *
Quickprop Quickprop is an iterative method for determining the minimum of the loss function of an artificial neural network, following an algorithm inspired by the Newton's method. Sometimes, the algorithm is classified to the group of the second order lear ...
*
Radial basis function network In the field of mathematical modeling, a radial basis function network is an artificial neural network that uses radial basis functions as activation functions. The output of the network is a linear combination of radial basis functions of the inp ...
* Randomized weighted majority algorithm * Reinforcement learning * Repeated incremental pruning to produce error reduction (RIPPER) * Rprop *
Rule-based machine learning Rule-based machine learning (RBML) is a term in computer science intended to encompass any machine learning method that identifies, learns, or evolves 'rules' to store, manipulate or apply. The defining characteristic of a rule-based machine lear ...
* Skill chaining * Sparse PCA * State–action–reward–state–action *
Stochastic gradient descent Stochastic gradient descent (often abbreviated SGD) is an iterative method for optimizing an objective function with suitable smoothness properties (e.g. differentiable or subdifferentiable). It can be regarded as a stochastic approximation of ...
* Structured kNN *
T-distributed stochastic neighbor embedding t-distributed stochastic neighbor embedding (t-SNE) is a statistical method for visualizing high-dimensional data by giving each datapoint a location in a two or three-dimensional map. It is based on Stochastic Neighbor Embedding originally de ...
* Temporal difference learning * Wake-sleep algorithm *
Weighted majority algorithm (machine learning) In machine learning, weighted majority algorithm (WMA) is a meta learning algorithm used to construct a compound algorithm from a pool of prediction algorithms, which could be any type of learning algorithms, classifiers, or even real human exper ...


Machine learning methods


Instance-based algorithm

* K-nearest neighbors algorithm (KNN) * Learning vector quantization (LVQ) * Self-organizing map (SOM)


Regression analysis In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable (often called the 'outcome' or 'response' variable, or a 'label' in machine learning parlance) and one ...

*
Logistic regression In statistics, the logistic model (or logit model) is a statistical model that models the probability of an event taking place by having the log-odds for the event be a linear combination of one or more independent variables. In regression an ...
* Ordinary least squares regression (OLSR) *
Linear regression In statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as dependent and independent variables). The case of one explanatory variable is cal ...
*
Stepwise regression In statistics, stepwise regression is a method of fitting regression models in which the choice of predictive variables is carried out by an automatic procedure. In each step, a variable is considered for addition to or subtraction from the set of ...
* Multivariate adaptive regression splines (MARS) * Regularization algorithm **
Ridge regression Ridge regression is a method of estimating the coefficients of multiple-regression models in scenarios where the independent variables are highly correlated. It has been used in many fields including econometrics, chemistry, and engineering. Also ...
** Least Absolute Shrinkage and Selection Operator (LASSO) ** Elastic net **
Least-angle regression In statistics, least-angle regression (LARS) is an algorithm for fitting linear regression models to high-dimensional data, developed by Bradley Efron, Trevor Hastie, Iain Johnstone and Robert Tibshirani. Suppose we expect a response variab ...
(LARS) * Classifiers ** Probabilistic classifier ***
Naive Bayes classifier In statistics, naive Bayes classifiers are a family of simple " probabilistic classifiers" based on applying Bayes' theorem with strong (naive) independence assumptions between the features (see Bayes classifier). They are among the simplest Bay ...
**
Binary classifier Binary classification is the task of classifying the elements of a set into two groups (each called ''class'') on the basis of a classification rule. Typical binary classification problems include: * Medical testing to determine if a patient has ...
**
Linear classifier In the field of machine learning, the goal of statistical classification is to use an object's characteristics to identify which class (or group) it belongs to. A linear classifier achieves this by making a classification decision based on the v ...
** Hierarchical classifier


Dimensionality reduction

Dimensionality reduction Dimensionality reduction, or dimension reduction, is the transformation of data from a high-dimensional space into a low-dimensional space so that the low-dimensional representation retains some meaningful properties of the original data, ideally ...
*
Canonical correlation analysis In statistics, canonical-correlation analysis (CCA), also called canonical variates analysis, is a way of inferring information from cross-covariance matrices. If we have two vectors ''X'' = (''X''1, ..., ''X'n'') and ''Y' ...
(CCA) *
Factor analysis Factor analysis is a statistical method used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved variables called factors. For example, it is possible that variations in six observed ...
* Feature extraction * Feature selection *
Independent component analysis In signal processing, independent component analysis (ICA) is a computational method for separating a multivariate signal into additive subcomponents. This is done by assuming that at most one subcomponent is Gaussian and that the subcomponents ar ...
(ICA) *
Linear discriminant analysis Linear discriminant analysis (LDA), normal discriminant analysis (NDA), or discriminant function analysis is a generalization of Fisher's linear discriminant, a method used in statistics and other fields, to find a linear combination of features ...
(LDA) *
Multidimensional scaling Multidimensional scaling (MDS) is a means of visualizing the level of similarity of individual cases of a dataset. MDS is used to translate "information about the pairwise 'distances' among a set of n objects or individuals" into a configurati ...
(MDS) * Non-negative matrix factorization (NMF) *
Partial least squares regression Partial least squares regression (PLS regression) is a statistical method that bears some relation to principal components regression; instead of finding hyperplanes of maximum variance between the response and independent variables, it finds a ...
(PLSR) *
Principal component analysis Principal component analysis (PCA) is a popular technique for analyzing large datasets containing a high number of dimensions/features per observation, increasing the interpretability of data while preserving the maximum amount of information, and ...
(PCA) * Principal component regression (PCR) *
Projection pursuit Projection pursuit (PP) is a type of statistical technique which involves finding the most "interesting" possible projections in multidimensional data. Often, projections which deviate more from a normal distribution are considered to be more inter ...
* Sammon mapping *
t-distributed stochastic neighbor embedding t-distributed stochastic neighbor embedding (t-SNE) is a statistical method for visualizing high-dimensional data by giving each datapoint a location in a two or three-dimensional map. It is based on Stochastic Neighbor Embedding originally de ...
(t-SNE)


Ensemble learning

Ensemble learning * AdaBoost * Boosting * Bootstrap aggregating (Bagging) *
Ensemble averaging In machine learning, particularly in the creation of artificial neural networks, ensemble averaging is the process of creating multiple models and combining them to produce a desired output, as opposed to creating just one model. Frequently an ens ...
– process of creating multiple models and combining them to produce a desired output, as opposed to creating just one model. Frequently an ensemble of models performs better than any individual model, because the various errors of the models "average out." * Gradient boosted decision tree (GBDT) * Gradient boosting machine (GBM) *
Random Forest Random forests or random decision forests is an ensemble learning method for classification, regression and other tasks that operates by constructing a multitude of decision trees at training time. For classification tasks, the output of ...
* Stacked Generalization (blending)


Meta-learning

Meta-learning * Inductive bias *
Metadata Metadata is "data that provides information about other data", but not the content of the data, such as the text of a message or the image itself. There are many distinct types of metadata, including: * Descriptive metadata – the descriptive ...


Reinforcement learning

Reinforcement learning * Q-learning * State–action–reward–state–action (SARSA) * Temporal difference learning (TD) *
Learning Automata A learning automaton is one type of machine learning algorithm studied since 1970s. Learning automata select their current action based on past experiences from the environment. It will fall into the range of reinforcement learning if the environme ...


Supervised learning

Supervised learning * Averaged one-dependence estimators (AODE) *
Artificial neural network Artificial neural networks (ANNs), usually simply called neural networks (NNs) or neural nets, are computing systems inspired by the biological neural networks that constitute animal brains. An ANN is based on a collection of connected unit ...
*
Case-based reasoning In artificial intelligence and philosophy, case-based reasoning (CBR), broadly construed, is the process of solving new problems based on the solutions of similar past problems. In everyday life, an auto mechanic who fixes an engine by recallin ...
*
Gaussian process regression In statistics, originally in geostatistics, kriging or Kriging, also known as Gaussian process regression, is a method of interpolation based on Gaussian process governed by prior covariances. Under suitable assumptions of the prior, kriging giv ...
* Gene expression programming * Group method of data handling (GMDH) * Inductive logic programming *
Instance-based learning In machine learning, instance-based learning (sometimes called memory-based learning) is a family of learning algorithms that, instead of performing explicit generalization, compare new problem instances with instances seen in training, which have ...
*
Lazy learning In machine learning, lazy learning is a learning method in which generalization of the training data is, in theory, delayed until a query is made to the system, as opposed to eager learning, where the system tries to generalize the training data be ...
*
Learning Automata A learning automaton is one type of machine learning algorithm studied since 1970s. Learning automata select their current action based on past experiences from the environment. It will fall into the range of reinforcement learning if the environme ...
* Learning Vector Quantization * Logistic Model Tree * Minimum message length (decision trees, decision graphs, etc.) ** Nearest Neighbor Algorithm **
Analogical modeling Analogical modeling (AM) is a formal theory of exemplar based analogical reasoning, proposed by Royal Skousen, professor of Linguistics and English language at Brigham Young University in Provo, Utah. It is applicable to language modeling and othe ...
* Probably approximately correct learning (PAC) learning *
Ripple down rules Ripple-down rules (RDR) are a way of approaching knowledge acquisition. Knowledge acquisition refers to the transfer of knowledge from human experts to knowledge-based systems. Introductory material Ripple-down rules are an incremental approac ...
, a knowledge acquisition methodology * Symbolic machine learning algorithms *
Support vector machine In machine learning, support vector machines (SVMs, also support vector networks) are supervised learning models with associated learning algorithms that analyze data for classification and regression analysis. Developed at AT&T Bell Laborat ...
s * Random Forests * Ensembles of classifiers ** Bootstrap aggregating (bagging) ** Boosting (meta-algorithm) * Ordinal classification * Information fuzzy networks (IFN) * Conditional Random Field * ANOVA *
Quadratic classifier In statistics, a quadratic classifier is a statistical classifier that uses a quadratic decision surface to separate measurements of two or more classes of objects or events. It is a more general version of the linear classifier. The classifica ...
s *
k-nearest neighbor In statistics, the ''k''-nearest neighbors algorithm (''k''-NN) is a non-parametric supervised learning method first developed by Evelyn Fix and Joseph Hodges in 1951, and later expanded by Thomas Cover. It is used for classification and regressi ...
* Boosting ** SPRINT *
Bayesian network A Bayesian network (also known as a Bayes network, Bayes net, belief network, or decision network) is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). Ba ...
s ** Naive Bayes *
Hidden Markov model A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process — call it X — with unobservable ("''hidden''") states. As part of the definition, HMM requires that there be an ...
s **
Hierarchical hidden Markov model The hierarchical hidden Markov model (HHMM) is a statistical model derived from the hidden Markov model (HMM). In an HHMM, each state is considered to be a self-contained probabilistic model. More precisely, each state of the HHMM is itself an HHMM ...


Bayesian

Bayesian statistics Bayesian statistics is a theory in the field of statistics based on the Bayesian interpretation of probability where probability expresses a ''degree of belief'' in an event. The degree of belief may be based on prior knowledge about the event, ...
* Bayesian knowledge base * Naive Bayes *
Gaussian Naive Bayes In statistics, naive Bayes classifiers are a family of simple " probabilistic classifiers" based on applying Bayes' theorem with strong (naive) independence assumptions between the features (see Bayes classifier). They are among the simplest Baye ...
* Multinomial Naive Bayes * Averaged One-Dependence Estimators (AODE) * Bayesian Belief Network (BBN) *
Bayesian Network A Bayesian network (also known as a Bayes network, Bayes net, belief network, or decision network) is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). Ba ...
(BN)


Decision tree algorithms

Decision tree algorithm *
Decision tree A decision tree is a decision support tool that uses a tree-like model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. It is one way to display an algorithm that only contains con ...
* Classification and regression tree (CART) * Iterative Dichotomiser 3 (ID3) * C4.5 algorithm * C5.0 algorithm * Chi-squared Automatic Interaction Detection (CHAID) * Decision stump * Conditional decision tree *
ID3 algorithm In decision tree learning, ID3 (Iterative Dichotomiser 3) is an algorithm invented by Ross QuinlanQuinlan, J. R. 1986. Induction of Decision Trees. Mach. Learn. 1, 1 (Mar. 1986), 81–106 used to generate a decision tree from a dataset. ID3 is th ...
*
Random forest Random forests or random decision forests is an ensemble learning method for classification, regression and other tasks that operates by constructing a multitude of decision trees at training time. For classification tasks, the output of ...
* SLIQ


Linear classifier

Linear classifier In the field of machine learning, the goal of statistical classification is to use an object's characteristics to identify which class (or group) it belongs to. A linear classifier achieves this by making a classification decision based on the v ...
* Fisher's linear discriminant *
Linear regression In statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as dependent and independent variables). The case of one explanatory variable is cal ...
*
Logistic regression In statistics, the logistic model (or logit model) is a statistical model that models the probability of an event taking place by having the log-odds for the event be a linear combination of one or more independent variables. In regression an ...
*
Multinomial logistic regression In statistics, multinomial logistic regression is a classification method that generalizes logistic regression to multiclass problems, i.e. with more than two possible discrete outcomes. That is, it is a model that is used to predict the prob ...
*
Naive Bayes classifier In statistics, naive Bayes classifiers are a family of simple " probabilistic classifiers" based on applying Bayes' theorem with strong (naive) independence assumptions between the features (see Bayes classifier). They are among the simplest Bay ...
* Perceptron *
Support vector machine In machine learning, support vector machines (SVMs, also support vector networks) are supervised learning models with associated learning algorithms that analyze data for classification and regression analysis. Developed at AT&T Bell Laborat ...


Unsupervised learning

Unsupervised learning * Expectation-maximization algorithm *
Vector Quantization Vector quantization (VQ) is a classical quantization technique from signal processing that allows the modeling of probability density functions by the distribution of prototype vectors. It was originally used for data compression. It works by di ...
* Generative topographic map *
Information bottleneck method The information bottleneck method is a technique in information theory introduced by Naftali Tishby, Fernando C. Pereira, and William Bialek. It is designed for finding the best tradeoff between accuracy and complexity ( compression) when summarizi ...
*
Association rule learning Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using some measures of interestingness.Pi ...
algorithms **
Apriori algorithm AprioriRakesh Agrawal and Ramakrishnan SrikanFast algorithms for mining association rules Proceedings of the 20th International Conference on Very Large Data Bases, VLDB, pages 487-499, Santiago, Chile, September 1994. is an algorithm for frequent ...
**
Eclat algorithm Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using some measures of interestingness.Pi ...


Artificial neural networks

Artificial neural network Artificial neural networks (ANNs), usually simply called neural networks (NNs) or neural nets, are computing systems inspired by the biological neural networks that constitute animal brains. An ANN is based on a collection of connected unit ...
*
Feedforward neural network A feedforward neural network (FNN) is an artificial neural network wherein connections between the nodes do ''not'' form a cycle. As such, it is different from its descendant: recurrent neural networks. The feedforward neural network was the ...
**
Extreme learning machine Extreme learning machines are feedforward neural networks for classification, regression, clustering, sparse approximation, compression and feature learning with a single layer or multiple layers of hidden nodes, where the parameters of hidden n ...
**
Convolutional neural network In deep learning, a convolutional neural network (CNN, or ConvNet) is a class of artificial neural network (ANN), most commonly applied to analyze visual imagery. CNNs are also known as Shift Invariant or Space Invariant Artificial Neural Netwo ...
* Recurrent neural network ** Long short-term memory (LSTM) * Logic learning machine * Self-organizing map


Association rule learning

Association rule learning Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using some measures of interestingness.Pi ...
*
Apriori algorithm AprioriRakesh Agrawal and Ramakrishnan SrikanFast algorithms for mining association rules Proceedings of the 20th International Conference on Very Large Data Bases, VLDB, pages 487-499, Santiago, Chile, September 1994. is an algorithm for frequent ...
*
Eclat algorithm Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using some measures of interestingness.Pi ...
* FP-growth algorithm


Hierarchical clustering

Hierarchical clustering In data mining and statistics, hierarchical clustering (also called hierarchical cluster analysis or HCA) is a method of cluster analysis that seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into t ...
* Single-linkage clustering *
Conceptual clustering Conceptual clustering is a machine learning paradigm for unsupervised classification that has been defined by Ryszard S. Michalski in 1980 (Fisher 1987, Michalski 1980) and developed mainly during the 1980s. It is distinguished from ordinary dat ...


Cluster analysis

Cluster analysis Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters). It is a main task of ...
*
BIRCH A birch is a thin-leaved deciduous hardwood tree of the genus ''Betula'' (), in the family Betulaceae, which also includes alders, hazels, and hornbeams. It is closely related to the beech- oak family Fagaceae. The genus ''Betula'' cont ...
* DBSCAN * Expectation-maximization (EM) *
Fuzzy clustering Fuzzy clustering (also referred to as soft clustering or soft ''k''-means) is a form of clustering in which each data point can belong to more than one cluster. Clustering or cluster analysis involves assigning data points to clusters such that i ...
*
Hierarchical Clustering In data mining and statistics, hierarchical clustering (also called hierarchical cluster analysis or HCA) is a method of cluster analysis that seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into t ...
*
K-means clustering ''k''-means clustering is a method of vector quantization, originally from signal processing, that aims to partition ''n'' observations into ''k'' clusters in which each observation belongs to the cluster with the nearest mean (cluster centers ...
* K-medians * Mean-shift * OPTICS algorithm


Anomaly detection

Anomaly detection * ''k''-nearest neighbors algorithm (''k''-NN) *
Local outlier factor In anomaly detection, the local outlier factor (LOF) is an algorithm proposed by Markus M. Breunig, Hans-Peter Kriegel, Raymond T. Ng and Jörg Sander in 2000 for finding anomalous data points by measuring the local deviation of a given data point ...


Semi-supervised learning

Semi-supervised learning *
Active learning Active learning is "a method of learning in which students are actively or experientially involved in the learning process and where there are different levels of active learning, depending on student involvement." states that "students partici ...
– special case of semi-supervised learning in which a learning algorithm is able to interactively query the user (or some other information source) to obtain the desired outputs at new data points. * Generative models * Low-density separation * Graph-based methods *
Co-training Co-training is a machine learning algorithm used when there are only small amounts of labeled data and large amounts of unlabeled data. One of its uses is in text mining for search engines. It was introduced by Avrim Blum and Tom Mitchell in 1998. ...
* Transduction


Deep learning

Deep learning Deep learning (also known as deep structured learning) is part of a broader family of machine learning methods based on artificial neural networks with representation learning. Learning can be supervised, semi-supervised or unsupervised. ...
* Deep belief networks * Deep
Boltzmann machine A Boltzmann machine (also called Sherrington–Kirkpatrick model with external field or stochastic Ising–Lenz–Little model) is a stochastic spin-glass model with an external field, i.e., a Sherrington–Kirkpatrick model, that is a stochastic ...
s * Deep
Convolutional neural network In deep learning, a convolutional neural network (CNN, or ConvNet) is a class of artificial neural network (ANN), most commonly applied to analyze visual imagery. CNNs are also known as Shift Invariant or Space Invariant Artificial Neural Netwo ...
s * Deep Recurrent neural networks *
Hierarchical temporal memory Hierarchical temporal memory (HTM) is a biologically constrained machine intelligence technology developed by Numenta. Originally described in the 2004 book ''On Intelligence'' by Jeff Hawkins with Sandra Blakeslee, HTM is primarily used today for ...
* Generative Adversarial Network ** Style transfer *
Transformer A transformer is a passive component that transfers electrical energy from one electrical circuit to another circuit, or multiple circuits. A varying current in any coil of the transformer produces a varying magnetic flux in the transformer' ...
* Stacked Auto-Encoders


Other machine learning methods and problems

* Anomaly detection * Association rules * Bias-variance dilemma *
Classification Classification is a process related to categorization, the process in which ideas and objects are recognized, differentiated and understood. Classification is the grouping of related facts into classes. It may also refer to: Business, organizat ...
**
Multi-label classification In machine learning, multi-label classification or multi-output classification is a variant of the classification problem where multiple nonexclusive labels may be assigned to each instance. Multi-label classification is a generalization of mult ...
* Clustering *
Data Pre-processing Data preprocessing can refer to manipulation or dropping of data before it is used in order to ensure or enhance performance, and is an important step in the data mining process. The phrase "garbage in, garbage out" is particularly applicable to ...
* Empirical risk minimization *
Feature engineering Feature engineering or feature extraction or feature discovery is the process of using domain knowledge to extract features (characteristics, properties, attributes) from raw data. The motivation is to use these extra features to improve the qua ...
* Feature learning * Learning to rank *
Occam learning In computational learning theory, Occam learning is a model of algorithmic learning where the objective of the learner is to output a succinct representation of received training data. This is closely related to probably approximately correct (P ...
*
Online machine learning In computer science, online machine learning is a method of machine learning in which data becomes available in a sequential order and is used to update the best predictor for future data at each step, as opposed to batch learning techniques whi ...
* PAC learning * Regression * Reinforcement Learning * Semi-supervised learning *
Statistical learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...
*
Structured prediction Structured prediction or structured (output) learning is an umbrella term for supervised machine learning techniques that involves predicting structured objects, rather than scalar discrete or real values. Similar to commonly used supervised l ...
** Graphical models ***
Bayesian network A Bayesian network (also known as a Bayes network, Bayes net, belief network, or decision network) is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). Ba ...
*** Conditional random field (CRF) ***
Hidden Markov model A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process — call it X — with unobservable ("''hidden''") states. As part of the definition, HMM requires that there be an ...
(HMM) * Unsupervised learning * VC theory


Machine learning research

* List of artificial intelligence projects *
List of datasets for machine learning research These datasets are applied for machine learning research and have been cited in peer-reviewed academic journals. Datasets are an integral part of the field of machine learning. Major advances in this field can result from advances in learning a ...


History of machine learning

History of machine learning * Timeline of machine learning


Machine learning projects

Machine learning projects *
DeepMind DeepMind Technologies is a British artificial intelligence subsidiary of Alphabet Inc. and research laboratory founded in 2010. DeepMind was acquired by Google in 2014 and became a wholly owned subsidiary of Alphabet Inc, after Google's restru ...
*
Google Brain Google Brain is a deep learning artificial intelligence research team under the umbrella of Google AI, a research division at Google dedicated to artificial intelligence. Formed in 2011, Google Brain combines open-ended machine learning research ...
*
OpenAI OpenAI is an artificial intelligence (AI) research laboratory consisting of the for-profit corporation OpenAI LP and its parent company, the non-profit OpenAI Inc. The company conducts research in the field of AI with the stated goal of promo ...
* Meta AI


Machine learning organizations

Machine learning organizations *
Knowledge Engineering and Machine Learning Group The Knowledge Engineering and Machine Learning group (KEMLg) is a research group belonging to the Technical University of Catalonia (UPC) – BarcelonaTech. It was founded by Prof. Ulises Cortés. The group has been active in the Artificial I ...


Machine learning conferences and workshops

* Artificial Intelligence and Security (AISec) (co-located workshop with CCS) *
Conference on Neural Information Processing Systems The Conference and Workshop on Neural Information Processing Systems (abbreviated as NeurIPS and formerly NIPS) is a machine learning and computational neuroscience conference held every December. The conference is currently a double-track meet ...
(NIPS) * ECML PKDD *
International Conference on Machine Learning The International Conference on Machine Learning (ICML) is the leading international academic conference in machine learning. Along with NeurIPS and ICLR, it is one of the three primary conferences of high impact in machine learning and artific ...
(ICML)
ML4ALL
(Machine Learning For All)


Machine learning publications


Books on machine learning

Books about machine learning


Machine learning journals

* ''
Machine Learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...
'' * ''
Journal of Machine Learning Research The ''Journal of Machine Learning Research'' is a peer-reviewed open access scientific journal covering machine learning. It was established in 2000 and the first editor-in-chief was Leslie Kaelbling. The current editors-in-chief are Francis Bac ...
'' (JMLR) * ''
Neural Computation Neural computation is the information processing performed by networks of neurons. Neural computation is affiliated with the philosophical tradition known as Computational theory of mind, also referred to as computationalism, which advances the t ...
''


Persons influential in machine learning

* Alberto Broggi * Andrei Knyazev *
Andrew McCallum Andrew McCallum is a professor in the computer science department at University of Massachusetts Amherst. His primary specialties are in machine learning, natural language processing, information extraction, information integration, and socia ...
*
Andrew Ng Andrew Yan-Tak Ng (; born 1976) is a British-born American computer scientist and technology entrepreneur focusing on machine learning and AI. Ng was a co-founder and head of Google Brain and was the former Chief Scientist at Baidu, buildin ...
* Anuraag Jain * Armin B. Cremers * Ayanna Howard * Barney Pell * Ben Goertzel *
Ben Taskar Ben Taskar (March 3, 1977 – November 18, 2013) was a professor and researcher in the area of machine learning and applications to computational linguistics and computer vision. He was a Magerman Term Associate Professor for Computer and Infor ...
*
Bernhard Schölkopf Bernhard Schölkopf is a German computer scientist (born 20 February 1968) known for his work in machine learning, especially on kernel methods and causality. He is a director at the Max Planck Institute for Intelligent Systems in Tübingen, ...
* Brian D. Ripley * Christopher G. Atkeson * Corinna Cortes * Demis Hassabis * Douglas Lenat *
Eric Xing Eric Poe Xing is an American computer scientist, academic administrator, and entrepreneur. Prior to his appointment as President of MBZUAI, Xing was a professor in the School of Computer Science at Carnegie Mellon University and researcher in mac ...
*
Ernst Dickmanns Ernst Dieter Dickmanns is a German pioneer of dynamic computer vision and of driverless cars. Dickmanns has been a professor at Bundeswehr University Munich (1975–2001), and visiting professor to Caltech and to MIT, teaching courses on "dynami ...
* Geoffrey Hinton – co-inventor of the backpropagation and contrastive divergence training algorithms * Hans-Peter Kriegel * Hartmut Neven * Heikki Mannila *
Ian Goodfellow Ian J. Goodfellow (born ) is a computer scientist, engineer, and executive, most noted for his work on artificial neural networks and deep learning. He was previously employed as a research scientist at Google Brain and director of machine learn ...
– Father of Generative & adversarial networks * Jacek M. Zurada *
Jaime Carbonell Jaime Guillermo Carbonell (July 29, 1953 – February 28, 2020) was a computer scientist who made seminal contributions to the development of natural language processing tools and technologies. His extensive research in machine translation result ...
* Jeremy Slovak *
Jerome H. Friedman Jerome Harold Friedman (born December 29, 1939) is an American statistician, consultant and Professor of Statistics at Stanford University, known for his contributions in the field of statistics and data mining.
*
John D. Lafferty John D. Lafferty is an American scientist, Professor at Yale University and leading researcher in machine learning. He is best known for proposing the Conditional Random Fields with Andrew McCallum and Fernando C.N. Pereira. Biography In 2017, ...
* John Platt – invented SMO and Platt scaling * Julie Beth Lovins *
Jürgen Schmidhuber Jürgen Schmidhuber (born 17 January 1963) is a German computer scientist most noted for his work in the field of artificial intelligence, deep learning and artificial neural networks. He is a co-director of the Dalle Molle Institute for Artific ...
* Karl Steinbuch *
Katia Sycara Ekaterini Panagiotou Sycara ( el, Κάτια Συκαρά) is a Greek computer scientist. She is an Edward Fredkin Research Professor of Robotics in the Robotics Institute, School of Computer Science at Carnegie Mellon University internationally ...
*
Leo Breiman Leo Breiman (January 27, 1928 – July 5, 2005) was a distinguished statistician at the University of California, Berkeley. He was the recipient of numerous honors and awards, and was a member of the United States National Academy of Sciences ...
– invented bagging and random forests *
Lise Getoor Lise Getoor is a professor in the computer science department, at the University of California, Santa Cruz, and an adjunct professor in the Computer Science Department at the University of Maryland, College Park. Her primary research interests ...
*
Luca Maria Gambardella Luca Maria Gambardella (born 4 January 1962) is an Italian computer scientist and author. He is the former director of the Dalle Molle Institute for Artificial Intelligence Research in Manno, in the Ticino canton of Switzerland. With Marco Do ...
*
Léon Bottou Léon Bottou (born 1965) is a researcher best known for his work in machine learning and data compression. His work presents stochastic gradient descent as a fundamental learning algorithm. He is also one of the main creators of the DjVu image comp ...
* Marcus Hutter * Mehryar Mohri *
Michael Collins Michael Collins or Mike Collins most commonly refers to: * Michael Collins (Irish leader) (1890–1922), Irish revolutionary leader, soldier, and politician * Michael Collins (astronaut) (1930–2021), American astronaut, member of Apollo 11 and ...
*
Michael I. Jordan Michael Irwin Jordan (born February 25, 1956) is an American scientist, professor at the University of California, Berkeley and researcher in machine learning, statistics, and artificial intelligence. Jordan was elected a member of the Nat ...
* Michael L. Littman *
Nando de Freitas Nando de Freitas is a researcher in the field of machine learning, and in particular in the subfields of neural networks, Bayesian inference and Bayesian optimization, and deep learning. Biography De Freitas was born in Zimbabwe. He did his un ...
* Ofer Dekel *
Oren Etzioni Oren Etzioni (born 1964) is an American entrepreneur, Professor Emeritus of computer science, and founding CEO of the Allen Institute for Artificial Intelligence (AI2). On June 15, 2022, he announced that he will step down as CEO of AI2 effective ...
*
Pedro Domingos Pedro Domingos is a Professor Emeritus of computer science and engineering at the University of Washington. He is a researcher in machine learning known for Markov logic network enabling uncertain inference. Education Domingos received an un ...
* Peter Flach * Pierre Baldi * Pushmeet Kohli *
Ray Kurzweil Raymond Kurzweil ( ; born February 12, 1948) is an American computer scientist, author, inventor, and futurist. He is involved in fields such as optical character recognition (OCR), text-to-speech synthesis, speech recognition technology, and e ...
* Rayid Ghani * Ross Quinlan * Salvatore J. Stolfo * Sebastian Thrun *
Selmer Bringsjord Selmer Bringsjord (born November 24, 1958) is the chair of the Department of Cognitive Science at Rensselaer Polytechnic Institute and a professor of Computer Science and Cognitive Science. He also holds an appointment in the Lally School of ...
*
Sepp Hochreiter Josef "Sepp" Hochreiter (born 14 February 1967) is a German computer scientist. Since 2018 he has led the Institute for Machine Learning at the Johannes Kepler University of Linz after having led the Institute of Bioinformatics from 2006 to 2018 ...
* Shane Legg *
Stephen Muggleton Stephen H. Muggleton FBCS, FIET, FAAAI, FECCAI, FSB, FREng (born 6 December 1959, son of Louis Muggleton) is Professor of Machine Learning and Head of the Computational Bioinformatics Laboratory at Imperial College London.Steve Omohundro * Tom M. Mitchell * Trevor Hastie *
Vasant Honavar Vasant G. Honavar is an Indian born American computer scientist, and artificial intelligence, machine learning, big data, data science, causal inference, knowledge representation, bioinformatics and health informatics researcher and professor. ...
*
Vladimir Vapnik Vladimir Naumovich Vapnik (russian: Владимир Наумович Вапник; born 6 December 1936) is one of the main developers of the Vapnik–Chervonenkis theory of statistical learning, and the co-inventor of the support-vector machin ...
– co-inventor of the SVM and VC theory *
Yann LeCun Yann André LeCun ( , ; originally spelled Le Cun; born 8 July 1960) is a French computer scientist working primarily in the fields of machine learning, computer vision, mobile robotics and computational neuroscience. He is the Silver Professo ...
– invented convolutional neural networks * Yasuo Matsuyama *
Yoshua Bengio Yoshua Bengio (born March 5, 1964) is a Canadian computer scientist, most noted for his work on artificial neural networks and deep learning. He is a professor at the Department of Computer Science and Operations Research at the Université de ...
* Zoubin Ghahramani


See also

* Outline of artificial intelligence **
Outline of computer vision The following outline is provided as an overview of and topical guide to computer vision: Computer vision – interdisciplinary field that deals with how computers can be made to gain high-level understanding from digital images or videos. ...
* Outline of robotics * Accuracy paradox *
Action model learning Action model learning (sometimes abbreviated action learning) is an area of machine learning concerned with creation and modification of software agent's knowledge about ''effects'' and ''preconditions'' of the ''actions'' that can be executed wi ...
* Activation function *
Activity recognition Activity recognition aims to recognize the actions and goals of one or more agents from a series of observations on the agents' actions and the environmental conditions. Since the 1980s, this research field has captured the attention of several c ...
* ADALINE *
Adaptive neuro fuzzy inference system An adaptive neuro-fuzzy inference system or adaptive network-based fuzzy inference system (ANFIS) is a kind of artificial neural network that is based on Takagi–Sugeno fuzzy inference system. The technique was developed in the early 1990s. Since ...
* Adaptive resonance theory * Additive smoothing * Adjusted mutual information *
AIVA AIVA (Artificial Intelligence Virtual Artist) is an algorithmic composition, electronic composer recognized by the SACEM. Description Created in February 2016, AIVA specializes in Classical music, classical and symphonic music composition. It b ...
* AIXI * AlchemyAPI *
AlexNet AlexNet is the name of a convolutional neural network (CNN) architecture, designed by Alex Krizhevsky in collaboration with Ilya Sutskever and Geoffrey Hinton, who was Krizhevsky's Ph.D. advisor. AlexNet competed in the ImageNet Large Scale Vi ...
* Algorithm selection *
Algorithmic inference Algorithmic inference gathers new developments in the statistical inference methods made feasible by the powerful computing devices widely available to any data analyst. Cornerstones in this field are computational learning theory, granular computin ...
* Algorithmic learning theory *
AlphaGo AlphaGo is a computer program that plays the board game Go. It was developed by DeepMind Technologies a subsidiary of Google (now Alphabet Inc.). Subsequent versions of AlphaGo became increasingly powerful, including a version that competed u ...
* AlphaGo Zero * Alternating decision tree * Apprenticeship learning *
Causal Markov condition The Markov condition, sometimes called the Markov assumption, is an assumption made in Bayesian probability theory, that every node in a Bayesian network is conditionally independent of its nondescendants, given its parents. Stated loosely, it is as ...
*
Competitive learning Competitive learning is a form of unsupervised learning in artificial neural networks, in which nodes compete for the right to respond to a subset of the input data. A variant of Hebbian learning, competitive learning works by increasing the special ...
* Concept learning *
Decision tree learning Decision tree learning is a supervised learning approach used in statistics, data mining and machine learning. In this formalism, a classification or regression decision tree is used as a predictive model to draw conclusions about a set of ...
* Differentiable programming * Distribution learning theory *
Eager learning In artificial intelligence, eager learning is a learning method in which the system tries to construct a general, input-independent target function during training of the system, as opposed to lazy learning, where generalization beyond the training ...
* End-to-end reinforcement learning * Error tolerance (PAC learning) * Explanation-based learning * Feature *
GloVe A glove is a garment covering the hand. Gloves usually have separate sheaths or openings for each finger and the thumb. If there is an opening but no (or a short) covering sheath for each finger they are called fingerless gloves. Fingerless g ...
* Hyperparameter *
Inferential theory of learning Inferential Theory of Learning (ITL) is an area of machine learning which describes inferential processes performed by learning agents. ITL has been continuously developed by Ryszard S. Michalski, starting in the 1980s. The first known publication ...
*
Learning automata A learning automaton is one type of machine learning algorithm studied since 1970s. Learning automata select their current action based on past experiences from the environment. It will fall into the range of reinforcement learning if the environme ...
* Learning classifier system * Learning rule * Learning with errors * M-Theory (learning framework) *
Machine learning control Machine learning control (MLC) is a subfield of machine learning, intelligent control and control theory which solves optimal control problems with methods of machine learning. Key applications are complex nonlinear systems for which linear cont ...
* Machine learning in bioinformatics *
Margin Margin may refer to: Physical or graphical edges * Margin (typography), the white space that surrounds the content of a page *Continental margin, the zone of the ocean floor that separates the thin oceanic crust from thick continental crust *Leaf ...
* Markov chain geostatistics *
Markov chain Monte Carlo In statistics, Markov chain Monte Carlo (MCMC) methods comprise a class of algorithms for sampling from a probability distribution. By constructing a Markov chain that has the desired distribution as its equilibrium distribution, one can obtain ...
(MCMC) * Markov information source *
Markov logic network A Markov logic network (MLN) is a probabilistic logic which applies the ideas of a Markov network to first-order logic, enabling uncertain inference. Markov logic networks generalize first-order logic, in the sense that, in a certain limit, all u ...
*
Markov model In probability theory, a Markov model is a stochastic model used to model pseudo-randomly changing systems. It is assumed that future states depend only on the current state, not on the events that occurred before it (that is, it assumes the Mark ...
* Markov random field *
Markovian discrimination Within the probability theory Markov model, Markovian discrimination in spam filtering is a method used in CRM114 and other spam filters to model the statistical behaviors of spam and nonspam more accurately than in simple Bayesian methods. A s ...
* Maximum-entropy Markov model *
Multi-armed bandit In probability theory and machine learning, the multi-armed bandit problem (sometimes called the ''K''- or ''N''-armed bandit problem) is a problem in which a fixed limited set of resources must be allocated between competing (alternative) choices ...
*
Multi-task learning Multi-task learning (MTL) is a subfield of machine learning in which multiple learning tasks are solved at the same time, while exploiting commonalities and differences across tasks. This can result in improved learning efficiency and prediction ac ...
* Multilinear subspace learning * Multimodal learning * Multiple instance learning * Multiple-instance learning *
Never-Ending Language Learning Never-Ending Language Learning system (NELL) is a semantic machine learning system developed by a research team at Carnegie Mellon University, and supported by grants from DARPA, Google, NSF, and CNPq with portions of the system running on a superc ...
* Offline learning *
Parity learning Parity learning is a problem in machine learning. An algorithm that solves this problem must find a function ''ƒ'', given some samples (''x'', ''ƒ''(''x'')) and the assurance that ''ƒ'' computes the parity of bits at some fixed locations. T ...
*
Population-based incremental learning In computer science and machine learning, population-based incremental learning (PBIL) is an optimization algorithm, and an estimation of distribution algorithm. This is a type of genetic algorithm where the genotype of an entire population (probab ...
* Predictive learning * Preference learning * Proactive learning *
Proximal gradient methods for learning Proximal gradient (forward backward splitting) methods for learning is an area of research in optimization and statistical learning theory which studies algorithms for a general class of convex regularization problems where the regularization pena ...
* Semantic analysis *
Similarity learning Similarity learning is an area of supervised machine learning in artificial intelligence. It is closely related to regression and classification, but the goal is to learn a similarity function that measures how similar or related two objects are. ...
* Sparse dictionary learning * Stability (learning theory) *
Statistical learning theory Statistical learning theory is a framework for machine learning drawing from the fields of statistics and functional analysis. Statistical learning theory deals with the statistical inference problem of finding a predictive function based on dat ...
*
Statistical relational learning Statistical relational learning (SRL) is a subdiscipline of artificial intelligence and machine learning that is concerned with domain models that exhibit both uncertainty (which can be dealt with using statistical methods) and complex, relational ...
*
Tanagra Tanagra ( el, Τανάγρα) is a town and a municipality north of Athens in Boeotia, Greece. The seat of the municipality is the town Schimatari. It is not far from Thebes, and it was noted in antiquity for the figurines named after it. The T ...
*
Transfer learning Transfer learning (TL) is a research problem in machine learning (ML) that focuses on storing knowledge gained while solving one problem and applying it to a different but related problem. For example, knowledge gained while learning to recognize ...
* Variable-order Markov model *
Version space learning Version space learning is a logical approach to machine learning, specifically binary classification. Version space learning algorithms search a predefined space of hypotheses, viewed as a set of logical sentences. Formally, the hypothesis space i ...
*
Waffles A waffle is a dish made from leavened batter or dough that is cooked between two plates that are patterned to give a characteristic size, shape, and surface impression. There are many variations based on the type of waffle iron and recipe used ...
* Weka *
Loss function In mathematical optimization and decision theory, a loss function or cost function (sometimes also called an error function) is a function that maps an event or values of one or more variables onto a real number intuitively representing some "cos ...
**
Loss functions for classification In machine learning and mathematical optimization, loss functions for classification are computationally feasible loss functions representing the price paid for inaccuracy of predictions in classification problems (problems of identifying whi ...
**
Mean squared error In statistics, the mean squared error (MSE) or mean squared deviation (MSD) of an estimator (of a procedure for estimating an unobserved quantity) measures the average of the squares of the errors—that is, the average squared difference between ...
(MSE) ** Mean squared prediction error (MSPE) **
Taguchi loss function The Taguchi loss function is graphical depiction of loss developed by the Japanese business statistician Genichi Taguchi to describe a phenomenon affecting the value of products produced by a company. Praised by Dr. W. Edwards Deming (the business ...
* Low-energy adaptive clustering hierarchy


Other

*
Anne O'Tate Anne O'Tate is a free, web-based application that analyses sets of records identified on PubMed, the bibliographic database of articles from over 5,500 biomedical journals worldwide. While PubMed has its own wide range of search options to identi ...
* Ant colony optimization algorithms * Anthony Levandowski * Anti-unification (computer science) * Apache Flume * Apache Giraph *
Apache Mahout Apache Mahout is a project of the Apache Software Foundation to produce free implementations of distributed or otherwise scalable machine learning algorithms focused primarily on linear algebra. In the past, many of the implementations use the ...
* Apache SINGA *
Apache Spark Apache Spark is an open-source unified analytics engine for large-scale data processing. Spark provides an interface for programming clusters with implicit data parallelism and fault tolerance. Originally developed at the University of Califor ...
* Apache SystemML * Aphelion (software) * Arabic Speech Corpus * Archetypal analysis * Arthur Zimek * Artificial ants *
Artificial bee colony algorithm In computer science and operations research, the artificial bee colony algorithm (ABC) is an optimization algorithm based on the intelligent foraging behaviour of honey bee swarm, proposed by Derviş Karaboğa (Erciyes University) in 2005. Alg ...
*
Artificial development Artificial development, also known as artificial embryogeny or machine intelligence or computational development, is an area of computer science and engineering concerned with computational models motivated by genotype–phenotype mappings in biol ...
*
Artificial immune system In artificial intelligence, artificial immune systems (AIS) are a class of computationally intelligent, rule-based machine learning systems inspired by the principles and processes of the vertebrate immune system. The algorithms are typically mode ...
*
Astrostatistics Astrostatistics is a discipline which spans astrophysics, statistical analysis and data mining. It is used to process the vast amount of data produced by automated scanning of the cosmos, to characterize complex datasets, and to link astronomica ...
* Averaged one-dependence estimators * Bag-of-words model * Balanced clustering *
Ball tree In computer science, a ball tree, balltree or metric tree, is a space partitioning data structure for organizing points in a multi-dimensional space. A ball tree partitions data points into a nested set of balls. The resulting data structure ...
* Base rate * Bat algorithm *
Baum–Welch algorithm In electrical engineering, statistical computing and bioinformatics, the Baum–Welch algorithm is a special case of the expectation–maximization algorithm used to find the unknown parameters of a hidden Markov model (HMM). It makes use of the ...
* Bayesian hierarchical modeling * Bayesian interpretation of kernel regularization * Bayesian optimization * Bayesian structural time series * Bees algorithm * Behavioral clustering *
Bernoulli scheme In mathematics, the Bernoulli scheme or Bernoulli shift is a generalization of the Bernoulli process to more than two possible outcomes. Bernoulli schemes appear naturally in symbolic dynamics, and are thus important in the study of dynamical sy ...
*
Bias–variance tradeoff In statistics and machine learning, the bias–variance tradeoff is the property of a model that the variance of the parameter estimated across samples can be reduced by increasing the bias in the estimated parameters. The bias–variance di ...
* Biclustering * BigML *
Binary classification Binary classification is the task of classifying the elements of a set into two groups (each called ''class'') on the basis of a classification rule. Typical binary classification problems include: * Medical testing to determine if a patient has c ...
* Bing Predicts *
Bio-inspired computing Bio-inspired computing, short for biologically inspired computing, is a field of study which seeks to solve computer science problems using models of biology. It relates to connectionism, social behavior, and emergence. Within computer science, b ...
* Biogeography-based optimization *
Biplot Biplots are a type of exploratory graph used in statistics, a generalization of the simple two-variable scatterplot. A biplot overlays a ''score plot'' with a ''loading plot''. A biplot allows information on both samples and variables of a d ...
* Bondy's theorem * Bongard problem * Bradley–Terry model * BrownBoost * Brown clustering *
Burst error In telecommunication, a burst error or error burst is a contiguous sequence of symbols, received over a communication channel, such that the first and last symbols are in error and there exists no contiguous subsequence of ''m'' correctly receive ...
*
CBCL (MIT) The Center for Biological & Computational Learning is a research lab at the Massachusetts Institute of Technology. CBCL was established in 1992 with support from the National Science Foundation. It is based in the Department of Brain & Cognitive Sc ...
* CIML community portal *
CMA-ES Covariance matrix adaptation evolution strategy (CMA-ES) is a particular kind of strategy for numerical optimization. Evolution strategies (ES) are stochastic, derivative-free methods for numerical optimization of non-linear or non-convex continu ...
* CURE data clustering algorithm * Cache language model *
Calibration (statistics) There are two main uses of the term calibration in statistics that denote special types of statistical inference problems. "Calibration" can mean :*a reverse process to regression, where instead of a future dependent variable being predicted from ...
* Canonical correspondence analysis * Canopy clustering algorithm *
Cascading classifiers Cascading is a particular case of ensemble learning based on the concatenation of several classifiers, using all information collected from the output from a given classifier as additional information for the next classifier in the cascade. Unl ...
* Category utility * CellCognition * Cellular evolutionary algorithm * Chi-square automatic interaction detection *
Chromosome (genetic algorithm) In genetic algorithms, a chromosome (also sometimes called a genotype) is a set of parameters which define a proposed solution to the problem that the genetic algorithm is trying to solve. The set of all solutions is known as the ''population''. T ...
* Classifier chains *
Cleverbot Cleverbot is a chatterbot web application that uses machine learning techniques to have conversations with humans. It was created by British AI scientist Rollo Carpenter. It was preceded by Jabberwacky, a chatbot project that began in 1988 a ...
*
Clonal selection algorithm In artificial immune systems, clonal selection algorithms are a class of algorithms inspired by the clonal selection theory of acquired immunity that explains how B and T lymphocytes improve their response to antigens over time called affinity mat ...
* Cluster-weighted modeling *
Clustering high-dimensional data Clustering high-dimensional data is the cluster analysis of data with anywhere from a few dozen to many thousands of dimensions. Such high-dimensional spaces of data are often encountered in areas such as medicine, where DNA microarray technology ca ...
*
Clustering illusion The clustering illusion is the tendency to erroneously consider the inevitable "streaks" or "clusters" arising in small samples from random distributions to be non-random. The illusion is caused by a human tendency to underpredict the amount of v ...
* CoBoosting * Cobweb (clustering) *
Cognitive computer A cognitive computer is a computer that hardwires artificial intelligence and machine-learning algorithms into an integrated circuit (printed circuit board) that closely reproduces the behavior of the human brain. It generally adopts a neuromorphic ...
* Cognitive robotics *
Collostructional analysis Collostructional analysis is a family of methods developed by (in alphabetical order) Stefan Th. Gries (University of California, Santa Barbara) and Anatol Stefanowitsch (Free University of Berlin). Collostructional analysis aims at measuring the d ...
* Common-method variance *
Complete-linkage clustering Complete-linkage clustering is one of several methods of agglomerative hierarchical clustering. At the beginning of the process, each element is in a cluster of its own. The clusters are then sequentially combined into larger clusters until all ...
* Computer-automated design *
Concept class In computational learning theory in mathematics, a concept over a domain ''X'' is a total Boolean function over ''X''. A concept class is a class of concepts. Concept classes are a subject of computational learning theory. Concept class terminolog ...
*
Concept drift In predictive analytics and machine learning, concept drift means that the statistical properties of the target variable, which the model is trying to predict, change over time in unforeseen ways. This causes problems because the predictions become ...
*
Conference on Artificial General Intelligence The Conference on Artificial General Intelligence is a meeting of researchers in the field of Artificial General Intelligence organized by thAGI Societyand held annually since 2008. The conference was initiated by the 2006 Bethesda Artificial Gen ...
* Conference on Knowledge Discovery and Data Mining * Confirmatory factor analysis * Confusion matrix *
Congruence coefficient In multivariate statistics, the congruence coefficient is an index of the similarity between factors that have been derived in a factor analysis. It was introduced in 1948 by Cyril Burt who referred to it as ''unadjusted correlation''. It is also c ...
* Connect (computer system) * Consensus clustering * Constrained clustering *
Constrained conditional model A constrained conditional model (CCM) is a machine learning and inference framework that augments the learning of conditional (probabilistic or discriminative) models with declarative constraints. The constraint can be used as a way to incorporate e ...
* Constructive cooperative coevolution * Correlation clustering *
Correspondence analysis Correspondence analysis (CA) is a multivariate statistical technique proposed by Herman Otto Hartley (Hirschfeld) and later developed by Jean-Paul Benzécri. It is conceptually similar to principal component analysis, but applies to categorical rath ...
*
Cortica Headquartered in Tel Aviv Cortica utilizes unsupervised learning methods to recognize and analyze digital images and video. The technology developed by the Cortica team is based on research of the function of the human brain. Company Founding Co ...
* Coupled pattern learner * Cross-entropy method *
Cross-validation (statistics) Cross-validation, sometimes called rotation estimation or out-of-sample testing, is any of various similar model validation techniques for assessing how the results of a statistical analysis will generalize to an independent data set. Cross-va ...
* Crossover (genetic algorithm) *
Cuckoo search In operations research, cuckoo search is an optimization algorithm developed by Xin-She Yang and Suash Deb in 2009. It was inspired by the obligate brood parasitism of some cuckoo species by laying their eggs in the nests of host birds of other ...
* Cultural algorithm * Cultural consensus theory * Curse of dimensionality *
DADiSP DADiSP (Data Analysis and Display, pronounced day-disp) is a numerical computing environment developed by DSP Development Corporation which allows one to display and manipulate data series, matrices and images with an interface similar to a s ...
* DARPA LAGR Program * Darkforest *
Dartmouth workshop The Dartmouth Summer Research Project on Artificial Intelligence was a 1956 summer workshop widely consideredKline, Ronald R., Cybernetics, Automata Studies and the Dartmouth Conference on Artificial Intelligence, IEEE Annals of the History of ...
*
DarwinTunes DarwinTunes was a research project into the use of natural selection to create music led by Bob MacCallum and Armand Leroi, scientists at Imperial College London. The project asks volunteers on the Internet to listen to automatically generated so ...
* Data Mining Extensions *
Data exploration Data exploration is an approach similar to initial data analysis, whereby a data analyst uses visual exploration to understand what is in a dataset and the characteristics of the data, rather than through traditional data management systems.
*
Data pre-processing Data preprocessing can refer to manipulation or dropping of data before it is used in order to ensure or enhance performance, and is an important step in the data mining process. The phrase "garbage in, garbage out" is particularly applicable to ...
*
Data stream clustering In computer science, data stream clustering is defined as the clustering of data that arrive continuously such as telephone records, multimedia data, financial transactions etc. Data stream clustering is usually studied as a streaming algorithm an ...
* Dataiku *
Davies–Bouldin index The Davies–Bouldin index (DBI), introduced by David L. Davies and Donald W. Bouldin in 1979, is a metric for evaluating clustering algorithms. This is an internal evaluation scheme, where the validation of how well the clustering has been ...
*
Decision boundary __NOTOC__ In a statistical-classification problem with two classes, a decision boundary or decision surface is a hypersurface that partitions the underlying vector space into two sets, one for each class. The classifier will classify all the point ...
* Decision list *
Decision tree model In computational complexity the decision tree model is the model of computation in which an algorithm is considered to be basically a decision tree, i.e., a sequence of ''queries'' or ''tests'' that are done adaptively, so the outcome of the prev ...
*
Deductive classifier A deductive classifier is a type of artificial intelligence inference engine. It takes as input a set of declarations in a frame language about a domain such as medical research or molecular biology. For example, the names of classes, sub-classes ...
* DeepArt *
DeepDream DeepDream is a computer vision program created by Google engineer Alexander Mordvintsev that uses a convolutional neural network to find and enhance patterns in images via algorithmic pareidolia, thus creating a dream-like appearance reminiscent ...
* Deep Web Technologies * Defining length *
Dendrogram A dendrogram is a diagram representing a tree. This diagrammatic representation is frequently used in different contexts: * in hierarchical clustering, it illustrates the arrangement of the clusters produced by the corresponding analyses. ...
* Dependability state model * Detailed balance *
Determining the number of clusters in a data set Determining the number of clusters in a data set, a quantity often labelled ''k'' as in the ''k''-means algorithm, is a frequent problem in data clustering, and is a distinct issue from the process of actually solving the clustering problem. For a ...
* Detrended correspondence analysis * Developmental robotics * Diffbot * Differential evolution * Discrete phase-type distribution * Discriminative model * Dissociated press * Distributed R * Dlib * Document classification * Documenting Hate * Domain adaptation * Doubly stochastic model * Dual-phase evolution * Dunn index * Dynamic Bayesian network * Dynamic Markov compression * Dynamic topic model * Dynamic unobserved effects model * EDLUT * ELKI * Edge recombination operator * Effective fitness * Elastic map * Elastic matching * Elbow method (clustering) * Emergent (software) * Encog * Entropy rate * Erkki Oja * Eurisko * European Conference on Artificial Intelligence * Evaluation of binary classifiers * Evolution strategy * Evolution window * Evolutionary Algorithm for Landmark Detection * Evolutionary algorithm * Evolutionary art * Evolutionary music * Evolutionary programming * Evolvability (computer science) * Evolved antenna * Evolver (software) * Evolving classification function * Expectation propagation * Exploratory factor analysis * F1 score * FLAME clustering * Factor analysis of mixed data * Factor graph * Factor regression model * Factored language model * Farthest-first traversal * Fast-and-frugal trees * Feature Selection Toolbox * Feature hashing * Feature scaling * Feature vector * Firefly algorithm * First-difference estimator * First-order inductive learner * Fish School Search * Fisher kernel * Fitness approximation * Fitness function * Fitness proportionate selection * Fluentd * Folding@home * Formal concept analysis * Forward algorithm * Fowlkes–Mallows index * Frederick Jelinek * Frrole * Functional principal component analysis * GATTO * GLIMMER * Gary Bryce Fogel * Gaussian adaptation * Gaussian process * Gaussian process emulator * Gene prediction * General Architecture for Text Engineering * Generalization error * Generalized canonical correlation * Generalized filtering * Generalized iterative scaling * Generalized multidimensional scaling * Generative adversarial network * Generative model * Genetic algorithm * Genetic algorithm scheduling * Genetic algorithms in economics * Genetic fuzzy systems * Genetic memory (computer science) * Genetic operator * Genetic programming * Genetic representation * Geographical cluster * Gesture Description Language * Geworkbench * Glossary of artificial intelligence * Glottochronology * Golem (ILP) * Google matrix * Grafting (decision trees) * Gramian matrix * Grammatical evolution * Granular computing * GraphLab * Graph kernel * Gremlin (programming language) * Growth function * HUMANT (HUManoid ANT) algorithm * Hammersley–Clifford theorem * Harmony search * Hebbian theory * Hidden Markov random field * Hidden semi-Markov model *
Hierarchical hidden Markov model The hierarchical hidden Markov model (HHMM) is a statistical model derived from the hidden Markov model (HMM). In an HHMM, each state is considered to be a self-contained probabilistic model. More precisely, each state of the HHMM is itself an HHMM ...
* Higher-order factor analysis * Highway network * Hinge loss * Holland's schema theorem * Hopkins statistic * Hoshen–Kopelman algorithm * Huber loss * IRCF360 *
Ian Goodfellow Ian J. Goodfellow (born ) is a computer scientist, engineer, and executive, most noted for his work on artificial neural networks and deep learning. He was previously employed as a research scientist at Google Brain and director of machine learn ...
* Ilastik * Ilya Sutskever * Immunocomputing * Imperialist competitive algorithm * Inauthentic text * Incremental decision tree * Induction of regular languages * Inductive bias * Inductive probability * Inductive programming * Influence diagram * Information Harvesting * Information fuzzy networks * Information gain in decision trees * Information gain ratio * Inheritance (genetic algorithm) * Instance selection * Intel RealSense * Interacting particle system * Interactive machine translation * International Joint Conference on Artificial Intelligence * International Meeting on Computational Intelligence Methods for Bioinformatics and Biostatistics * International Semantic Web Conference * Iris flower data set * Island algorithm * Isotropic position * Item response theory * Iterative Viterbi decoding * JOONE * Jabberwacky * Jaccard index * Jackknife variance estimates for random forest * Java Grammatical Evolution * Joseph Nechvatal * Jubatus * Julia (programming language) * Junction tree algorithm * K-SVD * K-means++ * K-medians clustering * K-medoids * KNIME * KXEN Inc. * K q-flats * Kaggle * Kalman filter * Katz's back-off model * Kernel adaptive filter * Kernel density estimation * Kernel eigenvoice * Kernel embedding of distributions * Kernel method * Kernel perceptron * Kernel random forest * Kinect * Klaus-Robert Müller * Kneser–Ney smoothing * Knowledge Vault * Knowledge integration * LIBSVM * LPBoost * Labeled data * LanguageWare * Language identification in the limit * Language model * Large margin nearest neighbor * Latent Dirichlet allocation * Latent class model * Latent semantic analysis * Latent variable * Latent variable model * Lattice Miner * Layered hidden Markov model * Learnable function class * Least squares support vector machine * Leave-one-out error * Leslie P. Kaelbling * Linear genetic programming * Linear predictor function * Linear separability * Lingyun Gu * Linkurious * Lior Ron (business executive) * List of genetic algorithm applications * List of metaphor-based metaheuristics * List of text mining software * Local case-control sampling * Local independence * Local tangent space alignment * Locality-sensitive hashing * Log-linear model * Logistic model tree * Low-rank approximation * Low-rank matrix approximations * MATLAB * MIMIC (immunology) * MXNet * Mallet (software project) * Manifold regularization * Margin-infused relaxed algorithm * Margin classifier * Mark V. Shaney * Massive Online Analysis * Matrix regularization * Matthews correlation coefficient * Mean shift *
Mean squared error In statistics, the mean squared error (MSE) or mean squared deviation (MSD) of an estimator (of a procedure for estimating an unobserved quantity) measures the average of the squares of the errors—that is, the average squared difference between ...
* Mean squared prediction error * Measurement invariance * Medoid * MeeMix * Melomics * Memetic algorithm * Meta-optimization * Mexican International Conference on Artificial Intelligence * Michael Kearns (computer scientist) * MinHash * Mixture model * Mlpy * Models of DNA evolution * Moral graph * Mountain car problem * Movidius *
Multi-armed bandit In probability theory and machine learning, the multi-armed bandit problem (sometimes called the ''K''- or ''N''-armed bandit problem) is a problem in which a fixed limited set of resources must be allocated between competing (alternative) choices ...
*
Multi-label classification In machine learning, multi-label classification or multi-output classification is a variant of the classification problem where multiple nonexclusive labels may be assigned to each instance. Multi-label classification is a generalization of mult ...
* Multi expression programming * Multiclass classification * Multidimensional analysis * Multifactor dimensionality reduction * Multilinear principal component analysis * Multiple correspondence analysis * Multiple discriminant analysis * Multiple factor analysis * Multiple sequence alignment * Multiplicative weight update method * Multispectral pattern recognition * Mutation (genetic algorithm) * MysteryVibe * N-gram * NOMINATE (scaling method) * Native-language identification * Natural Language Toolkit * Natural evolution strategy * Nearest-neighbor chain algorithm * Nearest centroid classifier * Nearest neighbor search * Neighbor joining * Nest Labs * NetMiner * NetOwl * Neural Designer * Neural Engineering Object * Neural Lab * Neural modeling fields * Neural network software * NeuroSolutions * Neuro Laboratory * Neuroevolution * Neuroph * Niki.ai * Noisy channel model * Noisy text analytics * Nonlinear dimensionality reduction * Novelty detection * Nuisance variable * One-class classification * Onnx * OpenNLP * Optimal discriminant analysis * Oracle Data Mining * Orange (software) * Ordination (statistics) * Overfitting * PROGOL * PSIPRED * Pachinko allocation * PageRank * Parallel metaheuristic * Parity benchmark * Part-of-speech tagging * Particle swarm optimization * Path dependence * Pattern language (formal languages) * Peltarion Synapse * Perplexity * Persian Speech Corpus * Picas (app) * Pietro Perona * Pipeline Pilot * Piranha (software) * Pitman–Yor process * Plate notation * Polynomial kernel * Pop music automation * Population process * Portable Format for Analytics * Predictive Model Markup Language * Predictive state representation * Preference regression * Premature convergence * Principal geodesic analysis * Prior knowledge for pattern recognition * Prisma (app) * Probabilistic Action Cores * Probabilistic context-free grammar * Probabilistic latent semantic analysis * Probabilistic soft logic * Probability matching * Probit model * Product of experts * Programming with Big Data in R * Proper generalized decomposition * Pruning (decision trees) * Pushpak Bhattacharyya * Q methodology * Qloo * Quality control and genetic algorithms * Quantum Artificial Intelligence Lab * Queueing theory * Quick, Draw! * R (programming language) * Rada Mihalcea * Rademacher complexity * Radial basis function kernel * Rand index * Random indexing * Random projection * Random subspace method * Ranking SVM * RapidMiner * Rattle GUI * Raymond Cattell * Reasoning system * Regularization perspectives on support vector machines * Relational data mining * Relationship square * Relevance vector machine * Relief (feature selection) * Renjin * Repertory grid * Representer theorem * Reward-based selection * Richard Zemel * Right to explanation * RoboEarth * Robust principal component analysis * RuleML Symposium * Rule induction * Rules extraction system family * SAS (software) * SNNS * SPSS Modeler * SUBCLU * Sample complexity * Sample exclusion dimension * Santa Fe Trail problem * Savi Technology * Schema (genetic algorithms) * Search-based software engineering * Selection (genetic algorithm) * Self-Service Semantic Suite * Semantic folding * Semantic mapping (statistics) * Semidefinite embedding * Sense Networks * Sensorium Project * Sequence labeling * Sequential minimal optimization * Shattered set * Shogun (toolbox) * Silhouette (clustering) * SimHash * SimRank * Similarity measure * Simple matching coefficient * Simultaneous localization and mapping * Sinkov statistic * Sliced inverse regression * Snakes and Ladders * Soft independent modelling of class analogies * Soft output Viterbi algorithm * Solomonoff's theory of inductive inference * SolveIT Software * Spectral clustering * Spike-and-slab variable selection * Statistical machine translation * Statistical parsing * Statistical semantics * Stefano Soatto * Stephen Wolfram * Stochastic block model * Stochastic cellular automaton * Stochastic diffusion search * Stochastic grammar * Stochastic matrix * Stochastic universal sampling * Stress majorization * String kernel * Structural equation modeling * Structural risk minimization * Structured sparsity regularization * Structured support vector machine * Subclass reachability * Sufficient dimension reduction * Sukhotin's algorithm * Sum of absolute differences * Sum of absolute transformed differences * Swarm intelligence * Switching Kalman filter * Symbolic regression * Synchronous context-free grammar * Syntactic pattern recognition * TD-Gammon * TIMIT * Teaching dimension * Teuvo Kohonen * Textual case-based reasoning * Theory of conjoint measurement * Thomas G. Dietterich * Thurstonian model * Topic model * Tournament selection * Training, test, and validation sets * Transiogram * Trax Image Recognition * Trigram tagger * Truncation selection * Tucker decomposition * UIMA * UPGMA * Ugly duckling theorem * Uncertain data * Uniform convergence in probability * Unique negative dimension * Universal portfolio algorithm * User behavior analytics * VC dimension * VIGRA * Validation set * Vapnik–Chervonenkis theory * Variable-order Bayesian network * Variable kernel density estimation * Variable rules analysis * Variational message passing * Varimax rotation * Vector quantization * Vicarious (company) * Viterbi algorithm * Vowpal Wabbit * WACA clustering algorithm * WPGMA * Ward's method * Weasel program * Whitening transformation * Winnow (algorithm) * Win–stay, lose–switch * Witness set * Wolfram Language * Wolfram Mathematica * Writer invariant * Xgboost * Yooreeka * Zeroth (software)


Further reading

* Trevor Hastie, Robert Tibshirani and
Jerome H. Friedman Jerome Harold Friedman (born December 29, 1939) is an American statistician, consultant and Professor of Statistics at Stanford University, known for his contributions in the field of statistics and data mining.
(2001).
The Elements of Statistical Learning
', Springer. . *
Pedro Domingos Pedro Domingos is a Professor Emeritus of computer science and engineering at the University of Washington. He is a researcher in machine learning known for Markov logic network enabling uncertain inference. Education Domingos received an un ...
(September 2015), The Master Algorithm, Basic Books, * Mehryar Mohri, Afshin Rostamizadeh, Ameet Talwalkar (2012).
Foundations of Machine Learning
', The MIT Press. . * Ian H. Witten and Eibe Frank (2011). ''Data Mining: Practical machine learning tools and techniques'' Morgan Kaufmann, 664pp., . * David J. C. MacKay.
Information Theory, Inference, and Learning Algorithms
' Cambridge: Cambridge University Press, 2003. * Richard O. Duda, Peter E. Hart, David G. Stork (2001) ''Pattern classification'' (2nd edition), Wiley, New York, . * Christopher Bishop (1995). ''Neural Networks for Pattern Recognition'', Oxford University Press. . *
Vladimir Vapnik Vladimir Naumovich Vapnik (russian: Владимир Наумович Вапник; born 6 December 1936) is one of the main developers of the Vapnik–Chervonenkis theory of statistical learning, and the co-inventor of the support-vector machin ...
(1998). ''Statistical Learning Theory''. Wiley-Interscience, . * Ray Solomonoff, ''An Inductive Inference Machine'', IRE Convention Record, Section on Information Theory, Part 2, pp., 56–62, 1957. * Ray Solomonoff,
An Inductive Inference Machine
A privately circulated report from the 1956 Dartmouth Conferences, Dartmouth Summer Research Conference on AI.


References


External links


Data Science: Data to Insights from MIT (machine learning)
* Popular online course by
Andrew Ng Andrew Yan-Tak Ng (; born 1976) is a British-born American computer scientist and technology entrepreneur focusing on machine learning and AI. Ng was a co-founder and head of Google Brain and was the former Chief Scientist at Baidu, buildin ...
, a
Coursera
It uses GNU Octave. The course is a free version of Stanford University's actual course taught by Ng, see.stanford.edu/Course/CS229 available for free].
mloss
is an academic database of open-source machine learning software. {{Outline footer Outlines of applied sciences, Machine learning Wikipedia outlines, Machine learning Computing-related lists Machine learning, * Data mining, Machine learning