The following outline is provided as an overview of and topical guide to machine learning.

Machine learning (ML) is a subfield of soft computing within computer science that evolved from the study of pattern recognition and computational learning theory in artificial intelligence (http://www.britannica.com/EBchecked/topic/1116194/machine-learning). In 1959, Arthur Samuel defined machine learning as a "field of study that gives computers the ability to learn without being explicitly programmed". Machine learning explores the study and construction of algorithms that can learn from and make predictions on data. Such algorithms operate by building a model from an example training set of input observations in order to make data-driven predictions or decisions expressed as outputs, rather than following strictly static program instructions.
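The idea of building a model from a training set and then using it to predict can be sketched in a few lines. The example below is only an illustration: it fits a straight line to four training pairs by ordinary least squares and predicts the output for an unseen input. The function names `fit_line` and `predict` and the toy data are assumptions made for this sketch and do not come from the outline.

```python
# Illustrative sketch only: a one-variable least-squares fit written from scratch.
# fit_line, predict and the toy data are hypothetical names/values for this example.

def fit_line(xs, ys):
    """Learn a slope and intercept from training pairs by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned model to a new input observation."""
    slope, intercept = model
    return slope * x + intercept


# Training set: input observations and their known outputs (generated by y = 2x + 1).
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.0, 5.0, 7.0, 9.0]

model = fit_line(train_x, train_y)   # the "learning" step
prediction = predict(model, 10.0)    # a data-driven prediction for an unseen input
print(model, prediction)
```

Nothing in the program states the rule y = 2x + 1 explicitly; the slope and intercept are recovered from the examples, which is the contrast with static program instructions drawn above.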

What ''type'' of thing is machine learning?

* An academic discipline
* A branch of science
** An applied science
*** A subfield of computer science
**** A branch of artificial intelligence
**** A subfield of soft computing
*** Application of statistics

Branches of machine learning

Subfields of machine learning

* Computational learning theory – studying the design and analysis of machine learning algorithms.
* Grammar induction
* Meta-learning

Cross-disciplinary fields involving machine learning

* Adversarial machine learning
* Predictive analytics
* Quantum machine learning
* Robot learning
** Developmental robotics

Applications of machine learning

* Applications of machine learning
* Bioinformatics
* Biomedical informatics
* Computer vision
* Customer relationship management
* Data mining
* Earth sciences
* Email filtering
* Inverted pendulum – balance and equilibrium system.
* Natural language processing (NLP)
** Named Entity Recognition
** Automatic summarization
** Automatic taxonomy construction
** Dialog system
** Grammar checker
** Language recognition
*** Handwriting recognition
*** Optical character recognition
*** Speech recognition
**** Text to Speech Synthesis (TTS)
**** Speech Emotion Recognition (SER)
** Machine translation
** Question answering
** Speech synthesis
** Text mining
*** Term frequency–inverse document frequency (tf–idf)
** Text simplification
* Pattern recognition
** Facial recognition system
** Image recognition
* Recommendation system
** Collaborative filtering
** Content-based filtering
** Hybrid recommender systems (Collaborative and content-based filtering)
* Search engine
** Search engine optimization
* Social Engineering

Machine learning hardware

* Graphics processing unit
* Tensor processing unit
* Vision processing unit

Machine learning tools

* Comparison of deep learning software

Machine learning frameworks

Proprietary machine learning frameworks

* Amazon Machine Learning
* Microsoft Azure Machine Learning Studio
* DistBelief – replaced by TensorFlow

Open source machine learning frameworks

* Apache Singa
* Apache MXNet
* Caffe
* PyTorch
* mlpack
* TensorFlow
* Torch
* CNTK
* Accord.Net
* Jax

Machine learning libraries

* Deeplearning4j
* Theano
* scikit-learn
* Keras

Machine learning algorithms

* Almeida–Pineda recurrent backpropagation
* ALOPEX
* Backpropagation
* Bootstrap aggregating
* CN2 algorithm
* Constructing skill trees
* Dehaene–Changeux model
* Diffusion map
* Dominance-based rough set approach
* Dynamic time warping
* Error-driven learning
* Evolutionary multimodal optimization
* Expectation–maximization algorithm
* FastICA
* Forward–backward algorithm
* GeneRec
* Genetic Algorithm for Rule Set Production
* Growing self-organizing map
* Hyper basis function network
* IDistance
* K-nearest neighbors algorithm
* Kernel methods for vector output
* Kernel principal component analysis
* Leabra
* Linde–Buzo–Gray algorithm
* Local outlier factor
* Logic learning machine
* LogitBoost
* Manifold alignment
* Markov chain Monte Carlo (MCMC)
* Minimum redundancy feature selection
* Mixture of experts
* Multiple kernel learning
* Non-negative matrix factorization
* Online machine learning
* Out-of-bag error
* Prefrontal cortex basal ganglia working memory
* PVLV
* Q-learning
* Quadratic unconstrained binary optimization
* Query-level feature
* Quickprop
* Radial basis function network
* Randomized weighted majority algorithm
* Reinforcement learning
* Repeated incremental pruning to produce error reduction (RIPPER)
* Rprop
* Rule-based machine learning
* Skill chaining
* Sparse PCA
* State–action–reward–state–action
* Stochastic gradient descent
* Structured kNN
* T-distributed stochastic neighbor embedding
* Temporal difference learning
* Wake-sleep algorithm
* Weighted majority algorithm (machine learning)

Machine learning methods

Instance-based algorithm

* K-nearest neighbors algorithm (KNN)
* Learning vector quantization (LVQ)
* Self-organizing map (SOM)

Regression analysis

* Logistic regression
* Ordinary least squares regression (OLSR)
* Linear regression
* Stepwise regression
* Multivariate adaptive regression splines (MARS)
* Regularization algorithm
** Ridge regression
** Least Absolute Shrinkage and Selection Operator (LASSO)
** Elastic net
** Least-angle regression (LARS)
* Classifiers
** Probabilistic classifier
*** Naive Bayes classifier
** Binary classifier
** Linear classifier
** Hierarchical classifier

Dimensionality reduction

* Canonical correlation analysis (CCA)
* Factor analysis
* Feature extraction
* Feature selection
* Independent component analysis (ICA)
* Linear discriminant analysis (LDA)
* Multidimensional scaling (MDS)
* Non-negative matrix factorization (NMF)
* Partial least squares regression (PLSR)
* Principal component analysis (PCA)
* Principal component regression (PCR)
* Projection pursuit
* Sammon mapping
* t-distributed stochastic neighbor embedding (t-SNE)

Ensemble learning

* AdaBoost
* Boosting
* Bootstrap aggregating (Bagging)
* Ensemble averaging – process of creating multiple models and combining them to produce a desired output, as opposed to creating just one model. Frequently an ensemble of models performs better than any individual model, because the various errors of the models "average out."
* Gradient boosted decision tree (GBDT)
* Gradient boosting machine (GBM)
* Random Forest
* Stacked Generalization (blending)
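The "average out" effect described for ensemble averaging can be illustrated with a small simulation. This is a sketch under a strong assumption: each "model" is simulated as the true value plus independent Gaussian noise, so no real learner is trained. The names `true_value`, `predictions` and `mse`, the number of models and the noise level are all hypothetical choices for the example.

```python
import random

# Illustrative sketch only. Each "model" is simulated as the true function plus
# independent Gaussian error; this is an assumption, not a trained learner.
random.seed(0)  # fixed seed so the run is repeatable


def true_value(x):
    """The quantity every simulated model is trying to predict."""
    return 2.0 * x + 1.0


xs = [i / 10 for i in range(200)]
n_models = 25

# predictions[m][i] is model m's prediction for input xs[i].
predictions = [
    [true_value(x) + random.gauss(0.0, 1.0) for x in xs]
    for _ in range(n_models)
]


def mse(preds):
    """Mean squared error of a list of predictions against the true values."""
    return sum((p - true_value(x)) ** 2 for p, x in zip(preds, xs)) / len(xs)


individual_mse = [mse(p) for p in predictions]

# Ensemble averaging: combine the models by taking the mean prediction per input.
ensemble = [sum(column) / n_models for column in zip(*predictions)]
ensemble_mse = mse(ensemble)

print("mean individual MSE:", sum(individual_mse) / n_models)
print("best individual MSE:", min(individual_mse))
print("ensemble MSE:       ", ensemble_mse)
```

With independent errors, the averaged prediction has a much smaller error than the typical single model. Real models trained on the same data tend to make correlated errors, so the improvement in practice is usually smaller than in this idealized setting.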

Meta-learning

* Inductive bias
* Metadata

Reinforcement learning

* Q-learning
* State–action–reward–state–action (SARSA)
* Temporal difference learning (TD)
* Learning Automata

Supervised learning

* Averaged one-dependence estimators (AODE)
* Artificial neural network
* Case-based reasoning
* Gaussian process regression
* Gene expression programming
* Group method of data handling (GMDH)
* Inductive logic programming
* Instance-based learning
* Lazy learning
* Learning Vector Quantization
* Logistic Model Tree
* Minimum message length (decision trees, decision graphs, etc.)
** Nearest Neighbor Algorithm
** Analogical modeling
* Probably approximately correct (PAC) learning
* Ripple down rules, a knowledge acquisition methodology
* Symbolic machine learning algorithms
* Support vector machines
* Random Forests
* Ensembles of classifiers
** Bootstrap aggregating (bagging)
** Boosting (meta-algorithm)
* Ordinal classification
* Information fuzzy networks (IFN)
* Conditional Random Field
* ANOVA
* Quadratic classifiers
* k-nearest neighbor
* Boosting
** SPRINT
* Bayesian networks
** Naive Bayes
* Hidden Markov models
** Hierarchical hidden Markov model

Bayesian

Bayesian statistics Bayesian statistics is a theory in the field of statistics based on the Bayesian interpretation of probability where probability expresses a ''degree of belief'' in an event. The degree of belief may be based on prior knowledge about the event, ...

* Bayesian knowledge base * Naive Bayes *

Gaussian Naive Bayes In statistics, naive Bayes classifiers are a family of simple "probabilistic classifiers" based on applying Bayes' theorem with strong (naive) independence assumptions between the features (see Bayes classifier). They are among the simplest Baye ...

* Multinomial Naive Bayes * Averaged One-Dependence Estimators (AODE) * Bayesian Belief Network (BBN) *

Bayesian Network A Bayesian network (also known as a Bayes network, Bayes net, belief network, or decision network) is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). Ba ...

(BN)

Decision tree algorithms

Decision tree algorithm *

Decision tree A decision tree is a decision support tool that uses a tree-like model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. It is one way to display an algorithm that only contains con ...

* Classification and regression tree (CART) * Iterative Dichotomiser 3 (ID3) * C4.5 algorithm * C5.0 algorithm * Chi-squared Automatic Interaction Detection (CHAID) * Decision stump * Conditional decision tree *

ID3 algorithm In decision tree learning, ID3 (Iterative Dichotomiser 3) is an algorithm invented by Ross Quinlan (Quinlan, J. R. 1986. Induction of Decision Trees. Mach. Learn. 1, 1 (Mar. 1986), 81–106) used to generate a decision tree from a dataset. ID3 is th ...

Random forest Random forests or random decision forests is an ensemble learning method for classification, regression and other tasks that operates by constructing a multitude of decision trees at training time. For classification tasks, the output of ...

* SLIQ
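
The decision tree learners listed above (ID3 and its successors) grow a tree by repeatedly picking an attribute to split on. ID3 scores candidate attributes by information gain, the drop in label entropy after the split. The following is a minimal sketch of that scoring step only, not a full tree builder; the function names and the toy weather data are illustrative assumptions, not taken from any particular library.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy of the labels minus the weighted entropy after splitting on `attribute`."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """Attribute with the highest information gain (the greedy split choice)."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Hypothetical toy data: "outlook" separates the classes perfectly, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rainy", "windy": False},
    {"outlook": "rainy", "windy": True},
]
labels = ["no", "no", "yes", "yes"]
chosen = best_attribute(rows, labels, ["outlook", "windy"])
```

Here `chosen` is `"outlook"`, because splitting on it leaves both branches pure. A full learner would recurse on each branch with the remaining attributes.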

Linear classifier

* Fisher's linear discriminant *

Multinomial logistic regression In statistics, multinomial logistic regression is a classification method that generalizes logistic regression to multiclass problems, i.e. with more than two possible discrete outcomes. That is, it is a model that is used to predict the prob ...

* Perceptron *

Unsupervised learning

Unsupervised learning * Expectation-maximization algorithm *

Vector Quantization Vector quantization (VQ) is a classical quantization technique from signal processing that allows the modeling of probability density functions by the distribution of prototype vectors. It was originally used for data compression. It works by di ...

* Generative topographic map *

Information bottleneck method The information bottleneck method is a technique in information theory introduced by Naftali Tishby, Fernando C. Pereira, and William Bialek. It is designed for finding the best tradeoff between accuracy and complexity (compression) when summarizi ...

Association rule learning Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using some measures of interestingness.Pi ...

algorithms **

Apriori algorithm Apriori (Rakesh Agrawal and Ramakrishnan Srikant. Fast algorithms for mining association rules. Proceedings of the 20th International Conference on Very Large Data Bases, VLDB, pages 487-499, Santiago, Chile, September 1994) is an algorithm for frequent ...

Eclat algorithm

Artificial neural networks

Feedforward neural network A feedforward neural network (FNN) is an artificial neural network wherein connections between the nodes do ''not'' form a cycle. As such, it is different from its descendant: recurrent neural networks. The feedforward neural network was the ...

Extreme learning machine Extreme learning machines are feedforward neural networks for classification, regression, clustering, sparse approximation, compression and feature learning with a single layer or multiple layers of hidden nodes, where the parameters of hidden n ...

Convolutional neural network In deep learning, a convolutional neural network (CNN, or ConvNet) is a class of artificial neural network (ANN), most commonly applied to analyze visual imagery. CNNs are also known as Shift Invariant or Space Invariant Artificial Neural Netwo ...

* Recurrent neural network ** Long short-term memory (LSTM) * Logic learning machine * Self-organizing map

Association rule learning

* FP-growth algorithm

Hierarchical clustering

Hierarchical clustering In data mining and statistics, hierarchical clustering (also called hierarchical cluster analysis or HCA) is a method of cluster analysis that seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into t ...

* Single-linkage clustering *

Conceptual clustering Conceptual clustering is a machine learning paradigm for unsupervised classification that has been defined by Ryszard S. Michalski in 1980 (Fisher 1987, Michalski 1980) and developed mainly during the 1980s. It is distinguished from ordinary dat ...

Cluster analysis

Cluster analysis Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters). It is a main task of ...

BIRCH BIRCH (balanced iterative reducing and clustering using hierarchies) is an unsupervised data mining algorithm used to perform hierarchical clustering over particularly large data sets.

* DBSCAN * Expectation-maximization (EM) *

Fuzzy clustering Fuzzy clustering (also referred to as soft clustering or soft ''k''-means) is a form of clustering in which each data point can belong to more than one cluster. Clustering or cluster analysis involves assigning data points to clusters such that i ...

Hierarchical Clustering In data mining and statistics, hierarchical clustering (also called hierarchical cluster analysis or HCA) is a method of cluster analysis that seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into t ...

K-means clustering ''k''-means clustering is a method of vector quantization, originally from signal processing, that aims to partition ''n'' observations into ''k'' clusters in which each observation belongs to the cluster with the nearest mean (cluster centers ...

* K-medians * Mean-shift * OPTICS algorithm
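
As noted in the k-means entry above, the method partitions observations so that each belongs to the cluster with the nearest mean. A minimal sketch of the standard iteration (often called Lloyd's algorithm) follows. It assumes points are equal-length numeric tuples and that the caller supplies the starting centers; the function name `kmeans` and the sample data are illustrative only.

```python
def kmeans(points, initial_centers, max_iter=100):
    """Alternate assignment and mean-update steps until labels stop changing."""
    centers = [tuple(c) for c in initial_centers]
    k = len(centers)
    labels = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest center by squared Euclidean distance.
        new_labels = [
            min(range(k), key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # no point changed cluster, so the procedure has converged
        labels = new_labels
        # Update step: move each center to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, labels


# Hypothetical data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
centers, labels = kmeans(points, initial_centers=[points[0], points[3]])
```

With these starting centers the first three points end up in cluster 0 and the last three in cluster 1. The result depends on initialization in general, which is why seeding schemes such as k-means++ are commonly used in practice.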

Anomaly detection

Anomaly detection * ''k''-nearest neighbors algorithm (''k''-NN) *

Semi-supervised learning

Semi-supervised learning *

Active learning

– special case of semi-supervised learning in which a learning algorithm is able to interactively query the user (or some other information source) to obtain the desired outputs at new data points. * Generative models * Low-density separation * Graph-based methods *

Co-training Co-training is a machine learning algorithm used when there are only small amounts of labeled data and large amounts of unlabeled data. One of its uses is in text mining for search engines. It was introduced by Avrim Blum and Tom Mitchell in 1998. ...

* Transduction

Deep learning

Deep learning Deep learning (also known as deep structured learning) is part of a broader family of machine learning methods based on artificial neural networks with representation learning. Learning can be supervised, semi-supervised or unsupervised. ...

* Deep belief networks * Deep

Boltzmann machine A Boltzmann machine (also called Sherrington–Kirkpatrick model with external field or stochastic Ising–Lenz–Little model) is a stochastic spin-glass model with an external field, i.e., a Sherrington–Kirkpatrick model, that is a stochastic ...

s * Deep

s * Deep Recurrent neural networks *

Hierarchical temporal memory Hierarchical temporal memory (HTM) is a biologically constrained machine intelligence technology developed by Numenta. Originally described in the 2004 book ''On Intelligence'' by Jeff Hawkins with Sandra Blakeslee, HTM is primarily used today for ...

* Generative Adversarial Network ** Style transfer *

Transformer A transformer is a deep learning architecture built around the attention mechanism, introduced in 2017 for sequence tasks such as machine translation.

* Stacked Auto-Encoders

Machine learning research

* List of artificial intelligence projects *

List of datasets for machine learning research These datasets are applied for machine learning research and have been cited in peer-reviewed academic journals. Datasets are an integral part of the field of machine learning. Major advances in this field can result from advances in learning a ...

History of machine learning

History of machine learning * Timeline of machine learning

Machine learning projects

Machine learning projects *

DeepMind DeepMind Technologies is a British artificial intelligence subsidiary of Alphabet Inc. and research laboratory founded in 2010. DeepMind was acquired by Google in 2014 and became a wholly owned subsidiary of Alphabet Inc, after Google's restru ...

Google Brain Google Brain is a deep learning artificial intelligence research team under the umbrella of Google AI, a research division at Google dedicated to artificial intelligence. Formed in 2011, Google Brain combines open-ended machine learning research ...

OpenAI OpenAI is an artificial intelligence (AI) research laboratory consisting of the for-profit corporation OpenAI LP and its parent company, the non-profit OpenAI Inc. The company conducts research in the field of AI with the stated goal of promo ...

* Meta AI

Machine learning organizations

Machine learning organizations *

Knowledge Engineering and Machine Learning Group The Knowledge Engineering and Machine Learning group (KEMLg) is a research group belonging to the Technical University of Catalonia (UPC) – BarcelonaTech. It was founded by Prof. Ulises Cortés. The group has been active in the Artificial I ...

Machine learning conferences and workshops

* Artificial Intelligence and Security (AISec) (co-located workshop with CCS) *

Conference on Neural Information Processing Systems The Conference and Workshop on Neural Information Processing Systems (abbreviated as NeurIPS and formerly NIPS) is a machine learning and computational neuroscience conference held every December. The conference is currently a double-track meet ...

(NeurIPS, formerly NIPS) * ECML PKDD *

International Conference on Machine Learning The International Conference on Machine Learning (ICML) is the leading international academic conference in machine learning. Along with NeurIPS and ICLR, it is one of the three primary conferences of high impact in machine learning and artific ...

(ICML) * ML4ALL (Machine Learning For All)

Machine learning publications

Books on machine learning

Books about machine learning

Machine learning journals

* ''

Machine Learning

'' * ''

Journal of Machine Learning Research The ''Journal of Machine Learning Research'' is a peer-reviewed open access scientific journal covering machine learning. It was established in 2000 and the first editor-in-chief was Leslie Kaelbling. The current editors-in-chief are Francis Bac ...

'' (JMLR) * ''

Neural Computation''

Persons influential in machine learning

* Alberto Broggi * Andrei Knyazev *

Andrew McCallum Andrew McCallum is a professor in the computer science department at University of Massachusetts Amherst. His primary specialties are in machine learning, natural language processing, information extraction, information integration, and socia ...

Andrew Ng Andrew Yan-Tak Ng (born 1976) is a British-born American computer scientist and technology entrepreneur focusing on machine learning and AI. Ng was a co-founder and head of Google Brain and was the former Chief Scientist at Baidu, buildin ...

* Anuraag Jain * Armin B. Cremers * Ayanna Howard * Barney Pell * Ben Goertzel *

Ben Taskar Ben Taskar (March 3, 1977 – November 18, 2013) was a professor and researcher in the area of machine learning and applications to computational linguistics and computer vision. He was a Magerman Term Associate Professor for Computer and Infor ...

Bernhard Schölkopf Bernhard Schölkopf is a German computer scientist (born 20 February 1968) known for his work in machine learning, especially on kernel methods and causality. He is a director at the Max Planck Institute for Intelligent Systems in Tübingen, ...

* Brian D. Ripley * Christopher G. Atkeson * Corinna Cortes * Demis Hassabis * Douglas Lenat *

Eric Xing Eric Poe Xing is an American computer scientist, academic administrator, and entrepreneur. Prior to his appointment as President of MBZUAI, Xing was a professor in the School of Computer Science at Carnegie Mellon University and researcher in mac ...

Ernst Dickmanns Ernst Dieter Dickmanns is a German pioneer of dynamic computer vision and of driverless cars. Dickmanns has been a professor at Bundeswehr University Munich (1975–2001), and visiting professor to Caltech and to MIT, teaching courses on "dynami ...

* Geoffrey Hinton – co-inventor of the backpropagation and contrastive divergence training algorithms * Hans-Peter Kriegel * Hartmut Neven * Heikki Mannila *

Ian Goodfellow Ian J. Goodfellow is a computer scientist, engineer, and executive, most noted for his work on artificial neural networks and deep learning. He was previously employed as a research scientist at Google Brain and director of machine learn ...

– inventor of generative adversarial networks * Jacek M. Zurada *

Jaime Carbonell Jaime Guillermo Carbonell (July 29, 1953 – February 28, 2020) was a computer scientist who made seminal contributions to the development of natural language processing tools and technologies. His extensive research in machine translation result ...

* Jeremy Slovak *

Jerome H. Friedman Jerome Harold Friedman (born December 29, 1939) is an American statistician, consultant and Professor of Statistics at Stanford University, known for his contributions in the field of statistics and data mining.

John D. Lafferty John D. Lafferty is an American scientist, Professor at Yale University and leading researcher in machine learning. He is best known for proposing the Conditional Random Fields with Andrew McCallum and Fernando C.N. Pereira. Biography In 2017, ...

* John Platt – invented SMO and Platt scaling * Julie Beth Lovins *

Jürgen Schmidhuber Jürgen Schmidhuber (born 17 January 1963) is a German computer scientist most noted for his work in the field of artificial intelligence, deep learning and artificial neural networks. He is a co-director of the Dalle Molle Institute for Artific ...

* Karl Steinbuch *

Katia Sycara Ekaterini Panagiotou Sycara (Greek: Κάτια Συκαρά) is a Greek computer scientist. She is an Edward Fredkin Research Professor of Robotics in the Robotics Institute, School of Computer Science at Carnegie Mellon University internationally ...

Leo Breiman Leo Breiman (January 27, 1928 – July 5, 2005) was a distinguished statistician at the University of California, Berkeley. He was the recipient of numerous honors and awards, and was a member of the United States National Academy of Sciences ...

– invented bagging and random forests *

Lise Getoor Lise Getoor is a professor in the computer science department, at the University of California, Santa Cruz, and an adjunct professor in the Computer Science Department at the University of Maryland, College Park. Her primary research interests ...

Luca Maria Gambardella Luca Maria Gambardella (born 4 January 1962) is an Italian computer scientist and author. He is the former director of the Dalle Molle Institute for Artificial Intelligence Research in Manno, in the Ticino canton of Switzerland. With Marco Do ...

Léon Bottou Léon Bottou (born 1965) is a researcher best known for his work in machine learning and data compression. His work presents stochastic gradient descent as a fundamental learning algorithm. He is also one of the main creators of the DjVu image comp ...

* Marcus Hutter * Mehryar Mohri *

Michael Collins

Michael I. Jordan Michael Irwin Jordan (born February 25, 1956) is an American scientist, professor at the University of California, Berkeley and researcher in machine learning, statistics, and artificial intelligence. Jordan was elected a member of the Nat ...

* Michael L. Littman *

Nando de Freitas Nando de Freitas is a researcher in the field of machine learning, and in particular in the subfields of neural networks, Bayesian inference and Bayesian optimization, and deep learning. Biography De Freitas was born in Zimbabwe. He did his un ...

* Ofer Dekel *

Oren Etzioni Oren Etzioni (born 1964) is an American entrepreneur, Professor Emeritus of computer science, and founding CEO of the Allen Institute for Artificial Intelligence (AI2). On June 15, 2022, he announced that he will step down as CEO of AI2 effective ...

Pedro Domingos Pedro Domingos is a Professor Emeritus of computer science and engineering at the University of Washington. He is a researcher in machine learning known for Markov logic network enabling uncertain inference. Education Domingos received an un ...

* Peter Flach * Pierre Baldi * Pushmeet Kohli *

Ray Kurzweil Raymond Kurzweil (born February 12, 1948) is an American computer scientist, author, inventor, and futurist. He is involved in fields such as optical character recognition (OCR), text-to-speech synthesis, speech recognition technology, and e ...

* Rayid Ghani * Ross Quinlan * Salvatore J. Stolfo * Sebastian Thrun *

Selmer Bringsjord Selmer Bringsjord (born November 24, 1958) is the chair of the Department of Cognitive Science at Rensselaer Polytechnic Institute and a professor of Computer Science and Cognitive Science. He also holds an appointment in the Lally School of ...

Sepp Hochreiter Josef "Sepp" Hochreiter (born 14 February 1967) is a German computer scientist. Since 2018 he has led the Institute for Machine Learning at the Johannes Kepler University of Linz after having led the Institute of Bioinformatics from 2006 to 2018 ...

* Shane Legg *

Stephen Muggleton Stephen H. Muggleton FBCS, FIET, FAAAI, FECCAI, FSB, FREng (born 6 December 1959, son of Louis Muggleton) is Professor of Machine Learning and Head of the Computational Bioinformatics Laboratory at Imperial College London.

* Steve Omohundro * Tom M. Mitchell * Trevor Hastie *

Vasant Honavar Vasant G. Honavar is an Indian born American computer scientist, and artificial intelligence, machine learning, big data, data science, causal inference, knowledge representation, bioinformatics and health informatics researcher and professor. ...

Vladimir Vapnik Vladimir Naumovich Vapnik (Russian: Владимир Наумович Вапник; born 6 December 1936) is one of the main developers of the Vapnik–Chervonenkis theory of statistical learning, and the co-inventor of the support-vector machin ...

– co-inventor of the SVM and VC theory *

Yann LeCun Yann André LeCun (originally spelled Le Cun; born 8 July 1960) is a French computer scientist working primarily in the fields of machine learning, computer vision, mobile robotics and computational neuroscience. He is the Silver Professo ...

– invented convolutional neural networks * Yasuo Matsuyama *

Yoshua Bengio Yoshua Bengio (born March 5, 1964) is a Canadian computer scientist, most noted for his work on artificial neural networks and deep learning. He is a professor at the Department of Computer Science and Operations Research at the Université de ...

* Zoubin Ghahramani

Determining the number of clusters in a data set Determining the number of clusters in a data set, a quantity often labelled ''k'' as in the ''k''-means algorithm, is a frequent problem in data clustering, and is a distinct issue from the process of actually solving the clustering problem. For a ...

* Detrended correspondence analysis * Developmental robotics * Diffbot * Differential evolution * Discrete phase-type distribution * Discriminative model * Dissociated press * Distributed R * Dlib * Document classification * Documenting Hate * Domain adaptation * Doubly stochastic model * Dual-phase evolution * Dunn index * Dynamic Bayesian network * Dynamic Markov compression * Dynamic topic model * Dynamic unobserved effects model * EDLUT * ELKI * Edge recombination operator * Effective fitness * Elastic map * Elastic matching * Elbow method (clustering) * Emergent (software) * Encog * Entropy rate * Erkki Oja * Eurisko * European Conference on Artificial Intelligence * Evaluation of binary classifiers * Evolution strategy * Evolution window * Evolutionary Algorithm for Landmark Detection * Evolutionary algorithm * Evolutionary art * Evolutionary music * Evolutionary programming * Evolvability (computer science) * Evolved antenna * Evolver (software) * Evolving classification function * Expectation propagation * Exploratory factor analysis * F1 score * FLAME clustering * Factor analysis of mixed data * Factor graph * Factor regression model * Factored language model * Farthest-first traversal * Fast-and-frugal trees * Feature Selection Toolbox * Feature hashing * Feature scaling * Feature vector * Firefly algorithm * First-difference estimator * First-order inductive learner * Fish School Search * Fisher kernel * Fitness approximation * Fitness function * Fitness proportionate selection * Fluentd * Folding@home * Formal concept analysis * Forward algorithm * Fowlkes–Mallows index * Frederick Jelinek * Frrole * Functional principal component analysis * GATTO * GLIMMER * Gary Bryce Fogel * Gaussian adaptation * Gaussian process * Gaussian process emulator * Gene prediction * General Architecture for Text Engineering * Generalization error * Generalized canonical correlation * Generalized filtering * Generalized iterative scaling * Generalized multidimensional 
scaling * Generative adversarial network * Generative model * Genetic algorithm * Genetic algorithm scheduling * Genetic algorithms in economics * Genetic fuzzy systems * Genetic memory (computer science) * Genetic operator * Genetic programming * Genetic representation * Geographical cluster * Gesture Description Language * Geworkbench * Glossary of artificial intelligence * Glottochronology * Golem (ILP) * Google matrix * Grafting (decision trees) * Gramian matrix * Grammatical evolution * Granular computing * GraphLab * Graph kernel * Gremlin (programming language) * Growth function * HUMANT (HUManoid ANT) algorithm * Hammersley–Clifford theorem * Harmony search * Hebbian theory * Hidden Markov random field * Hidden semi-Markov model *

* Higher-order factor analysis * Highway network * Hinge loss * Holland's schema theorem * Hopkins statistic * Hoshen–Kopelman algorithm * Huber loss * IRCF360 *

* Ilastik * Ilya Sutskever * Immunocomputing * Imperialist competitive algorithm * Inauthentic text * Incremental decision tree * Induction of regular languages * Inductive bias * Inductive probability * Inductive programming * Influence diagram * Information Harvesting * Information fuzzy networks * Information gain in decision trees * Information gain ratio * Inheritance (genetic algorithm) * Instance selection * Intel RealSense * Interacting particle system * Interactive machine translation * International Joint Conference on Artificial Intelligence * International Meeting on Computational Intelligence Methods for Bioinformatics and Biostatistics * International Semantic Web Conference * Iris flower data set * Island algorithm * Isotropic position * Item response theory * Iterative Viterbi decoding * JOONE * Jabberwacky * Jaccard index * Jackknife variance estimates for random forest * Java Grammatical Evolution * Joseph Nechvatal * Jubatus * Julia (programming language) * Junction tree algorithm * K-SVD * K-means++ * K-medians clustering * K-medoids * KNIME * KXEN Inc. * K q-flats * Kaggle * Kalman filter * Katz's back-off model * Kernel adaptive filter * Kernel density estimation * Kernel eigenvoice * Kernel embedding of distributions * Kernel method * Kernel perceptron * Kernel random forest * Kinect * Klaus-Robert Müller * Kneser–Ney smoothing * Knowledge Vault * Knowledge integration * LIBSVM * LPBoost * Labeled data * LanguageWare * Language identification in the limit * Language model * Large margin nearest neighbor * Latent Dirichlet allocation * Latent class model * Latent semantic analysis * Latent variable * Latent variable model * Lattice Miner * Layered hidden Markov model * Learnable function class * Least squares support vector machine * Leave-one-out error * Leslie P. 
Kaelbling * Linear genetic programming * Linear predictor function * Linear separability * Lingyun Gu * Linkurious * Lior Ron (business executive) * List of genetic algorithm applications * List of metaphor-based metaheuristics * List of text mining software * Local case-control sampling * Local independence * Local tangent space alignment * Locality-sensitive hashing * Log-linear model * Logistic model tree * Low-rank approximation * Low-rank matrix approximations * MATLAB * MIMIC (immunology) * MXNet * Mallet (software project) * Manifold regularization * Margin-infused relaxed algorithm * Margin classifier * Mark V. Shaney * Massive Online Analysis * Matrix regularization * Matthews correlation coefficient * Mean shift *

* Mean squared prediction error * Measurement invariance * Medoid * MeeMix * Melomics * Memetic algorithm * Meta-optimization * Mexican International Conference on Artificial Intelligence * Michael Kearns (computer scientist) * MinHash * Mixture model * Mlpy * Models of DNA evolution * Moral graph * Mountain car problem * Movidius *

* Multi expression programming * Multiclass classification * Multidimensional analysis * Multifactor dimensionality reduction * Multilinear principal component analysis * Multiple correspondence analysis * Multiple discriminant analysis * Multiple factor analysis * Multiple sequence alignment * Multiplicative weight update method * Multispectral pattern recognition * Mutation (genetic algorithm) * MysteryVibe * N-gram * NOMINATE (scaling method) * Native-language identification * Natural Language Toolkit * Natural evolution strategy * Nearest-neighbor chain algorithm * Nearest centroid classifier * Nearest neighbor search * Neighbor joining * Nest Labs * NetMiner * NetOwl * Neural Designer * Neural Engineering Object * Neural Lab * Neural modeling fields * Neural network software * NeuroSolutions * Neuro Laboratory * Neuroevolution * Neuroph * Niki.ai * Noisy channel model * Noisy text analytics * Nonlinear dimensionality reduction * Novelty detection * Nuisance variable * One-class classification * Onnx * OpenNLP * Optimal discriminant analysis * Oracle Data Mining * Orange (software) * Ordination (statistics) * Overfitting * PROGOL * PSIPRED * Pachinko allocation * PageRank * Parallel metaheuristic * Parity benchmark * Part-of-speech tagging * Particle swarm optimization * Path dependence * Pattern language (formal languages) * Peltarion Synapse * Perplexity * Persian Speech Corpus * Picas (app) * Pietro Perona * Pipeline Pilot * Piranha (software) * Pitman–Yor process * Plate notation * Polynomial kernel * Pop music automation * Population process * Portable Format for Analytics * Predictive Model Markup Language * Predictive state representation * Preference regression * Premature convergence * Principal geodesic analysis * Prior knowledge for pattern recognition * Prisma (app) * Probabilistic Action Cores * Probabilistic context-free grammar * Probabilistic latent semantic analysis * Probabilistic soft logic * Probability matching * Probit model * Product of 
experts * Programming with Big Data in R * Proper generalized decomposition * Pruning (decision trees) * Pushpak Bhattacharyya * Q methodology * Qloo * Quality control and genetic algorithms * Quantum Artificial Intelligence Lab * Queueing theory * Quick, Draw! * R (programming language) * Rada Mihalcea * Rademacher complexity * Radial basis function kernel * Rand index * Random indexing * Random projection * Random subspace method * Ranking SVM * RapidMiner * Rattle GUI * Raymond Cattell * Reasoning system * Regularization perspectives on support vector machines * Relational data mining * Relationship square * Relevance vector machine * Relief (feature selection) * Renjin * Repertory grid * Representer theorem * Reward-based selection * Richard Zemel * Right to explanation * RoboEarth * Robust principal component analysis * RuleML Symposium * Rule induction * Rules extraction system family * SAS (software) * SNNS * SPSS Modeler * SUBCLU * Sample complexity * Sample exclusion dimension * Santa Fe Trail problem * Savi Technology * Schema (genetic algorithms) * Search-based software engineering * Selection (genetic algorithm) * Self-Service Semantic Suite * Semantic folding * Semantic mapping (statistics) * Semidefinite embedding * Sense Networks * Sensorium Project * Sequence labeling * Sequential minimal optimization * Shattered set * Shogun (toolbox) * Silhouette (clustering) * SimHash * SimRank * Similarity measure * Simple matching coefficient * Simultaneous localization and mapping * Sinkov statistic * Sliced inverse regression * Snakes and Ladders * Soft independent modelling of class analogies * Soft output Viterbi algorithm * Solomonoff's theory of inductive inference * SolveIT Software * Spectral clustering * Spike-and-slab variable selection * Statistical machine translation * Statistical parsing * Statistical semantics * Stefano Soatto * Stephen Wolfram * Stochastic block model * Stochastic cellular automaton * Stochastic diffusion search * Stochastic 
grammar * Stochastic matrix * Stochastic universal sampling * Stress majorization * String kernel * Structural equation modeling * Structural risk minimization * Structured sparsity regularization * Structured support vector machine * Subclass reachability * Sufficient dimension reduction * Sukhotin's algorithm * Sum of absolute differences * Sum of absolute transformed differences * Swarm intelligence * Switching Kalman filter * Symbolic regression * Synchronous context-free grammar * Syntactic pattern recognition * TD-Gammon * TIMIT * Teaching dimension * Teuvo Kohonen * Textual case-based reasoning * Theory of conjoint measurement * Thomas G. Dietterich * Thurstonian model * Topic model * Tournament selection * Training, test, and validation sets * Transiogram * Trax Image Recognition * Trigram tagger * Truncation selection * Tucker decomposition * UIMA * UPGMA * Ugly duckling theorem * Uncertain data * Uniform convergence in probability * Unique negative dimension * Universal portfolio algorithm * User behavior analytics * VC dimension * VIGRA * Validation set * Vapnik–Chervonenkis theory * Variable-order Bayesian network * Variable kernel density estimation * Variable rules analysis * Variational message passing * Varimax rotation * Vector quantization * Vicarious (company) * Viterbi algorithm * Vowpal Wabbit * WACA clustering algorithm * WPGMA * Ward's method * Weasel program * Whitening transformation * Winnow (algorithm) * Win–stay, lose–switch * Witness set * Wolfram Language * Wolfram Mathematica * Writer invariant * Xgboost * Yooreeka * Zeroth (software)

External links

* Data Science: Data to Insights from MIT (machine learning)
* Popular online course by Andrew Ng, at Coursera. It uses GNU Octave. The course is a free version of Stanford University's actual course taught by Ng, which is also available for free at see.stanford.edu/Course/CS229.
* mloss is an academic database of open-source machine learning software.