Massive Online Analysis (MOA) is a free, open-source software project for data stream mining with concept drift. It is written in Java and developed at the University of Waikato, New Zealand.
Description
MOA is an open-source software framework for building and running machine learning and data mining experiments on evolving data streams. It includes a set of learners and stream generators that can be used from the graphical user interface (GUI), the command line, and the Java API.
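An experiment on an evolving stream is commonly run in a test-then-train (prequential) fashion: each arriving instance is first used to test the current model and then to update it. The following is a minimal sketch of that loop in plain Java. It does not use MOA's API; the class `PrequentialSketch`, the toy `OnlinePerceptron` learner, and the synthetic two-feature stream are all hypothetical names introduced here for illustration.

```java
import java.util.Random;

/** Illustrative test-then-train loop; not part of MOA. */
class PrequentialSketch {

    /** Toy online learner: a perceptron over two numeric features plus a bias. */
    static final class OnlinePerceptron {
        private final double[] w = new double[3]; // w[0] is the bias weight

        int predict(double[] x) {
            double score = w[0] + w[1] * x[0] + w[2] * x[1];
            return score >= 0 ? 1 : 0;
        }

        void train(double[] x, int label) {
            int error = label - predict(x); // -1, 0, or +1
            w[0] += error;
            w[1] += error * x[0];
            w[2] += error * x[1];
        }
    }

    /** Runs n instances of a synthetic stream and returns the prequential accuracy. */
    static double run(int n, long seed) {
        Random rnd = new Random(seed);
        OnlinePerceptron learner = new OnlinePerceptron();
        int correct = 0;
        for (int i = 0; i < n; i++) {
            // Synthetic instance: two features in [-1, 1), label from a fixed linear rule.
            double[] x = { rnd.nextDouble() * 2 - 1, rnd.nextDouble() * 2 - 1 };
            int label = x[0] + x[1] > 0 ? 1 : 0;

            // Test first, then train on the same instance.
            if (learner.predict(x) == label) {
                correct++;
            }
            learner.train(x, label);
        }
        return (double) correct / n;
    }

    public static void main(String[] args) {
        System.out.printf("prequential accuracy: %.3f%n", run(10_000, 42L));
    }
}
```

Because every instance is tested before the learner sees its label, the reported accuracy reflects performance on unseen data without holding out a separate test set.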
MOA contains several collections of machine learning algorithms:
* Classification
** Bayesian classifiers
*** Naive Bayes
*** Naive Bayes Multinomial
** Decision tree classifiers
*** Decision Stump
*** Hoeffding Tree
*** Hoeffding Option Tree
*** Hoeffding Adaptive Tree
** Meta classifiers
*** Bagging
*** Boosting
*** Bagging using ADWIN
*** Bagging using Adaptive-Size Hoeffding Trees
*** Perceptron Stacking of Restricted Hoeffding Trees
*** Leveraging Bagging
*** Online Accuracy Updated Ensemble
** Function classifiers
*** Perceptron
*** Stochastic gradient descent (SGD)
*** Pegasos
** Drift classifiers
*** Self-Adjusting Memory
*** Probabilistic Adaptive Windowing
** Multi-label classifiers
** Active learning classifiers
* Regression
** FIMTDD
** AMRules
* Clustering
** StreamKM++
** CluStream
** ClusTree
** D-Stream
** CobWeb
* Outlier detection
** STORM
** Abstract-C
** COD
** MCOD
** AnyOut
* Recommender systems
** BRISMFPredictor
* Frequent pattern mining
** Itemsets
** Graphs
* Change detection algorithms
These algorithms are designed for large-scale machine learning: they handle concept drift and process large data streams in real time.
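To give a feel for what a change detection algorithm does, the sketch below implements the Page-Hinkley test, a classic sequential detector that raises an alarm when the mean of a stream increases. The source does not say which detectors MOA ships or how they are implemented, so this is only an illustration under stated assumptions: the class `PageHinkleySketch`, its parameters `delta` and `lambda`, and the synthetic step stream are hypothetical names and values chosen here.

```java
import java.util.Random;

/** Illustrative Page-Hinkley detector for an increase in the stream mean; not part of MOA. */
class PageHinkleySketch {
    private final double delta;   // magnitude of change that is tolerated
    private final double lambda;  // alarm threshold
    private long count;
    private double mean;          // running mean of all values seen so far
    private double cumulative;    // cumulative deviation from the running mean
    private double minCumulative; // smallest cumulative deviation seen so far

    PageHinkleySketch(double delta, double lambda) {
        this.delta = delta;
        this.lambda = lambda;
    }

    /** Feeds one value and returns true if a change is signalled. */
    boolean add(double x) {
        count++;
        mean += (x - mean) / count;
        cumulative += x - mean - delta;
        minCumulative = Math.min(minCumulative, cumulative);
        return cumulative - minCumulative > lambda;
    }

    /** Returns the index of the first alarm, or -1 if no change is signalled. */
    static int firstAlarm(double[] stream, double delta, double lambda) {
        PageHinkleySketch detector = new PageHinkleySketch(delta, lambda);
        for (int i = 0; i < stream.length; i++) {
            if (detector.add(stream[i])) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Synthetic stream: the mean jumps from 0.1 to 0.9 at index 1000, with small noise.
        Random rnd = new Random(1L);
        double[] stream = new double[2000];
        for (int i = 0; i < stream.length; i++) {
            stream[i] = (i < 1000 ? 0.1 : 0.9) + 0.01 * rnd.nextGaussian();
        }
        System.out.println("first alarm at index " + firstAlarm(stream, 0.005, 5.0));
    }
}
```

A larger `lambda` lowers the false-alarm rate at the cost of a longer detection delay. A learner that adapts to concept drift would typically reset or rebuild its model when such an alarm fires.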
MOA supports bi-directional interaction with Weka. MOA is free software released under the GNU GPL.
See also
* ADAMS Workflow: workflow engine for MOA and Weka
* Streams: flexible module environment for the design and execution of data stream experiments
* Vowpal Wabbit
* List of numerical analysis software
External links
* MOA Project home page at University of Waikato in New Zealand
* SAMOA Project home page at Yahoo Labs