
Rule induction is an area of machine learning in which formal rules are extracted from a set of observations. The rules extracted may represent a full scientific model of the data, or merely represent local patterns in the data.
Data mining in general, and rule induction in particular, aim to create algorithms not through manual programming but by analyzing existing data structures.
In the simplest case, a rule is expressed as an "if-then" statement and is created with the ID3 algorithm for decision tree learning.
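As an illustration of how a decision tree yields if-then rules, the following is a minimal sketch of an ID3-style learner, assuming categorical attributes only and no pruning. The weather data and all function names (`entropy`, `information_gain`, `id3_rules`, `format_rule`) are hypothetical and chosen for this example. At each node the attribute with the highest information gain is selected, and every root-to-leaf path is emitted as one rule.

```python
from collections import Counter
from math import log2

# Hypothetical toy data: each row is (attributes, class label).
ROWS = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "play"),
    ({"outlook": "overcast", "windy": "yes"}, "play"),
    ({"outlook": "rainy", "windy": "no"}, "play"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
]


def entropy(rows):
    """Shannon entropy of the class labels in rows."""
    counts = Counter(label for _, label in rows)
    total = len(rows)
    return -sum(c / total * log2(c / total) for c in counts.values())


def information_gain(rows, attribute):
    """Entropy reduction obtained by splitting rows on attribute."""
    groups = {}
    for features, label in rows:
        groups.setdefault(features[attribute], []).append((features, label))
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(rows) - remainder


def id3_rules(rows, attributes, conditions=()):
    """Return (conditions, label) pairs, one per leaf of the induced tree."""
    labels = [label for _, label in rows]
    majority = Counter(labels).most_common(1)[0][0]
    # Stop when the node is pure or no attribute is left to split on.
    if len(set(labels)) == 1 or not attributes:
        return [(conditions, majority)]
    best = max(attributes, key=lambda a: information_gain(rows, a))
    remaining = [a for a in attributes if a != best]
    rules = []
    for value in sorted({features[best] for features, _ in rows}):
        subset = [(f, l) for f, l in rows if f[best] == value]
        rules += id3_rules(subset, remaining, conditions + ((best, value),))
    return rules


def format_rule(conditions, label):
    """Render one tree path as an if-then statement."""
    body = " AND ".join(f"{a} = {v}" for a, v in conditions) or "TRUE"
    return f"IF {body} THEN {label}"


RULES = id3_rules(ROWS, ["outlook", "windy"])
for conditions, label in RULES:
    print(format_rule(conditions, label))
```

On this toy table the sketch splits on `outlook` first, because that attribute separates the classes better than `windy`, and only the `rainy` branch needs a second test.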
Rule learning algorithms take training data as input and create rules by partitioning the table with cluster analysis.
A possible alternative to the ID3 algorithm is genetic programming, which evolves a program until it fits the data.
Different algorithms can be created and tested against input data in the WEKA software.
Additional tools include machine learning libraries for Python, such as scikit-learn.
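As a rough illustration of the library route, the sketch below fits a scikit-learn decision tree and prints it as nested tests, each root-to-leaf path reading as one if-then rule. Note that scikit-learn's tree learner is a CART variant rather than ID3; `criterion="entropy"` only makes the split measure resemble ID3's. The data and the feature names `is_rainy` and `is_windy` are hypothetical.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical toy data: columns are [is_rainy, is_windy], encoded as 0/1.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = ["play", "play", "play", "stay"]

tree = DecisionTreeClassifier(criterion="entropy", random_state=0)
tree.fit(X, y)

# export_text renders the fitted tree as nested threshold tests;
# each root-to-leaf path corresponds to one if-then rule.
RULE_TEXT = export_text(tree, feature_names=["is_rainy", "is_windy"])
print(RULE_TEXT)

# Classify two rows: rainy and windy, then dry and windy.
PREDICTIONS = list(tree.predict([[1, 1], [0, 1]]))
print(PREDICTIONS)
```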
Paradigms
Some major rule induction paradigms are:
* Association rule learning algorithms (e.g., Agrawal)
* Decision rule algorithms (e.g., Quinlan 1987)
* Hypothesis testing algorithms (e.g., RULEX)
* Horn clause induction
* Version spaces
* Rough set rules
* Inductive Logic Programming
* Boolean decomposition (Feldman)
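For the association rule paradigm in the list above, candidate rules are commonly scored by support and confidence. The following minimal sketch computes both for a single rule; the transactions and the function names `support` and `confidence` are hypothetical, and no candidate generation (as in a full association rule miner) is attempted.

```python
# Hypothetical market-basket transactions.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of antecedent and consequent together, divided by
    the support of the antecedent alone."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


# Score the rule "IF bread THEN milk".
SUPP = support({"bread", "milk"}, TRANSACTIONS)
CONF = confidence({"bread"}, {"milk"}, TRANSACTIONS)
print(f"bread -> milk: support={SUPP:.2f}, confidence={CONF:.2f}")
```

Here two of four transactions contain both items, and two of the three transactions with bread also contain milk.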
Algorithms
Some rule induction algorithms are:
* Charade (Sahami, Mehran. "Learning classification rules using lattices." Machine Learning: ECML-95 (1995): 343-346.)
* Rulex
* Progol
* CN2
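Some rule learners, CN2 among them, follow a covering strategy: learn one rule, remove the examples it covers, and repeat. The sketch below is a deliberately simplified greedy version of that idea and is not an implementation of CN2 or of any other algorithm listed above. The data and the names `covers`, `quality`, `learn_one_rule` and `sequential_covering` are hypothetical.

```python
# Hypothetical toy data: (attributes, label) pairs.
EXAMPLES = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "play"),
    ({"outlook": "overcast", "windy": "yes"}, "play"),
    ({"outlook": "rainy", "windy": "no"}, "play"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
]


def covers(conditions, features):
    """True if the example satisfies every attribute test in the rule."""
    return all(features[a] == v for a, v in conditions.items())


def quality(conditions, examples, target):
    """Return (precision, coverage) of a rule body for the target class."""
    covered = [label for f, label in examples if covers(conditions, f)]
    if not covered:
        return (0.0, 0)
    return (covered.count(target) / len(covered), len(covered))


def learn_one_rule(examples, target):
    """Greedily add attribute tests until the rule covers only target."""
    conditions = {}
    while quality(conditions, examples, target)[0] < 1.0:
        candidates = sorted(
            {(a, v) for f, _ in examples if covers(conditions, f)
             for a, v in f.items() if a not in conditions}
        )
        if not candidates:
            break  # no attribute left to test; accept an impure rule
        attr, value = max(
            candidates,
            key=lambda c: quality({**conditions, c[0]: c[1]}, examples, target),
        )
        conditions[attr] = value
    return conditions


def sequential_covering(examples, target):
    """Learn rules for target until no example of that class is left."""
    rules, remaining = [], list(examples)
    while any(label == target for _, label in remaining):
        rule = learn_one_rule(remaining, target)
        rules.append(rule)
        remaining = [(f, l) for f, l in remaining if not covers(rule, f)]
    return rules


RULES = sequential_covering(EXAMPLES, "play")
for rule in RULES:
    body = " AND ".join(f"{a} = {v}" for a, v in rule.items())
    print(f"IF {body} THEN play")
```

The resulting rule set is an unordered disjunction: an example is classified as `play` if any rule covers it.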