Rule Extraction
   HOME

TheInfoList



OR:

Rule induction is an area of
machine learning Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of Computational statistics, statistical algorithms that can learn from data and generalise to unseen data, and thus perform Task ( ...
in which formal rules are extracted from a set of observations. The rules extracted may represent a full
scientific model Scientific modelling is an activity that produces models representing empirical objects, phenomena, and physical processes, to make a particular part or feature of the world easier to understand, define, quantify, visualize, or simulate. It ...
of the data, or merely represent local
patterns A pattern is a regularity in the world, in human-made design, or in abstract ideas. As such, the elements of a pattern repeat in a predictable manner. A geometric pattern is a kind of pattern formed of geometric shapes and typically repeated li ...
in the data.
Data mining Data mining is the process of extracting and finding patterns in massive data sets involving methods at the intersection of machine learning, statistics, and database systems. Data mining is an interdisciplinary subfield of computer science and ...
in general and rule induction in detail are trying to create algorithms without human programming but with analyzing existing data structures. In the easiest case, a rule is expressed with “if-then statements” and was created with the
ID3 algorithm In decision tree learning, ID3 (Iterative Dichotomiser 3) is an algorithm invented by Ross QuinlanQuinlan, J. R. 1986. Induction of Decision Trees. Mach. Learn. 1, 1 (Mar. 1986), 81–106 used to generate a decision tree from a dataset. ID3 is th ...
for decision tree learning. Rule learning algorithm are taking training data as input and creating rules by partitioning the table with
cluster analysis Cluster analysis or clustering is the data analyzing technique in which task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more Similarity measure, similar (in some specific sense defined by the ...
. A possible alternative over the ID3 algorithm is genetic programming which evolves a program until it fits to the data. Creating different algorithm and testing them with input data can be realized in the WEKA software. Additional tools are machine learning libraries for
Python Python may refer to: Snakes * Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia ** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia * Python (mythology), a mythical serpent Computing * Python (prog ...
, like
scikit-learn scikit-learn (formerly scikits.learn and also known as sklearn) is a free and open-source machine learning library for the Python programming language. It features various classification, regression and clustering algorithms including support ...
.


Paradigms

Some major rule induction paradigms are: *
Association rule learning Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using some measures of interestingness.P ...
algorithms (e.g., Agrawal) *
Decision rule In decision theory, a decision rule is a function which maps an observation to an appropriate action. Decision rules play an important role in the theory of statistics and economics, and are closely related to the concept of a strategy in game ...
algorithms (e.g., Quinlan 1987) *
Hypothesis testing A statistical hypothesis test is a method of statistical inference used to decide whether the data provide sufficient evidence to reject a particular hypothesis. A statistical hypothesis test typically involves a calculation of a test statistic. T ...
algorithms (e.g., RULEX) *
Horn clause In mathematical logic and logic programming, a Horn clause is a logical formula of a particular rule-like form that gives it useful properties for use in logic programming, formal specification, universal algebra and model theory. Horn clauses are ...
induction * Version spaces *
Rough set In computer science, a rough set, first described by Polish computer scientist Zdzisław I. Pawlak, is a formal approximation of a crisp set (i.e., conventional set) in terms of a pair of sets which give the ''lower'' and the ''upper'' approxim ...
rules *
Inductive Logic Programming Inductive logic programming (ILP) is a subfield of symbolic artificial intelligence which uses logic programming as a uniform representation for examples, background knowledge and hypotheses. The term "''inductive''" here refers to philosophical ...
*Boolean decomposition (Feldman)


Algorithms

Some rule induction algorithms are: *CharadeSahami, Mehran.
Learning classification rules using lattices
" Machine learning: ECML-95 (1995): 343-346.
*Rulex *
Progol Progol is an implementation of inductive logic programming that combines inverse entailment with general-to-specific search through a refinement graph. Features Inverse entailment is used with mode declarations to derive the bottom clause, ...
* CN2


References

* Machine learning Inductive reasoning {{Comp-sci-stub