There is an alternative approach to rule induction that avoids
global optimization but nevertheless produces accurate, compact rule sets. The
method combines the divide-and-conquer strategy for decision tree learning with
the separate-and-conquer one for rule learning. It adopts the
separate-and-conquer strategy in that it builds a rule, removes the instances
it covers, and continues creating rules recursively for the remaining instances
until none are left.
However, it differs from the standard approach in the way each
rule is created. In essence, to make a single rule, a pruned decision tree is
built for the current set of instances, the leaf with the largest coverage is
made into a rule, and the tree is discarded.
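As a rough sketch of that outer loop, the following Python fragment uses scikit-learn's DecisionTreeClassifier (with its cost-complexity pruning parameter ccp_alpha) as a stand-in for the pruned-tree learner; the helper functions and the particular pruning setting are illustrative assumptions, not part of the method described above.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def rule_for_largest_leaf(clf, X):
    """Read off the rule for the leaf that covers the most training instances."""
    tree = clf.tree_
    leaves, counts = np.unique(clf.apply(X), return_counts=True)
    best_leaf = leaves[np.argmax(counts)]

    # Depth-first search for the root-to-leaf path, collecting the tests on it.
    def dfs(node, conditions):
        if node == best_leaf:
            return conditions
        left, right = tree.children_left[node], tree.children_right[node]
        if left == -1:                      # some other leaf: dead end
            return None
        f, t = tree.feature[node], tree.threshold[node]
        found = dfs(left, conditions + [(f, "<=", t)])
        if found is not None:
            return found
        return dfs(right, conditions + [(f, ">", t)])

    conditions = dfs(0, [])
    predicted = clf.classes_[np.argmax(tree.value[best_leaf])]
    return conditions, predicted, best_leaf

def separate_and_conquer(X, y, ccp_alpha=0.01):
    """Build a rule from a pruned tree, remove covered instances, repeat."""
    rules = []
    while len(X) > 0:
        clf = DecisionTreeClassifier(ccp_alpha=ccp_alpha).fit(X, y)  # pruned tree
        conditions, cls, leaf = rule_for_largest_leaf(clf, X)
        rules.append((conditions, cls))     # keep the rule ...
        covered = clf.apply(X) == leaf      # ... and the instances it covers
        X, y = X[~covered], y[~covered]     # then discard the tree
    return rules

# Toy usage on a small numeric data set.
X = np.array([[0.0, 1.0], [0.2, 0.9], [1.0, 0.1], [0.9, 0.0], [0.5, 0.5]])
y = np.array([0, 0, 1, 1, 0])
print(separate_and_conquer(X, y))
```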
The prospect of repeatedly building decision trees only to discard
most of them is not as bizarre as it first seems. Using a pruned tree to obtain
a rule, instead of pruning a rule incrementally by adding conjunctions one at a
time, avoids a tendency to overprune, which is a characteristic problem of the
basic separate-and-conquer rule learner. Using the separate-and-conquer
methodology in conjunction with decision trees adds flexibility and speed. It
is indeed wasteful to build a full decision tree just to obtain a single rule,
but the process can be accelerated significantly without sacrificing these
advantages.
The key idea is to build a partial decision tree instead of a fully
explored one. A partial decision tree is an ordinary decision tree that
contains branches to undefined subtrees. To generate such a tree, the
construction and pruning operations are integrated in order to find a “stable”
subtree that can be simplified no further. Once this subtree has been found, tree
building ceases and a single rule is read off.
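The data structure this implies might be sketched as follows; the class and field names are purely illustrative and are not taken from any particular implementation. A child entry of None stands for an undefined, unexpanded subtree.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class PartialNode:
    # Internal node: the test applied at this node (feature index and threshold).
    feature: Optional[int] = None
    threshold: Optional[float] = None
    # One entry per outcome of the test; None marks an undefined (unexpanded) subtree.
    children: list = field(default_factory=list)
    # Leaf node: the class predicted when this node is a leaf.
    leaf_class: Optional[str] = None

    def is_leaf(self) -> bool:
        return self.leaf_class is not None

# Example: the left branch was expanded into a leaf, while the right branch was
# left undefined because tree building stopped early.
node = PartialNode(feature=0, threshold=2.5,
                   children=[PartialNode(leaf_class="yes"), None])
```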
The tree-building algorithm can be summarized as follows: it
splits a set of instances recursively into a partial tree. The first step
chooses a test and divides the instances into subsets accordingly. The choice
is made using the same information-gain heuristic that is normally used for
building decision trees. Then the subsets are expanded in increasing order of
their average entropy. The reason for this is that the later subsets will most
likely not end up being expanded, and a subset with low average entropy is more
likely to result in a small subtree and therefore produce a more general rule.
This proceeds recursively until a subset is expanded into a leaf, and then
continues further by backtracking. But as soon as an internal node appears
whose children have all been expanded into leaves, the algorithm checks whether that
node is better replaced by a single leaf. This is just the standard subtree
replacement operation of decision tree pruning. If replacement is performed, the
algorithm backtracks in the standard way, exploring siblings of the newly
replaced node. However, if during backtracking a node is encountered whose
expanded children are not all leaves (and this will happen as soon as a
potential subtree replacement is not performed), then the remaining
subsets are left unexplored and the corresponding subtrees are left undefined.
Due to the recursive structure of the algorithm, this event automatically
terminates tree generation.
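A simplified, runnable reconstruction of this expansion procedure, restricted to nominal attributes, might look as follows. The control flow mirrors the description above, but the replacement decision is based on plain training (resubstitution) error for brevity rather than the pessimistic estimate a practical implementation would use, and all names are illustrative.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def partition(X, y, attr):
    """Split the instances on one nominal attribute: {value: (Xsub, ysub)}."""
    subsets = {}
    for xi, yi in zip(X, y):
        sub = subsets.setdefault(xi[attr], ([], []))
        sub[0].append(xi)
        sub[1].append(yi)
    return subsets

def leaf(y):
    majority = Counter(y).most_common(1)[0][0]
    return {"class": majority, "errors": sum(1 for yi in y if yi != majority)}

def expand(X, y, attrs):
    # Base case: the subset is pure or no attributes remain, so make a leaf.
    if not attrs or entropy(y) == 0:
        return leaf(y)

    # 1. Choose the test with the highest information gain.
    def gain(a):
        subs = partition(X, y, a)
        rem = sum(len(ys) / len(y) * entropy(ys) for _, ys in subs.values())
        return entropy(y) - rem
    best = max(attrs, key=gain)
    subsets = partition(X, y, best)

    # 2. Expand subsets in increasing order of average entropy, stopping as soon
    #    as one of them does not come back as a leaf; the rest stay undefined.
    children = {v: None for v in subsets}        # None marks an undefined subtree
    remaining = [a for a in attrs if a != best]
    for v in sorted(subsets, key=lambda v: entropy(subsets[v][1])):
        children[v] = expand(*subsets[v], remaining)
        if "class" not in children[v]:           # child stayed an internal node
            break

    # 3. If every child became a leaf, try standard subtree replacement.
    if all(c is not None and "class" in c for c in children.values()):
        if leaf(y)["errors"] <= sum(c["errors"] for c in children.values()):
            return leaf(y)                       # replace the subtree by a leaf
    return {"test": best, "children": children}

# Toy usage: two nominal attributes (outlook, humidity) and a yes/no class.
X = [("sunny", "high"), ("sunny", "normal"), ("rainy", "high"), ("rainy", "normal")]
y = ["no", "yes", "yes", "yes"]
print(expand(X, y, attrs=[0, 1]))
```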
References:
Witten, I. H., Frank, E., & Hall, M. A. (2011). Data Mining: Practical Machine Learning Tools and Techniques (3rd ed.). Morgan Kaufmann, The Morgan Kaufmann Series in Data Management Systems.