Following up on my previous post on the application of fuzzy logic in data
mining, this post reviews the phases of the data mining process in which fuzzy
logic has been applied successfully. Bai et al. [1] identify four phases of the
data mining process in which fuzzy logic is frequently applied. These phases
are described below.
1- Problem understanding phases
In these phases, the goals of data mining are defined and the required data is
gathered. Fuzzy set methods can be used here to formulate, for example,
background domain knowledge in vague terms, which can then be used in the
subsequent modeling phase. Fuzzy database queries are also useful for finding
the data needed and for checking whether additional related data is worth
taking into consideration.
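To make the idea of a fuzzy database query concrete, here is a minimal sketch
in Python. It is not taken from Bai et al.; the customer records, the
linguistic terms "young" and "high income", their breakpoints, and the 0.3
threshold are all assumptions chosen for illustration. Each record gets a
degree of match between 0 and 1 instead of a hard yes/no answer.

```python
def falling(x, full, zero):
    """Membership is 1 up to `full`, then drops linearly to 0 at `zero`."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


def rising(x, zero, full):
    """Membership is 0 up to `zero`, then rises linearly to 1 at `full`."""
    if x <= zero:
        return 0.0
    if x >= full:
        return 1.0
    return (x - zero) / (full - zero)


# Hypothetical records standing in for a database table.
customers = [
    {"name": "A", "age": 24, "income": 30000},
    {"name": "B", "age": 33, "income": 72000},
    {"name": "C", "age": 41, "income": 95000},
    {"name": "D", "age": 29, "income": 58000},
]


def fuzzy_query(rows, threshold=0.3):
    """Return (name, degree) for rows matching 'young AND high income'."""
    results = []
    for row in rows:
        young = falling(row["age"], 30, 45)                 # assumed breakpoints
        high_income = rising(row["income"], 40000, 80000)   # assumed breakpoints
        degree = min(young, high_income)  # fuzzy AND as the minimum
        if degree >= threshold:
            results.append((row["name"], round(degree, 2)))
    # Best matches first.
    return sorted(results, key=lambda r: r[1], reverse=True)


print(fuzzy_query(customers))  # [('B', 0.8), ('D', 0.45)]
```

A crisp query such as "age < 30 AND income > 80000" would return nothing for
these records, while the fuzzy version ranks the partial matches.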
2- Data preparation step
Fuzzy methods can be used to detect outliers. For example, fuzzy clustering
groups the data and then identifies the data points that lie far from the
cluster prototypes. Moreover, selecting and extracting useful attributes of
target objects is difficult for most databases, because not all of the
attributes needed for successful extraction can be found explicitly in the
database. In these cases, domain knowledge and user analysis become a
necessity, and techniques such as neural networks tend to produce poor results
because domain knowledge cannot be incorporated into them (Lu et al. 1996)
[2]. Fuzzy logic based models use domain knowledge to derive rules for data
selection and extraction.
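The outlier detection idea can be sketched as follows. This is a minimal,
pure-Python fuzzy c-means for one-dimensional data, written for illustration
only; the sample data, the initial prototypes, the fuzzifier m = 2, and the
distance threshold of 2.0 are assumptions, not values from the cited sources.
Points whose distance to the nearest prototype exceeds the threshold are
flagged.

```python
def fuzzy_c_means_1d(points, centers, m=2.0, iterations=100):
    """Fuzzy c-means on 1-D data, starting from the given initial prototypes."""
    centers = list(centers)
    exponent = 2.0 / (m - 1.0)
    for _ in range(iterations):
        # Step 1: membership of every point in every cluster.
        memberships = []
        for x in points:
            dists = [abs(x - c) for c in centers]
            if min(dists) == 0.0:
                # The point sits exactly on a prototype: it belongs fully
                # to that prototype (shared if several coincide).
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                row = [h / sum(hits) for h in hits]
            else:
                row = [
                    1.0 / sum((d_j / d_k) ** exponent for d_k in dists)
                    for d_j in dists
                ]
            memberships.append(row)
        # Step 2: move each prototype to the membership-weighted mean.
        centers = [
            sum(u[j] ** m * x for u, x in zip(memberships, points))
            / sum(u[j] ** m for u in memberships)
            for j in range(len(centers))
        ]
    return centers


def find_outliers(points, centers, max_distance):
    """Return the points farther than `max_distance` from every prototype."""
    return [
        x for x in points
        if min(abs(x - c) for c in centers) > max_distance
    ]


# Two tight groups around 1 and 5, plus one stray value (illustrative data).
data = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1, 9.5]
prototypes = fuzzy_c_means_1d(data, centers=[0.0, 6.0])
outliers = find_outliers(data, prototypes, max_distance=2.0)
print(prototypes, outliers)
```

Distance to the prototypes is used here rather than the membership degrees
themselves, because in standard fuzzy c-means the memberships of each point
sum to 1 and so do not directly show that a point is far from every cluster.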
3- Modeling phase
Fuzzy data analysis approaches can be applied to predict future developments or
to build classifiers. One kind of application analyzes fuzzy data, which is
derived from imprecise measurement instruments or from the descriptions of
human domain experts. The other kind consists of methods that use fuzzy
techniques to structure and analyze crisp data. When the data set contains
qualitative data, finding associations with conventional rule induction
algorithms is very difficult. Because fuzzy logic modeling works with degrees
of membership rather than crisp categories, it has several advantages over
conventional rule induction algorithms, which are summarized in Maeda et al.
[3]. First, it allows the processing of very large data sets, which require
efficient algorithms. Second, fuzzy logic-based rule induction handles noise
and uncertainty in data values well.
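As an illustration of using fuzzy techniques on crisp data, the sketch below
classifies a machine reading with a tiny fuzzy rule base. It is not the rule
induction method of Maeda et al.; the variables (temperature, vibration), the
membership breakpoints, and the three rules are hypothetical and written by
hand, whereas a real system would induce the rules from data.

```python
def rising(x, zero, full):
    """Membership is 0 up to `zero`, then rises linearly to 1 at `full`."""
    if x <= zero:
        return 0.0
    if x >= full:
        return 1.0
    return (x - zero) / (full - zero)


def fuzzify(temperature, vibration):
    """Turn crisp readings into degrees for the terms 'low' and 'high'."""
    temp_high = rising(temperature, 60.0, 90.0)  # assumed breakpoints
    vib_high = rising(vibration, 3.0, 8.0)       # assumed breakpoints
    return {
        ("temperature", "high"): temp_high,
        ("temperature", "low"): 1.0 - temp_high,
        ("vibration", "high"): vib_high,
        ("vibration", "low"): 1.0 - vib_high,
    }


# Hypothetical rule base: (conditions joined by AND, resulting class).
RULES = [
    ([("temperature", "high"), ("vibration", "high")], "faulty"),
    ([("temperature", "low")], "normal"),
    ([("vibration", "low")], "normal"),
]


def classify(temperature, vibration):
    """Return the class with the strongest rule support, plus all strengths."""
    degrees = fuzzify(temperature, vibration)
    strengths = {}
    for conditions, label in RULES:
        firing = min(degrees[c] for c in conditions)             # fuzzy AND
        strengths[label] = max(strengths.get(label, 0.0), firing)  # fuzzy OR
    best = max(strengths, key=strengths.get)
    return best, strengths


label, strengths = classify(temperature=78.0, vibration=6.5)
print(label, strengths)
```

A slightly noisy reading shifts the firing strengths a little instead of
flipping a hard threshold, which is one way to see why such rules tolerate
noise in the data values.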
4- Evaluation phase
Fuzzy models are interpretable. They can therefore easily be checked for
plausibility against the intuition and expectations of human experts. The
results can also reveal previously unknown knowledge for further consideration.
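One practical consequence of interpretability is that a fuzzy rule base can be
printed as plain IF-THEN sentences for an expert to review. The short sketch
below assumes a hypothetical rule format (a dictionary of conditions plus a
class label); the rules themselves are invented for illustration.

```python
# Hypothetical rules: ({variable: linguistic term, ...}, resulting class).
rules = [
    ({"temperature": "high", "vibration": "high"}, "faulty"),
    ({"temperature": "low"}, "normal"),
]


def describe(rule):
    """Render one rule as a sentence a domain expert can check."""
    conditions, label = rule
    clause = " AND ".join(f"{var} is {term}" for var, term in conditions.items())
    return f"IF {clause} THEN status is {label}"


for rule in rules:
    print(describe(rule))
```

An expert can read such sentences directly and reject rules that contradict
domain experience, which is much harder to do with the weights of a neural
network.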
References:
[1] Ying Bai, Hanqi Zhuang, and Dali Wang, Advanced Fuzzy Logic Technologies in Industrial Applications, Springer, 2006.
[2] H. Lu, R. Setiono, and H. Liu, Effective Data Mining Using Neural Networks, IEEE Transactions on Knowledge and Data Engineering, Vol. 8, No. 6, 1996.
[3] A. Maeda, H. Ashida, Y. Taniguchi, and Y. Takahashi, Data Mining System using Fuzzy Rule Induction, Proceedings of 1995 IEEE International Conference on Fuzzy Systems, 1995.