Thursday, March 21, 2013

In which phases of data mining is fuzzy logic frequently used?


Following my previous post on the application of fuzzy logic in data mining, this post reviews the phases of the data mining process in which fuzzy logic has been applied successfully. Bai et al. [1] argue that fuzzy logic has been applied frequently in four phases of the data mining process. These phases are described below:

1- Problem understanding phase
In this phase, the goals of data mining are defined and the required data is gathered. Fuzzy set methods can be used here to formulate, for example, background domain knowledge that is inherently vague, so that it can be used in the subsequent modeling phase. Moreover, fuzzy database queries are useful for finding the data needed and for checking whether it is worth taking additional related data into consideration.
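
To illustrate what a fuzzy database query might look like, here is a minimal sketch in Python. The records, the field names, the membership functions young and high_income, and the 0.3 threshold are all hypothetical and chosen only for illustration. The query "young AND high income" is evaluated with the minimum operator, one common choice for fuzzy AND.

```python
# Hypothetical customer records; the values are invented for illustration.
records = [
    {"name": "A", "age": 23, "income": 30_000},
    {"name": "B", "age": 31, "income": 72_000},
    {"name": "C", "age": 45, "income": 95_000},
    {"name": "D", "age": 27, "income": 88_000},
]


def young(age):
    """Degree to which an age counts as 'young' (assumed breakpoints: 25 and 35)."""
    if age <= 25:
        return 1.0
    if age >= 35:
        return 0.0
    return (35 - age) / 10


def high_income(income):
    """Degree to which an income counts as 'high' (assumed breakpoints: 50k and 90k)."""
    if income <= 50_000:
        return 0.0
    if income >= 90_000:
        return 1.0
    return (income - 50_000) / 40_000


def fuzzy_query(rows, threshold=0.3):
    """Return (name, degree) pairs matching 'young AND high income', best match first."""
    scored = []
    for row in rows:
        # Fuzzy AND taken as the minimum of the two membership degrees.
        degree = min(young(row["age"]), high_income(row["income"]))
        if degree >= threshold:
            scored.append((row["name"], round(degree, 2)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


print(fuzzy_query(records))
```

Unlike a crisp query such as "age < 30 AND income > 80000", this version also returns record B as a partial match, with a lower degree than record D.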

2- Data preparation phase
Fuzzy methods can be used to detect outliers. For example, fuzzy clustering groups the data and then identifies those data points that lie far away from the cluster prototypes. Moreover, selecting and extracting useful attributes of target objects is difficult in most databases, because not all of the attributes needed for successful extraction can be found explicitly in the data. In these cases, domain knowledge and user analysis become a necessity, and techniques such as neural networks tend to produce poor results since the domain knowledge cannot be incorporated into them (Lu et al. [2]). Fuzzy logic based models, in contrast, use the domain knowledge to derive rules for data selection and extraction.
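
The outlier detection idea can be sketched with a small fuzzy c-means implementation for one-dimensional data. This is an illustrative sketch using only the Python standard library, not the method of any of the cited authors. The function names, the sample data, the deterministic initialization, and the rule "flag points whose distance to the nearest prototype exceeds three times the median distance" are assumptions made for this example.

```python
from statistics import median


def fuzzy_c_means_1d(points, n_clusters=2, m=2.0, iterations=50):
    """Return cluster prototypes for 1-D data using fuzzy c-means."""
    ordered = sorted(points)
    # Deterministic start (an assumption): spread prototypes over the sorted data.
    centers = [
        ordered[(2 * j + 1) * len(ordered) // (2 * n_clusters)]
        for j in range(n_clusters)
    ]
    exponent = 2.0 / (m - 1.0)
    for _ in range(iterations):
        # Membership of every point in every cluster; each row sums to 1.
        memberships = []
        for x in points:
            # A tiny offset avoids division by zero when a point sits on a prototype.
            dists = [abs(x - c) + 1e-12 for c in centers]
            memberships.append(
                [1.0 / sum((d_j / d_k) ** exponent for d_k in dists) for d_j in dists]
            )
        # Prototype update: mean of the points weighted by membership ** m.
        centers = [
            sum((u[j] ** m) * x for u, x in zip(memberships, points))
            / sum(u[j] ** m for u in memberships)
            for j in range(n_clusters)
        ]
    return centers


def find_outliers(points, factor=3.0):
    """Flag points far from every prototype (cutoff rule is an assumption)."""
    centers = fuzzy_c_means_1d(points)
    nearest = [min(abs(x - c) for c in centers) for x in points]
    cutoff = factor * median(nearest)
    return [x for x, d in zip(points, nearest) if d > cutoff]


# Hypothetical measurements: two groups near 1 and 10, plus one stray value.
data = [1.0, 1.2, 0.8, 1.1, 10.0, 10.3, 9.8, 10.1, 25.0]
print(find_outliers(data))
```

Distance to the prototypes is used rather than the membership degrees themselves, because in standard fuzzy c-means the memberships of every point sum to one, so even a remote point receives a seemingly normal membership in its closest cluster.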

3- Modeling phase
Fuzzy data analysis approaches can be applied to predict future developments or to build classifiers. One kind of application analyzes fuzzy data, which are derived from imprecise measurement instruments or from the descriptions of human domain experts. The other kind uses fuzzy techniques to structure and analyze crisp data. When the data set contains qualitative data, finding associations with conventional rule induction algorithms is very difficult. Because fuzzy logic modeling works with graded degrees of membership rather than crisp categories, it has several advantages over conventional rule induction algorithms, which are summarized in Maeda et al. [3]. One is that it allows the processing of very large data sets, which require efficient algorithms. Another is that fuzzy logic-based rule induction handles noise and uncertainty in data values well.
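
As a rough illustration of using fuzzy techniques on crisp data, the sketch below scores a single fuzzy rule, "IF temperature is high THEN consumption is high", on a handful of crisp measurements. The data, the membership breakpoints, and the function names are hypothetical. Support and confidence are computed with the minimum operator and summed membership degrees, a common formulation for fuzzy association rules, and not necessarily the one used by Maeda et al. [3].

```python
def ramp_up(x, lo, hi):
    """Membership rising linearly from 0 at lo to 1 at hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def temp_high(t):
    # Assumed breakpoints: not high at 20 degrees, fully high at 30.
    return ramp_up(t, 20, 30)


def cons_high(c):
    # Assumed breakpoints: not high at 150 units, fully high at 250.
    return ramp_up(c, 150, 250)


def rule_quality(rows):
    """Fuzzy support and confidence of 'temperature high => consumption high'."""
    # Degree to which each row satisfies both sides of the rule (fuzzy AND = min).
    both = sum(min(temp_high(t), cons_high(c)) for t, c in rows)
    # Total degree to which the rows satisfy the antecedent alone.
    antecedent = sum(temp_high(t) for t, _ in rows)
    support = both / len(rows)
    confidence = both / antecedent if antecedent else 0.0
    return support, confidence


# Hypothetical crisp (temperature, consumption) measurements.
rows = [(18, 120), (24, 180), (30, 260), (35, 300), (22, 150)]
support, confidence = rule_quality(rows)
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

A rule induction procedure would compute such scores for many candidate rules and keep those above chosen thresholds. Because each row contributes by degree, a slightly noisy measurement shifts the scores a little instead of flipping a row in or out of the rule.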

4- Evaluation phase
Fuzzy models are interpretable: their rules are expressed in linguistic terms, so they can easily be checked for plausibility against the intuition and expectations of human experts. The results can also reveal previously unknown knowledge that merits further consideration.


References:
  1. Bai, Y., Zhuang, H. and Wang, D., Advanced Fuzzy Logic Technologies in Industrial Applications, Springer, 2006.
  2. Lu, H., Setiono, R. and Liu, H., Effective Data Mining Using Neural Networks, IEEE Transactions on Knowledge and Data Engineering, Vol. 8, No. 6, 1996.
  3. Maeda, A., Ashida, H., Taniguchi, Y. and Takahashi, Y., Data Mining System using Fuzzy Rule Induction, Proceedings of 1995 IEEE International Conference on Fuzzy Systems, 1995.


