Wednesday, March 20, 2013

Fuzzy Logic in Data Mining


Bai et al. [1] devote a chapter of their book to a brief introduction to the application of fuzzy logic in data mining. The book analyzes advanced fuzzy logic technologies in industrial applications, and in Chapter 17 the authors review different areas of data mining in which fuzzy logic techniques provide more understandable and applicable results. In this post I briefly review their introduction to the topic.

Data mining is the process of extracting hidden, previously unknown, and potentially useful information from a set of data. Data mining models are generally implemented by combining a set of techniques to extract useful relationships among the data. Zaïane [2] and Moxon [3] briefly describe data mining functionalities, and the kinds of knowledge they yield, as characterization, discrimination, association, sequence-based analysis, classification, clustering, prediction, outlier analysis, evolution and deviation analysis, and estimation. Each of these techniques is suited to certain types of problems.

There are other data mining techniques, including case-based reasoning, genetic algorithms, and fuzzy logic. Each of these has its own strengths and weaknesses in terms of the problem types it addresses, its performance, and its complexity.

The techniques studied in this area have mainly focused on highly structured and precise data. In addition, some of these techniques are highly mathematical and quantitative in nature, so the goal of obtaining understandable results is often ignored. To fully exploit all the attributes of an object present in the data set, one must also use the qualitative attributes. The analysis of heterogeneous information sources with the prominent aim of producing comprehensible results is a new challenge in data mining research. Fuzzy logic is an extraordinarily valuable tool for representing and manipulating all kinds of data in qualitative/linguistic terms and for achieving understandable solutions.

It is undisputed that language is one of the most effective human tools for structuring experience and modeling the environment. What Zadeh proposed as "computing with words" indicated a new direction in data mining technologies. Linguistic terms are vague in nature, i.e., they have “fuzzy” boundaries. The reason for this inherent vagueness is that, for practical purposes, full precision is not necessary and may even be a waste of resources. Fuzzy set theory provides excellent tools to model the “fuzzy” boundaries of linguistic terms by introducing gradual membership. In classical set theory, an object is either a member of a given set or not; in fuzzy set theory, an object belongs to a set to a degree between 0 and 1. Membership degrees of fuzzy sets can be interpreted as similarity, preference, or uncertainty. They can state how similar an object or case is to a prototypical one, they can indicate preferences between suboptimal solutions to a problem, or they can model uncertainty about the real-life situation if the scenario is described in an imprecise manner. Thanks to their closeness to human reasoning, solutions obtained using fuzzy approaches are easy to understand and to apply. Fuzzy systems are therefore good candidates when linguistic, vague, or imprecise information has to be modeled and analyzed.
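
To make the idea of gradual membership concrete, here is a minimal Python sketch of my own (it is not taken from Bai et al.). It contrasts a classical set with a fuzzy set for the linguistic term "tall". The function names, the 180 cm crisp cutoff, and the 160 cm and 190 cm breakpoints are all assumptions chosen purely for illustration.

```python
def crisp_tall(height_cm: float) -> bool:
    """Classical set: a height is either in the set 'tall' or it is not.

    The 180 cm cutoff is an arbitrary, illustrative threshold.
    """
    return height_cm >= 180.0


def fuzzy_tall(height_cm: float, low: float = 160.0, high: float = 190.0) -> float:
    """Fuzzy set: degree of membership in 'tall', between 0.0 and 1.0.

    Membership is 0 at or below `low`, 1 at or above `high`, and rises
    linearly in between. Both breakpoints are hypothetical values.
    """
    if height_cm <= low:
        return 0.0
    if height_cm >= high:
        return 1.0
    return (height_cm - low) / (high - low)


if __name__ == "__main__":
    # Compare the two views of 'tall' for a few sample heights.
    for h in (150.0, 175.0, 179.9, 180.0, 195.0):
        print(f"{h:5.1f} cm  crisp={crisp_tall(h)!s:5}  fuzzy={fuzzy_tall(h):.2f}")
```

With the crisp set, 179.9 cm and 180.0 cm fall on opposite sides of a hard boundary, while the fuzzy set gives them nearly the same degree of membership, which is closer to how people actually use the word "tall".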


References:
  1. Ying Bai, Hanqi Zhuang and Dali Wang, Advanced Fuzzy Logic Technologies in Industrial Applications, Springer, 2006.
  2. Osmar R. Zaïane, Simeon J. Simoff and Chabane Djeraba, Mining Multimedia and Complex Data, Lecture Notes in Artificial Intelligence, Volume 2797, Springer-Verlag, 2003, ISBN 3-540-20305-2.
  3. Bruce Moxon, Defining Data Mining, DBMS Data Warehouse Supplement, August 1996.

