Sunday, February 24, 2013

Analysis of KEEL as Data Mining Software




In this post, I analyze KEEL, one of the commonly used software tools in the data mining area. KEEL is written in Java, in contrast to Orange, which is written in C++ and Python. KEEL was developed by the University of Granada with support from the Spain International Science Projects Organization. It is used not only as data mining software but also for data mining education demos.
KEEL supports hybrid algorithms based on fuzzy logic, genetic algorithms, and fuzzy neural networks, in addition to common data mining algorithms such as clustering.
Data Structures
KEEL can import data from a variety of formats, including CSV, TXT, PRN, XLS, DIF, XML, and HTML. It can also import data from SQL databases and from other data mining software such as WEKA.
Preprocessing
KEEL provides many discretization and feature selection algorithms, and it also supports data transformation tools such as MinMax, Z-score, and Decimal Scaling. It also includes tools for handling data sets that contain missing values.
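To make the three transformations concrete, here is a minimal Python sketch of how MinMax, Z-score, and Decimal Scaling are usually defined. This is not KEEL's code (KEEL is written in Java, and its internals are not shown in this post); the function names and the choice of population standard deviation are my own assumptions for illustration.

```python
import math

def min_max(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly into [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so map everything to new_min.
        return [new_min for _ in values]
    return [new_min + (v - lo) / (hi - lo) * (new_max - new_min) for v in values]

def z_score(values):
    """Center on the mean and divide by the population standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

def decimal_scaling(values):
    """Divide by the smallest power of 10 that brings every |value| below 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / (10 ** j) >= 1:
        j += 1
    return [v / (10 ** j) for v in values]

if __name__ == "__main__":
    sample = [-991, 48, 986]
    print(min_max(sample))
    print(z_score(sample))
    print(decimal_scaling(sample))
```

MinMax keeps the shape of the distribution but is sensitive to outliers, Z-score is the usual choice when the attribute has no natural bounds, and Decimal Scaling only moves the decimal point.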

                                            
Figure 1 - Data preprocessing panel on KEEL
Figure 2 - Data mining algorithms on KEEL

Data Mining Algorithms  
KEEL does not offer a large set of conventional clustering and classification algorithms. However, it provides a variety of fuzzy-logic-based classification algorithms and rule-based clustering algorithms.
Data Stream Design
KEEL uses a click-to-add method for placing data objects instead of drag-and-drop. As shown in Figure 2, the user first chooses a method and then adds it to the design canvas. After that, the user completes the data stream by connecting two objects with the arrow tool on the left panel shown in Figure 2. Figure 3 shows how to create a data stream for the K-nearest Neighbor algorithm using KEEL.
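For readers who have not met the K-nearest Neighbor algorithm used in that data stream, here is a minimal Python sketch of the idea: find the k training points closest to a query and take a majority vote over their labels. This is only an illustration and not KEEL's implementation; the function name, the Euclidean distance, and the toy data are assumptions of mine.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training label with its Euclidean distance to the query.
    distances = [(math.dist(x, query), label) for x, label in zip(train_x, train_y)]
    # Sort by distance only, then keep the k closest neighbors.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

if __name__ == "__main__":
    # Hypothetical toy data: two well-separated groups of 2-D points.
    train_x = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
    train_y = ["a", "a", "a", "b", "b", "b"]
    print(knn_predict(train_x, train_y, (2, 2), k=3))
    print(knn_predict(train_x, train_y, (9, 9), k=3))
```

Because the vote depends on raw distances, attributes with large ranges dominate the result, which is one reason the normalization tools in the preprocessing panel matter for this algorithm.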

           
Figure 3 - Canvas structure on KEEL
Figure 4 - Reports on KEEL

Visualization
KEEL lacks visualization support for what is arguably the most important phase of decision making. It presents results to users only in tabular form.

1 comment:

  1. Erhan,

    I have not heard about KEEL before. Thank you for pointing it out to us.

    It would be interesting if someone ran software comparisons on toy problems to see how each package performs, i.e. how much time each takes to solve several data mining problems.

    Additional references for everybody:
    http://www.keel.es/
    http://sci2s.ugr.es/keel/datasets.php

    Fadel
