In this post, I analyze KEEL, one of the common software tools used in data
mining. KEEL is written in Java, in contrast to Orange, which is written in
C++ and Python. It was developed at the University of Granada with support from
the Spain International Science Projects Organization. KEEL is used not only as
data mining software but also for data mining education demos.
In addition to common data mining algorithms such as clustering, KEEL provides
hybrid algorithms based on fuzzy logic, genetic algorithms, and fuzzy neural
networks.
Data Structures
KEEL can import data in various formats, such as CSV, TXT, PRN, XLS, DIF, XML,
and HTML. It can also import data from SQL databases and from other data mining
software such as WEKA.
Preprocessing
KEEL offers many data discretization and feature selection algorithms, and it
also supports data transformation tools such as MinMax, Z-score, and Decimal
Scaling. It also includes additional tools for data sets that contain missing
values.
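To make the three transformations named above concrete, here is a minimal Python sketch of the standard textbook definitions. It is not KEEL's own code: the function names and the handling of constant columns are assumptions made for illustration only.

```python
from statistics import mean, pstdev


def min_max(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly into [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column maps to new_min.
        return [new_min for _ in values]
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (v - lo) * scale for v in values]


def z_score(values):
    """Center on the mean and divide by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Assumption: a constant column maps to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def decimal_scaling(values):
    """Divide by 10**j, with j the smallest integer so that max(|v|) < 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs >= 10 ** j:
        j += 1
    return [v / 10 ** j for v in values]
```

For example, `decimal_scaling([-986, 120, 45])` divides every value by 1000, because 986 is the largest absolute value and 1000 is the smallest power of ten above it.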
Figure 1 - Data preprocessing panel on KEEL
Figure 2 - Data mining algorithms on KEEL
Data Mining Algorithms
KEEL does not have a large set of conventional clustering and classification
algorithms. However, it offers various fuzzy intelligence based classification
algorithms and rule based clustering algorithms.
Data Stream Design
KEEL uses a click-to-add method for placing data objects instead of
drag-and-drop. As shown in Figure 2, the user first chooses a method and then
adds it to the design canvas. After that, the user completes the data stream by
connecting two objects with the arrow symbol located on the left panel shown in
Figure 2. Figure 3 shows how to create a data stream for the K-nearest Neighbor
algorithm in KEEL.
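For readers unfamiliar with the method at the end of that data stream, the following is a minimal Python sketch of a generic K-nearest Neighbor classifier. It is not KEEL's implementation: the function name, the use of Euclidean distance, and the majority vote are assumptions chosen for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label for `query` from its k nearest training rows.

    `train` is a list of (features, label) pairs, where `features` is a
    tuple of numbers with the same length as `query`.
    """
    # Sort training rows by Euclidean distance to the query, keep the k closest.
    neighbours = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    # Majority vote; on a tie, the label seen first among the neighbours wins.
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]
```

Because the prediction depends on raw distances, features on very different scales are usually normalized first, which is one reason a preprocessing object often sits before the classifier in such a stream.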
Figure 3 - Canvas structure on KEEL
Figure 4 - Reports on KEEL
Visualization
KEEL lacks visualization support for the most important phase, the decision
making phase. It presents results to users only in a tabular structure.
Erhan,
I have not heard about KEEL before. Thank you for pointing it out to us.
It would be interesting if someone started performing software comparisons using toy problems to see their performance, i.e. how much time they take to solve several data mining problems.
Additional references for everybody:
http://www.keel.es/
http://sci2s.ugr.es/keel/datasets.php
Fadel