Importance
Now that we know a little bit about how to mine data, we
need to know how to interpret it. Our course name is “Analytics and
Visualization of Big Data”. We have covered a little of the analysis side,
so it is important that we start thinking about how to visualize big data,
especially with our visualization project coming up!
There are many products out there that help with
analyzing data visually. Many people claim to be “visual
learners”, and that makes sense: our brains are naturally good at visual
analysis. One study even estimates that the optic nerve sends data to our
brains at about 9Mb/sec. Visualizing data is key to gaining knowledge
quickly and efficiently. Presenting data in a way that
is appealing to the eye not only helps the analyst understand the data they
are mining, but also helps other people (who may not know the data as well
as we do) get the big picture.
I will post a few screenshots showing how to visualize one
data set in two different ways using the Orange data mining software. Maybe this
quick tutorial will help you analyze some data of your own!
I know that Patrick showed us in class how to use Orange,
but this tutorial mainly goes to show that the same data can be viewed in a
different light depending on the visualization tool that you choose. (I
will even use the same animal data set that we used in class.)
File:
--First you need to load a file for Orange to read
information from. From the Data tab, drag and drop the ‘File’ icon into the
window. Double-click the ‘File’ icon to search for
a file. The file used here will be called ‘zoo.tab.tab’.
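If you are curious what Orange is reading, here is a minimal sketch in plain Python (not Orange's own loader). It assumes the file follows the tab-delimited layout with three header rows (attribute names, then types, then flags such as class or meta). The sample rows and the read_tab helper are hand-typed for illustration and are not the contents of the real file.

```python
import csv
import io

# Hand-typed sample in the assumed three-header-row layout:
# row 1 = attribute names, row 2 = types, row 3 = flags.
# Illustrative only; these are not the rows of the real file.
SAMPLE = (
    "name\tfins\teggs\ttype\n"
    "string\td\td\td\n"
    "meta\t\t\tclass\n"
    "bass\t1\t1\tfish\n"
    "dolphin\t1\t0\tmammal\n"
    "chicken\t0\t1\tbird\n"
    "bear\t0\t0\tmammal\n"
)


def read_tab(text):
    """Hypothetical helper: return (columns, rows) from tab-delimited text."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    names = next(reader)   # first header row: attribute names
    types = next(reader)   # second header row: attribute types
    flags = next(reader)   # third header row: class / meta markers
    columns = [
        {"name": n, "type": t, "flag": f}
        for n, t, f in zip(names, types, flags)
    ]
    # Every remaining non-empty line is one animal.
    rows = [dict(zip(names, values)) for values in reader if values]
    return columns, rows


columns, rows = read_tab(SAMPLE)
class_column = next(c["name"] for c in columns if c["flag"] == "class")
print("class attribute:", class_column)
print("rows loaded:", len(rows))
```

To try it on a file of your own, you could pass the file's text to read_tab in place of SAMPLE, as long as the file really has those three header rows.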
Attribute Statistics:
--Next, go to the Visualize tab. The ‘Attribute Statistics’
icon will allow you to see basic statistics about the data. Drag and drop this
icon into the window and draw a line between the ‘File’ icon and the ‘Attribute
Statistics’ icon. Double-click the icon to see the attribute statistics.
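To get a feel for the kind of summary this gives you, here is a rough sketch that counts how often each value shows up for each attribute. It is only my approximation in plain Python, not what Orange runs internally, and the rows and the attribute_statistics helper are made up for illustration.

```python
from collections import Counter

# Illustrative rows only (hand-typed, not read from the real file).
rows = [
    {"fins": "1", "eggs": "1", "type": "fish"},
    {"fins": "1", "eggs": "0", "type": "mammal"},
    {"fins": "0", "eggs": "1", "type": "bird"},
    {"fins": "0", "eggs": "0", "type": "mammal"},
]


def attribute_statistics(rows):
    """Hypothetical helper: one Counter of value frequencies per attribute."""
    stats = {}
    for row in rows:
        for attribute, value in row.items():
            stats.setdefault(attribute, Counter())[value] += 1
    return stats


stats = attribute_statistics(rows)
for attribute, counts in stats.items():
    print(attribute, dict(counts))
```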
Scatterplot:
--A scatterplot would be my first choice for visualizing
data because it is something that I am familiar with. Drag and drop the
‘Scatterplot’ icon and draw a line between ‘File’ and ‘Scatterplot’. Double-click
the icon to see the plot. You can choose which attributes go on
the x-axis and y-axis. In this case, I chose to observe fins vs. eggs.
Mosaic Display:
--Another visualization tool is a mosaic display. I didn’t
really know how this worked before playing around with Orange, but it’s pretty
cool. Drag, drop, and connect the ‘Mosaic Display’ icon to the ‘File’ icon just
as before. Again, I want to compare fins vs. eggs. The chart might look
overwhelming at first, but if you roll your mouse over a certain area, it shows
details of what the color blocks represent.
Scatterplot vs. Mosaic Display:
I had both of these visualization tools show
the same attributes. In both, the colors represent the type of
animal, so when you compare the two, the colors appear in similar
areas. However, I find that the Mosaic Display shows the information much more
clearly than the Scatterplot does. So many dots land in the same spot on
the scatterplot that it is hard to understand what they all mean. I encourage
you to play around with these tools to see what works best for your data set.
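Here is a small sketch of why I think the dots pile up. It assumes fins and eggs are both recorded as 0/1 values, so there are only four places a dot can land, while a mosaic-style view counts the rows in each combination and splits them by animal type. The rows below are hand-typed for illustration and are not the real data set.

```python
from collections import Counter, defaultdict

# Illustrative (fins, eggs, type) rows, hand-typed; not the real data set.
rows = [
    (1, 1, "fish"),
    (1, 1, "fish"),
    (1, 0, "mammal"),
    (0, 1, "bird"),
    (0, 1, "bird"),
    (0, 1, "insect"),
    (0, 0, "mammal"),
]

# Scatterplot view: each row becomes a dot at (fins, eggs), so rows that
# share both values are drawn on top of each other.
distinct_positions = {(fins, eggs) for fins, eggs, _ in rows}

# Mosaic-style view: each (fins, eggs) combination gets a tile sized by its
# row count, and the tile is split by animal type.
tiles = defaultdict(Counter)
for fins, eggs, animal_type in rows:
    tiles[(fins, eggs)][animal_type] += 1

print(len(rows), "rows land on", len(distinct_positions), "scatterplot positions")
for cell, types in sorted(tiles.items()):
    share = sum(types.values()) / len(rows)
    print(cell, f"{share:.0%} of rows", dict(types))
```

The counts per tile are the information that overlapping dots hide, which matches what I saw when comparing the two displays.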
These are just a couple of ways to visualize the same data.
There are many more visualization options within the Orange program, not to
mention the many online tools that can be used. I would love for
someone to do a tutorial on another program that I could learn to use! It is so
important that we all understand the different visualization methods!
Brianna,
Thank you for putting this together. Very helpful to your colleagues and to people around the world, I am sure.
War Eagle!!
Fadel