Friday, February 22, 2013

Visualization Importance and How-To in Orange


Importance

Now that we know a little bit about how to mine data, we need to know how to interpret it. Our course name is “Analytics and Visualization of Big Data”. We have covered some of the analysis side, so it is important that we start thinking about how to visualize big data too, especially with our visualization project coming up!

There are so many products out there that assist in analyzing data visually. Many people claim to be “visual learners”, and that makes sense: our brains are naturally good at visual analysis. One study even estimates that the optic nerve, which sends data to our brains, operates at about 9 Mb/sec. Visualizing data is key to gaining knowledge quickly and efficiently. Presenting data in a way that is appealing to the eye not only helps the analyst understand the data they are mining, but also helps other people (who may not be as familiar with data mining as we are) get the big picture.



How-To in Orange

I will post a few screenshots showing how to visualize one data set in two different ways using the Orange data mining software. Maybe this quick tutorial will help you analyze some data of your own!

I know that Patrick showed us in class how to use Orange, but this tutorial mainly goes to show that the same data can be viewed in a different light depending on the visualization tool that you choose to use. (I will even use the same animal data set that we did in class.)


File:
--First, you need to load a file for Orange to read information from. From the Data tab, drag and drop the ‘File’ icon onto the window. Double click the ‘File’ icon to browse for a file. The file used here is called ‘zoo.tab.tab’.
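
If you are curious what a file like this looks like outside of the program, here is a minimal Python sketch of reading a tab-delimited table. It assumes the layout I understand Orange's .tab files to use: three header rows (attribute names, then types, then flags such as ‘class’ or ‘meta’) followed by one row per animal. The four rows below are hand-typed toy data in the spirit of the zoo set, not the real file, and `read_tab` is a hypothetical helper name, not part of Orange.

```python
import csv
import io

# Hand-typed toy sample in an assumed three-header-row .tab layout
# (names, types, flags). This is NOT the real zoo file.
SAMPLE = (
    "name\tfins\teggs\ttype\n"
    "string\td\td\td\n"
    "meta\t\t\tclass\n"
    "bass\t1\t1\tfish\n"
    "dolphin\t1\t0\tmammal\n"
    "chicken\t0\t1\tbird\n"
    "aardvark\t0\t0\tmammal\n"
)


def read_tab(handle):
    """Return (names, types, flags, rows) from a tab-delimited table."""
    reader = csv.reader(handle, delimiter="\t")
    names = next(reader)   # first header row: attribute names
    types = next(reader)   # second header row: attribute types
    flags = next(reader)   # third header row: 'class', 'meta', or empty
    rows = [dict(zip(names, row)) for row in reader if row]
    return names, types, flags, rows


names, types, flags, rows = read_tab(io.StringIO(SAMPLE))
print("attributes:", names)
print(len(rows), "rows; class column:", names[flags.index("class")])
```

To try it on a real file, you could pass `open(path, newline="")` instead of the `io.StringIO` sample, assuming your file follows the same three-row header.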




Attribute Statistics:
--Next, go to the Visualize tab. The ‘Attribute Statistics’ icon lets you see basic statistics about the data. Drag and drop this icon onto the window and draw a line between the ‘File’ icon and the ‘Attribute Statistics’ icon. Double click the icon to see the attribute statistics.
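
For discrete attributes like the ones in the animal data, "basic statistics" mostly means how often each value occurs. Here is a rough Python sketch of that idea. It is not how Orange computes its display; the rows are the same kind of hand-typed toy data, and `attribute_counts` is a name I made up.

```python
from collections import Counter

# Hand-typed toy rows for illustration only (not the real data set).
rows = [
    {"name": "bass", "fins": "1", "eggs": "1", "type": "fish"},
    {"name": "dolphin", "fins": "1", "eggs": "0", "type": "mammal"},
    {"name": "chicken", "fins": "0", "eggs": "1", "type": "bird"},
    {"name": "aardvark", "fins": "0", "eggs": "0", "type": "mammal"},
]


def attribute_counts(rows, attribute):
    """Count how many rows take each value of one attribute."""
    return Counter(row[attribute] for row in rows)


for attribute in ("fins", "eggs", "type"):
    counts = attribute_counts(rows, attribute)
    total = sum(counts.values())
    summary = ", ".join(
        f"{value}: {n}/{total}" for value, n in sorted(counts.items())
    )
    print(f"{attribute:5s} {summary}")
```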




Scatterplot:
--A scatterplot would be my first choice for visualizing data because it is something that I am familiar with. Drag and drop the ‘Scatterplot’ icon and draw a line between ‘File’ and ‘Scatterplot’. Double click the icon to see the plot. You can choose which attributes go on the x-axis and y-axis. In this case, I chose to observe fins vs. eggs.




Mosaic Display:
--Another visualization tool is a mosaic display. I didn’t really know how this worked before playing around with Orange, but it’s pretty cool. Drag, drop, and connect the ‘Mosaic Display’ icon to the ‘File’ icon just as before. Again, I want to compare fins vs. eggs. The chart might look overwhelming at first, but if you roll your mouse over a certain area, it shows details of what the color blocks are representing.
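
If the color blocks still feel mysterious, it may help to see the arithmetic behind a generic two-variable mosaic plot: the plot is split into columns whose widths follow the share of each value of the first attribute, and each column is split into tiles whose heights follow the share of the second attribute within that column. The Python sketch below computes those sizes for fins vs. eggs. I am assuming Orange's Mosaic Display follows this general idea; the animals are hand-typed toy data and `mosaic_tiles` is a hypothetical helper.

```python
from collections import Counter

# Hand-typed toy animals: (name, fins, eggs, type). Illustration only.
animals = [
    ("bass", "1", "1", "fish"),
    ("dolphin", "1", "0", "mammal"),
    ("chicken", "0", "1", "bird"),
    ("crow", "0", "1", "bird"),
    ("aardvark", "0", "0", "mammal"),
]


def mosaic_tiles(pairs):
    """Map each (x, y) pair to the (width, height) of its mosaic tile."""
    total = len(pairs)
    x_counts = Counter(x for x, _ in pairs)
    xy_counts = Counter(pairs)
    tiles = {}
    for (x, y), n in xy_counts.items():
        width = x_counts[x] / total   # share of rows with this x value
        height = n / x_counts[x]      # share of y within that column
        tiles[(x, y)] = (width, height)
    return tiles


pairs = [(fins, eggs) for _, fins, eggs, _ in animals]
tiles = mosaic_tiles(pairs)
for (fins, eggs), (w, h) in sorted(tiles.items()):
    print(f"fins={fins} eggs={eggs}: width={w:.2f} height={h:.2f} area={w * h:.2f}")
```

The area of each tile (width times height) equals the fraction of animals in that fins/eggs combination, which is why bigger blocks mean more animals.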



Scatterplot vs. Mosaic Display:
I had both visualization tools represent the same attributes. In both, the colors represent the type of animal, so when you compare the two, the colors show up in similar areas. However, I find that the Mosaic Display shows the information much more clearly than the Scatterplot does. So many dots land in the same spot on the scatterplot that it is hard to understand what they all mean. I encourage you to play around with these tools to see what works best for your data set.
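
One way to see why the dots pile up: fins and eggs each take only two values, so there are at most four distinct positions for the points to land on. The short Python sketch below counts how many animals share each position and which types are stacked there. Again, the animals are hand-typed toy data, not the real data set.

```python
from collections import Counter, defaultdict

# Hand-typed toy animals: (name, fins, eggs, type). Illustration only.
animals = [
    ("bass", "1", "1", "fish"),
    ("dolphin", "1", "0", "mammal"),
    ("chicken", "0", "1", "bird"),
    ("crow", "0", "1", "bird"),
    ("aardvark", "0", "0", "mammal"),
]

# Number of points drawn at each (fins, eggs) coordinate.
positions = Counter((fins, eggs) for _, fins, eggs, _ in animals)

# Which animal types are stacked on top of each other at each coordinate.
types_at = defaultdict(Counter)
for _, fins, eggs, kind in animals:
    types_at[(fins, eggs)][kind] += 1

for position, n in sorted(positions.items()):
    print(position, "->", n, "point(s):", dict(types_at[position]))
```

With a full data set, every one of those positions holds many overlapping dots, which is exactly what a mosaic display turns into visible block sizes.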




These are just a couple of ways to visualize the same data. Orange offers many more visualization options, and there are plenty of online tools as well. I would love for someone to do a tutorial on another program that I could learn to use! It is so important that we all understand the different visualization methods!



1 comment:

  1. Brianna,

    Thank you for putting this together. Very helpful to your colleagues and to people around the world I am sure.

    War Eagle!!
    Fadel
