We have heard about how Target and other companies use Big Data
analytics to pinpoint their customers’ interests. Web services used worldwide have
to be sure to provide quick results, or else the customers will bail on the
site. That is exactly the motivation of one company as they expand in the Big
Data community.
A large German travel company needed help providing quick answers. By
quick, the agency meant answering a customer in a second or less, because that
is how long they figured a customer would wait before going to another site.
Intuitively, as more time passes, more customers are lost. When the company
started looking for help with their site, the fastest response they could manage
was 6.5 seconds on 200 million records. At 6.5 seconds, the company was
operating far too slowly, not to mention on too many machines. Big Data
solutions such as Hadoop, columnar database technology, Oracle, and FAST from
Microsoft were not cutting it. Hadoop, which we keep hearing about as the latest
and greatest, wasn’t doing the job!
The travel company decided to build their own method of data
processing because they couldn’t afford all of the machines needed to run the
other systems. The new method started with data structures, algorithms,
indexing, and continuous loading of new data. The company that came to
represent this product is ParStream. With ParStream’s advancements, the travel
company is now able to handle 1,000 queries per second and rummage through 18
billion offers with 20 parameters, all while responding in less than the
desired one second. This is achieved by combining CPUs with Nvidia’s Fermi GPU
processors. ParStream technology allows for the same amount of processing with
a fraction of the machines.
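To put the quoted figures side by side, here is a rough back-of-the-envelope sketch in Python. It only uses the numbers mentioned above (6.5 seconds on 200 million records before, 18 billion offers in about one second after). It assumes, purely for illustration, that each query covers the whole data set, which an indexed system would normally avoid, so treat the result as "data covered per second of response time" and not as a measured benchmark.

```python
# Figures quoted in the post; the comparison itself is only illustrative.
OLD_RECORDS = 200_000_000      # records searched by the old setup
OLD_SECONDS = 6.5              # best response time before ParStream
NEW_OFFERS = 18_000_000_000    # offers searched with ParStream
NEW_SECONDS = 1.0              # response-time target of one second


def records_per_second(records: int, seconds: float) -> float:
    """Data covered per second of response time (assumes a full pass)."""
    return records / seconds


old_rate = records_per_second(OLD_RECORDS, OLD_SECONDS)
new_rate = records_per_second(NEW_OFFERS, NEW_SECONDS)
speedup = new_rate / old_rate

print(f"old: {old_rate:,.0f} records per second")
print(f"new: {new_rate:,.0f} records per second")
print(f"ratio: about {speedup:.0f}x")
```

Under that assumption, the new setup covers several hundred times more data per second of response time, and that is before counting the 1,000 queries per second it is said to handle.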
Michael Hummel, who has been involved as a manager along the way,
stated, “Nobody wants to wait for results. Most
people think big data is billions of records, but static. That is completely
wrong. Big data is dynamic. New data is created every second and you have to
take this new data and process it together with historical data.”
Hummel, like most consumers, believes that faster is better. Something
that can keep up with real-time changes in data is especially innovative. That
is what ParStream is able to do. Changes happen constantly, especially on the
internet, and this technology is able to keep up with them, unlike
MapReduce technology.
MapReduce has been a popular term since Google brought it to prominence in 2005, but now even Google is saying that MapReduce isn’t quite up to par. In fact, Google is now using Caffeine and Dremel for Big Data analytics. Ironic, right? This makes for an interesting perspective on Hadoop. People are obviously still using MapReduce and Hadoop, but from the sound of it, ParStream has a leg up on them. ParStream claims “ParStream has already been used to replace Hadoop clusters, with a better efficiency ratio of 10 to 20 (less nodes) and a better effectiveness ratio of the same order of magnitude (in query response time).” While ParStream would not be able to replace all the functions of MapReduce, it is a Big Data analytics tool to keep our ears open for. Will “ParStream” become a household term, or will something else in the Big Data world be bigger?
***Disregard that this is an ad for AT&T and that these commercials are on TV all the time, but take it from this girl that faster is better.
Brianna,
As you know, this field has a very short memory, and if you get out-innovated, you are done. There will continue to be advancements in big data analytics, and hopefully we can make not only faster decisions but also smarter ones using the existing platforms.
Big data is definitely not only about the volume and the velocity of data; there are other dimensions that make such problems more challenging (see http://www.sas.com/big-data/ for a nice introductory discussion on these dimensions).
Fadel