Now that Hadoop has revolutionized the way web services analyze data, some
companies are setting their sights on making it perform more like traditional
database software. Hadoop excels at handling large volumes of data, including
less relational data that database software struggles with, but it takes
significantly longer to query that data in any meaningful way. A
straightforward query might take a few minutes in Hadoop, whereas in a
relational database it would take only a few seconds. To
remedy this, multiple startups are bringing Structured Query Language (SQL)
to Hadoop, with significantly shorter processing times as a major selling
point. One major company attacking this problem is Greenplum.
Greenplum recently announced its own version of Hadoop, Pivotal HD, which
improves performance and accepts SQL queries. Database giant Oracle also
sells its own version of Hadoop. Another player is Cloudera, a startup founded
by the former Facebook employee who chose Hadoop as Facebook’s primary data
management and analysis system. Cloudera’s implementation of SQL queries within
Hadoop is significantly faster than Hadoop itself, and Greenplum’s is faster
still, but both have some weaknesses compared with Hadoop. In Pivotal HD,
if a machine fails during a query, the query is stopped and must be restarted
from the beginning, a problem that Hadoop does not share. This could cause
trouble on larger networks, where machine failures are more frequent. Still,
these improvements showcase a major benefit of Hadoop’s open source
architecture: they would not be possible with a closed source system.
Source: http://www.wired.com/wiredenterprise/2013/02/pivotal-hd-greenplum-emc/