By the end of 2012, companies had spent $4.3 billion on
Big Data technologies. A TechCrunch article
estimates that $232 billion will be spent on these technologies over the next
five years.
There are over 250,000 viable open source solutions available.
TechCrunch did the research and testing and has presented the newest class
of tools that people should not overlook.
Storm and Kafka
Storm is a “distributed real-time computation system.” It
does for real-time processing what Hadoop did for batch processing. Kafka
serves as the foundation for an activity stream and the data processing pipeline
behind it. Paired together, the two deliver that stream in real time and scale linearly.
Benefits of pairing the two:
- Handles velocities of tens of thousands of messages per second
- Superior approach to ETL and data integration
- Great in-memory analytics and real-time decision support
Drill and Dremel
Drill is the open source version of what Google has done
with Dremel. Both make large-scale, ad-hoc querying of data possible, much
like Hadoop. Data scientists are
speculating that Drill and Dremel may actually be better than Hadoop in the
wider sense, perhaps even a replacement for it.
R
R is an open source statistical programming language that is
quickly becoming the new standard for statistics. With its strong community
and daily innovation, it has become one of the best places to be in Big Data
currently. Pairing it with Hadoop is a wonderful way to future-proof your Big
Data program.
Gremlin and Giraph
Gremlin and Giraph, paired with graph databases, enable
graph analysis, which offers an alternative to the relational approach.
Gremlin and Giraph are open source alternatives to Google’s Pregel.
The article mentions SAP HANA as well, but upon further investigation,
it isn’t a true open source solution.
Article: http://techcrunch.com/2012/10/27/big-data-right-now-five-trendy-open-source-technologies/
There is no doubt that open source solutions for big data analysis are powerful alternatives to closed source software and can save companies quite a bit of money. However, in certain fields where security is a priority, any software used by employees must undergo rigorous security certification.
Pro-Open Source:
A major benefit of open source software is that the code is available for anyone to examine, so the larger number of eyes on it can lead to potential security issues being exposed earlier. Most software security professionals will tell you that the worst kind of security loophole is one that a hacker knows about but the IT people do not. The increased visibility of the code and the higher number of code editors can potentially decrease the time it takes to patch the software. This considerably narrows the window in which hackers can take advantage of any flaws in the program.
Pro-Closed Source:
Unlike open source software, the code for closed source software is rarely seen by its consumers. This can have both positive and negative effects. On one hand, keeping the code hidden from the public can delay the discovery of potential loopholes. On the other hand, there are fewer “honest” eyes examining the code, which can increase the chance of security issues going unnoticed. To counteract this problem, large companies that produce closed source software often have thorough, documented quality processes as well as the cash to back them up.
Both open and closed source have their pros and cons. Open source code can potentially have many more programmers troubleshooting and improving it, while closed source vendors can pay for the support of highly talented programmers and bring a trusted reputation to security-minded customers.
Source:
http://www.digitalcommunities.com/articles/Is-Open-Source-Software-More-Secure.html?page=1
Alex and Sam,
Brianna posted this article a few days earlier. Can you please highlight the value added by this post? I know you are not necessarily addressing the same software, but the idea is similar. It would be very interesting to see more discussion in this area.
Fadel
Nice and informative blog
big data and bi solutions
Nice and informative blog