In almost all businesses that deal with large volumes of
data, IT departments are starting to grapple with big-data deployments. One issue
that concerns IT more and more is security, especially since
big-data analysis usually requires access to thousands of pieces of personal
information, including Social Security and credit card numbers.
IT already knows that these massive
datasets can cause problems. In fact, 80 percent of Apache Hadoop users want to
know if there is sensitive data stored in their environment, while 77 percent
know it's important to protect sensitive data within a big-data deployment and
control who has access to that data.
These and other findings come from a new survey released earlier this month by
Dataguise, a company that makes security intelligence and other data protection
tools. The survey polled more than 60 enterprise users who attended
either the recent RSA Conference or the O'Reilly Strata Conference.
The big-data security report
specifically focused on Hadoop users and not other types of
big-data environments such as Riak.
IT departments that use Hadoop
should be aware that storing data from several different sources as part of their
big-data analysis can lead to a number of unforeseen problems and
security issues.
Since there are no easy answers,
the best starting point is to stay aware of what your company is collecting.
The challenge for IT is keeping
track of how other departments, such as marketing and sales,
are using big data and what information and datasets they want analyzed as part
of their projects. According to the report, 33 percent of businesses store
sensitive data within their Hadoop environment, including Social Security and
credit card information.
What other types of data are within
these Hadoop environments? About 55 percent of participants reported that their
company is storing log files, while 36 percent store some type of structured
database management system (DBMS) data, and another 24 percent have mixed data
types.
The Dataguise report does offer some
practical, if rather simple, advice for IT departments that deal with
big-data deployments and want to ensure that the data they
use stays private and compliant. Its recommendations include:
- IT managers should be able to locate
and identify sensitive data across different big-data clusters, so that
they can inform management of any potential risk.
- IT should make sure that security tools, including
data masking and data quarantine, remain a priority.
- Finally, IT should make sure its big-data environment
can be centrally managed, with scheduled detection and protection features
deployed throughout the clusters to ensure the environment meets
compliance rules and regulations.
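
To make the first two recommendations more concrete, here is a minimal Python sketch of what locating and masking sensitive values in a line of text might look like. It is not taken from the Dataguise report or from any product: the patterns, the function names (`luhn_valid`, `find_sensitive`, `mask`), and the sample record are all assumptions made for illustration, and a real detection tool would use far richer rules and run across the cluster rather than on one string.

```python
import re

# Illustrative patterns only (assumptions, not a vendor's detection rules).
# US Social Security numbers written as 123-45-6789.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_sensitive(line: str) -> list[tuple[str, str]]:
    """List (kind, matched_text) pairs for SSN-like and card-like values."""
    findings = [("ssn", m.group()) for m in SSN_RE.finditer(line)]
    for m in CARD_RE.finditer(line):
        # The checksum filters out long digit runs that are not card numbers.
        if luhn_valid(m.group()):
            findings.append(("card", m.group()))
    return findings


def mask(line: str) -> str:
    """Mask SSNs entirely and card numbers except their last four digits."""
    def mask_card(m: re.Match) -> str:
        text = m.group()
        if not luhn_valid(text):
            return text  # leave non-card digit runs untouched
        digits = "".join(c for c in text if c.isdigit())
        return "*" * (len(digits) - 4) + digits[-4:]

    line = SSN_RE.sub("***-**-****", line)
    return CARD_RE.sub(mask_card, line)


if __name__ == "__main__":
    # Made-up sample record; 4111 1111 1111 1111 is a well-known test number.
    record = "user=alice ssn=123-45-6789 card=4111 1111 1111 1111"
    print(find_sensitive(record))
    print(mask(record))
```

Checking card candidates against the Luhn checksum is one simple way to cut false positives from long digit runs such as order IDs or timestamps, which are common in log files.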