Wednesday, April 24, 2013

Privacy in the Big Data era

We already have mountains of information in many forms: plain text in social media, spreadsheet-style data about patients, and massive publicly available databases. When this kind of data is used, de-identification has been crucial to prevent individuals from becoming victims of identity theft or being drawn into other types of crime. However, as data processing power improves drastically, re-identifying individuals by analyzing patterns in their behavior is no longer impossible. It seems natural that many people are concerned about the dangers of big data technology.

Here is a paper that presents its authors' thoughts on privacy in the Big Data era.

Big Data: Big Benefits

Google Flu Trends is a good example of the benefit of Big Data. It predicts and locates flu outbreaks by making use of aggregate search queries. Early detection of disease, when followed by rapid response, can reduce the impact of both seasonal and pandemic influenza.

Traffic management and control is a field witnessing significant data-driven environmental innovation. With electronic toll pricing systems, drivers pay according to their use of vehicles and roads. This kind of management and control could also enable governments to cut congestion and pollutant emissions.

Big Data: Big Concerns

However, the harvesting of large data sets and the use of analytics raise privacy concerns. Ensuring data security and protecting privacy become harder as information is multiplied and shared ever more widely around the world. If de-identification becomes a key component of business models, most notably in the contexts of health data, online behavioral advertising, and cloud computing, governments and businesses that depend on it could be in more trouble once re-identification becomes practical.

What data is "Personal"?

It seems that there is no common view even among legal scholars. To quote Betsy Masiello and Alma Whitten:
"anonymized information will always carry some risk of re-identification. Many of the most pressing privacy risks exist only if there is certainty in re-identification, that is if the information can be authenticated. As uncertainty is introduced into the re-identification equation, we cannot know that the information truly corresponds to a particular individual; it becomes more anonymous as larger amounts of uncertainty are introduced."
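
To make the idea of uncertainty in re-identification a bit more concrete, here is a minimal Python sketch of my own. It is not from the paper. The records and the three attributes (ZIP prefix, birth year, sex) are made up for illustration, and the 1/k figure assumes an attacker who knows only those attributes and guesses uniformly among the matching records.

```python
from collections import Counter

# Hypothetical de-identified records: (ZIP prefix, birth year, sex).
# Names are already removed; only these attributes remain.
records = [
    ("941", 1980, "F"),
    ("941", 1980, "F"),
    ("941", 1980, "F"),
    ("941", 1975, "M"),
    ("100", 1990, "M"),
    ("100", 1990, "M"),
]


def reidentification_chance(rows):
    """Map each attribute combination to 1/k, where k is how many rows share it.

    A larger k means more uncertainty about which row belongs to a given
    person, so the chance of a correct guess is lower.
    """
    counts = Counter(rows)
    return {combo: 1 / k for combo, k in counts.items()}


chances = reidentification_chance(records)
for combo, chance in chances.items():
    print(combo, round(chance, 2))
```

In this toy data, the single ("941", 1975, "M") record is unique, so anyone who knows those three facts about a person can pick the record out with certainty. The records that share their attributes with others are harder to pin down, which is the sense in which data "becomes more anonymous" as uncertainty grows.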

The authors did not present a tangible conclusion. Of course, this debate will continue. I think the one obvious thing in this debate is that attempts to harvest private data will persist, and that countermeasures against those attempts will be deployed as well.

Reference: http://www.stanfordlawreview.org/online/privacy-paradox/big-data
