Tuesday, February 19, 2013

Big Data and Cancer


We store the data we generate in everyday life on hard disks, which are bulky and easily damaged. Imagine storing one exabyte of data (1 exabyte equals a thousand petabytes, a million terabytes, or a billion gigabytes). Bioengineers and geneticists at Harvard University's Wyss Institute claim that up to one exabyte of data could be stored in a single gram of DNA, and they have demonstrated that 700 TB of data can be stored in a single gram of DNA. Human DNA, which is made up of three billion base pairs, is itself an ideal source of big data. So what can be done with this humongous amount of data?
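To put the storage figures above in perspective, here is a rough back-of-the-envelope sketch in Python. It uses decimal (SI) units, and it assumes each base pair is encoded in 2 bits (one of A, C, G, T). That encoding is an illustrative assumption, not a figure from the Wyss Institute work.

```python
# Decimal (SI) storage units, expressed in bytes.
GB = 10**9
TB = 10**12
PB = 10**15
EB = 10**18

# Assumption for illustration: 4 possible bases -> 2 bits per base pair.
BASE_PAIRS_IN_HUMAN_GENOME = 3_000_000_000
BITS_PER_BASE_PAIR = 2


def exabyte_in(unit_bytes: int) -> int:
    """Return how many of the given unit fit into one exabyte."""
    return EB // unit_bytes


def genome_bytes(base_pairs: int = BASE_PAIRS_IN_HUMAN_GENOME,
                 bits_per_base_pair: int = BITS_PER_BASE_PAIR) -> int:
    """Raw size in bytes of one genome under the 2-bit assumption."""
    return base_pairs * bits_per_base_pair // 8


print(exabyte_in(PB))       # 1000
print(exabyte_in(TB))       # 1000000
print(exabyte_in(GB))       # 1000000000
print(genome_bytes() / GB)  # 0.75
```

Under that assumption a single raw genome is well under a gigabyte; the volume comes from sequencing many patients, many times, together with all the analysis data around each sequence.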
Cancer is a disease for which a cure has yet to be found. It does not follow the same pattern in every individual; because it interacts with each person's genetic code in unexpected ways, it requires personalized treatment. One way to approach treatment is therefore to collect cancer genome data from patients at multiple stages of cancer development, document it, and identify which genes are responsible for which type of cancer. But the data obtained from a million cancer patients alone would occupy as much storage as all of YouTube’s videos. Researchers claim that with this kind of data, certain types of cancer in their acute stage can be converted into a chronic disease. The data can help pharmaceutical companies develop drugs for a specific genetic profile more quickly and at lower cost, and labs can use it to get a deeper and better analysis of a cancer with a limited number of tests. One of the challenges in handling genetic data is privacy: for this data to be shared among researchers, doctors and pharmacologists, the rules governing it have to be strict.
MediSapiens, a Finnish company, hosts the world’s largest unified gene expression database and enables oncologists to cross-reference 19,000 genes across 20,000 patients. Analyzing genetic data at this scale requires a stable structure and organization. Apache Hadoop’s support for parallelization offers good composability, and genomic problems can be mapped onto its MapReduce model.
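To give a feel for how a genomic question can be mapped onto MapReduce, here is a minimal sketch in plain Python. It is not Hadoop code and does not use any Hadoop API; it only imitates the map, shuffle and reduce steps in a single process. The patient records are made-up toy data, and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Made-up toy input: patient id -> genes flagged as mutated in that patient.
PATIENTS: Dict[str, List[str]] = {
    "patient_1": ["TP53", "KRAS"],
    "patient_2": ["TP53", "EGFR"],
    "patient_3": ["KRAS", "TP53"],
}


def map_phase(patient_id: str, genes: List[str]) -> Iterator[Tuple[str, int]]:
    """Map step: emit (gene, 1) for every mutated gene in one record."""
    for gene in genes:
        yield gene, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by their key (the gene)."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(gene: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce step: sum the counts collected for one gene."""
    return gene, sum(counts)


def count_mutations(patients: Dict[str, List[str]]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence over all patient records."""
    mapped = (pair
              for patient_id, genes in patients.items()
              for pair in map_phase(patient_id, genes))
    grouped = shuffle(mapped)
    return dict(reduce_phase(gene, counts) for gene, counts in grouped.items())


print(count_mutations(PATIENTS))  # {'TP53': 3, 'KRAS': 2, 'EGFR': 1}
```

Each patient record is handled independently in the map step, which is why a framework like Hadoop can spread the records across many machines and only bring the results together at the reduce step.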
Many companies have come forward to support the fight against cancer. Verizon has partnered with NantWorks, a company which specializes in managing and analyzing genomic and cancer-related data. NantWorks will be using Verizon’s 4G LTE (Long-Term Evolution) network and cloud infrastructure for its research. In 2011 Dell donated its cloud infrastructure to support the Translational Genomics Research Institute’s pediatric cancer research. IBM’s supercomputer Watson is capable of searching through 60,000 pieces of medical evidence, 2 million pages of text from 42 medical journals and clinical trials in oncology research, and 1.5 million patient records to recommend an appropriate treatment to healthcare providers. At Memorial Sloan-Kettering Cancer Center, Watson is being taught how to process, analyze and interpret clinical information to improve the treatment of lung cancer. In the near future, with the help of big data and cloud computing, cancer can be turned into a chronic disease.
