Sunday, March 31, 2013

Using Data Mining Techniques to Predict the Survival Rate For Heart Transplants

Last week I posted a blog entry in which I introduced my research problem and gave some statistics on currently donated hearts and the gap in supply.
This week I will continue to share my research with my classmates.
As I mentioned, we focus on predicting how many years a specific person can live with a donated heart.
To solve this problem, the first and most important step is to determine the factors (variables) that affect the outcome.

Conventionally, researchers have worked with small datasets and conventional statistical techniques that do not take collinearity and nonlinearity into account, as discussed in the previous post. They also use some non-parametric and non-statistical techniques that are computationally expensive and require prior knowledge about the data.

The biggest advantage of today's world is the flood of big data in health informatics, which can be handled with data mining techniques. These techniques can yield better and more accurate predictions of the survival of organ transplant recipients than the conventional methods used in previous studies.
We started the research by obtaining a very large dataset from UNOS, a tax-exempt medical, scientific, and educational organization that operates the national Organ Procurement and Transplantation Network. The dataset has 443 variables and 43,000 cases, all belonging to heart transplant operations. These variables include socio-demographic and health-related factors of both the donors and the recipients. The dataset also contains procedure-related factors.

After preprocessing the data (cleaning, dealing with missing values, reorganizing the data for the specific studies, etc.), we used variable selection methods to determine the potential predictive factors.
We then tested whether these candidate factors are actually predictive by using data mining algorithms such as Support Vector Machines, Decision Trees, and Artificial Neural Networks.
After cross-tabulation and sensitivity analysis, we observed that all three methods gave quite satisfactory results.
For the 3-year survival study, Support Vector Machines gave the best prediction rate, predicting 94.43% of the cases correctly, while Artificial Neural Networks predicted 81.18% and Decision Trees 77.65% correctly.
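
To make the cross-tabulation step a little more concrete, here is a minimal Python sketch of how predicted outcomes can be tabulated against actual outcomes and turned into a prediction rate. The ten labels are made up for illustration (they are not UNOS data), and the function names `cross_tabulate` and `accuracy` are my own, not part of the tools used in the study.

```python
from collections import Counter


def cross_tabulate(actual, predicted):
    """Count each (actual, predicted) pair, i.e. build a confusion matrix."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return Counter(zip(actual, predicted))


def accuracy(table):
    """Share of cases where the predicted label equals the actual label."""
    total = sum(table.values())
    if total == 0:
        raise ValueError("cannot compute accuracy of an empty table")
    correct = sum(n for (a, p), n in table.items() if a == p)
    return correct / total


# Hypothetical outcomes for ten recipients (illustration only, not UNOS data).
actual = ["survived", "survived", "died", "survived", "died",
          "survived", "died", "survived", "survived", "died"]
predicted = ["survived", "survived", "died", "died", "died",
             "survived", "survived", "survived", "survived", "died"]

table = cross_tabulate(actual, predicted)
for (a, p), n in sorted(table.items()):
    print(f"actual={a:9s} predicted={p:9s} count={n}")
print(f"accuracy = {accuracy(table):.2%}")
```

The off-diagonal cells (actual and predicted differ) show which kind of mistake a model makes, which a single accuracy number hides.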

What do these results mean?
For the Support Vector Machine, the accuracy rate is 94.43%, which means that when it told us whether a specific person would live or die after receiving the donated organ, it was right in 94.43% of the cases. It was wrong in the remaining 5.57% of the cases.
It also lets us know which factors play a role in predicting these results.
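
As a quick check on the arithmetic, the misclassification rate is simply the complement of the accuracy. The short sketch below applies that to the three reported accuracies; the helper name `error_rate` is my own.

```python
def error_rate(accuracy_pct):
    """Misclassification rate in percent, the complement of accuracy."""
    if not 0.0 <= accuracy_pct <= 100.0:
        raise ValueError("accuracy must be between 0 and 100 percent")
    return 100.0 - accuracy_pct


# Accuracies reported in this post for the 3-year survival study.
reported = {
    "Support Vector Machine": 94.43,
    "Artificial Neural Network": 81.18,
    "Decision Tree": 77.65,
}

for model, acc in reported.items():
    print(f"{model}: {acc:.2f}% correct, {error_rate(acc):.2f}% misclassified")
```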

These accuracy rates are higher than those reached with conventional statistical techniques, which is quite promising for the future success of heart transplants.

Comments:

  1. In Canada, big data is being used to detect infections in ICU babies before it's too late. By harnessing millions of heartbeat measurements from the ICU each day, infections can now be detected at least 24 hours before they become symptomatic, allowing doctors to get a head start on treatment. However, medicine is just one of many fields advancing through big data.
    There's a little gadget that one can wear on the wrist, called the Jawbone. It measures sleep. It tells how much good sleep and deep sleep one gets. It shows one's pace during the day, and at what times of day one is most and least active.
    Three years from now, wristbands like the Jawbone will be measuring blood oxygen and glucose levels. This might be a way to give us early health warnings. This is big data. A lot of people say, "No, it's a little device. It's tiny metadata." But if every single human being were wearing one, it might give us another view of our health and our activity, and let us correlate food and diet with nutrition and disease.

    Source: http://humanfaceofbigdata.com/blog/
