Healthcare Datamining

16 May 2017


As part of a continuing effort, I work closely with collaborators, primarily Dr. Ravi Rao at Fairleigh Dickinson University, to explore public healthcare data while building a framework that makes this kind of exploration easier and more portable. The framework is developed with agile methodologies and is designed to keep our work reproducible. We have published a number of papers based on it in several venues. The framework is open source and available on its GitHub page.

GS-LSAMP Applying Big Data Analytics to Open Health Care Data: Exploring Relationships Between Seniority and Performance in Healthcare Practitioners
Daniel Clarke, Ravi Rao, Maryelena Vargas
Abstract: With numerous initiatives collecting healthcare data sets and making them publicly available, a framework based on open source data mining technologies, primarily extending Python’s SciPy ecosystem, is being designed to process them in search of novel correlations. With around 2.5 million practitioners and 1,804 hospital evaluations, there is much to be found in the 50 or so fields present in the Centers for Medicare & Medicaid Services’ Hospital Rating data. We use our framework to test a hypothesis about a correlation between age and performance, and we built visualizations to illustrate the findings. Linear regression analysis and Spearman’s rank correlation analysis, applied across all hospitals and within individual specialties, suggest that there is no significant correlation between age and reported total performance score.
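
For readers unfamiliar with the two tests named in the abstract, here is a minimal sketch of how they can be run with SciPy. It is not the framework's actual code: the function name `seniority_vs_score`, the variable names, and the data are all hypothetical, and the data is randomly generated rather than taken from the CMS files, so the printed numbers say nothing about real practitioners.

```python
import numpy as np
from scipy import stats


def seniority_vs_score(years, scores):
    """Run a linear regression and a Spearman rank correlation on two 1-D arrays.

    Hypothetical helper for illustration; not part of the published framework.
    """
    years = np.asarray(years, dtype=float)
    scores = np.asarray(scores, dtype=float)

    # Ordinary least-squares fit of score against years in practice.
    fit = stats.linregress(years, scores)

    # Rank-based correlation, which does not assume a linear relationship.
    rho, rho_p = stats.spearmanr(years, scores)

    return {
        "slope": fit.slope,
        "r_squared": fit.rvalue ** 2,
        "slope_p": fit.pvalue,
        "spearman_rho": rho,
        "spearman_p": rho_p,
    }


# Synthetic placeholder data (assumption): scores drawn independently of years.
rng = np.random.default_rng(0)
years = rng.integers(1, 40, size=500)
scores = rng.normal(loc=50.0, scale=10.0, size=500)

result = seniority_vs_score(years, scores)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

A slope near zero with a large p-value, together with a Spearman coefficient near zero, is the pattern one would read as "no significant correlation". Running the same function on each specialty's subset of rows would give a per-specialty breakdown like the one the abstract describes.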

NJ Tech Conference Presentation