Template Projects

Genomics, Transcriptomics, and Proteomics
Medical Image Analysis
Personalized Medicine
Electronic Medical Records
Population Health Modeling

To get started, we have identified five example domains of medical data science. Each domain has a Template Project that lets you quickly recreate an example data analysis from a publication in that domain. Each project contains the preprocessed primary datasets, cyberinfrastructure, and software methods needed for the recreation.

Genomics, Transcriptomics, and Proteomics

Learn how to use multi-omics data from The Cancer Genome Atlas (TCGA) for patient stratification, survival analysis, biomarker discovery, and pathway impact. The analysis uses a Jupyter notebook and the web-based KnowEnG platform. More details are found [here].
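
As a rough illustration of the survival analysis step, the sketch below computes a Kaplan-Meier survival curve in plain Python. It is not taken from the Template Project or from KnowEnG; the function name `kaplan_meier` and the toy follow-up times are hypothetical and chosen only for illustration.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time for each patient
    events:    1 if the event (e.g. death) was observed, 0 if censored
    """
    pairs = sorted(zip(durations, events))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    idx = 0
    while idx < len(pairs):
        t = pairs[idx][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while idx < len(pairs) and pairs[idx][0] == t:
            deaths += pairs[idx][1]
            removed += 1
            idx += 1
        if deaths:
            # Multiply by the fraction of at-risk patients surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after time t.
        at_risk -= removed
    return curve


# Toy data (hypothetical): follow-up in months, 1 = event observed, 0 = censored.
toy_durations = [5, 8, 8, 12, 15]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_durations, toy_events)
for time, prob in toy_curve:
    print(f"t={time:>2}  S(t)={prob:.2f}")
```

In a stratification setting, one would typically compute one such curve per patient subgroup and compare them.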

Medical Image Analysis

Learn how to assess black-box methods for classifying chest X-rays for different pathologies using the Google Cloud Platform AutoML automated deep learning system. Google Cloud Platform access will need to be requested. More details are found [here].

Personalized Medicine

Learn how to identify which clinical features and which subgroups of sepsis patients will most benefit from specific treatment regimes using the Virtual Twins modelling approach in a Jupyter notebook. More details are found [here].

Electronic Medical Records

Learn to predict patient readmission events from diagnostic codes in the MIMIC-III database using the DoctorAI recurrent neural network. The analysis will require access to the AWS-based Cloud9 command line environment and credentials for the MIMIC data. More details are found [here].

Population Health Modeling

Through multiple Jupyter notebooks, learn how to use SIR and SEIR models of disease transmission to estimate the effect of different COVID-19 policies on spread and hospital resource utilization. More details are found [here].
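
To give a feel for what an SEIR model computes, here is a minimal sketch that integrates the standard SEIR equations with forward Euler steps and compares two transmission rates. It is not the code from the Template Project notebooks; the function names and every parameter value (`beta`, `sigma`, `gamma`, population size) are assumptions chosen only for illustration. An SIR model is the same idea without the exposed (E) compartment.

```python
def simulate_seir(beta, sigma, gamma, population, initial_infected, days, dt=0.1):
    """Integrate SEIR with forward Euler; return one (S, E, I, R) tuple per day.

    beta:  transmission rate per day
    sigma: rate of leaving the exposed state (1 / incubation period)
    gamma: recovery rate (1 / infectious period)
    """
    s = float(population - initial_infected)
    e = 0.0
    i = float(initial_infected)
    r = 0.0
    steps_per_day = round(1 / dt)
    history = [(s, e, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        history.append((s, e, i, r))
    return history


def peak_infectious(history):
    """Largest number of simultaneously infectious people over the run."""
    return max(i for _, _, i, _ in history)


# Hypothetical scenarios: a policy is modelled here simply as a lower beta.
common = dict(sigma=1 / 5, gamma=1 / 7, population=1_000_000,
              initial_infected=10, days=365)
baseline = simulate_seir(beta=0.5, **common)
distancing = simulate_seir(beta=0.25, **common)

print(f"Peak infectious, baseline:   {peak_infectious(baseline):,.0f}")
print(f"Peak infectious, distancing: {peak_infectious(distancing):,.0f}")
```

Representing a policy as a single change in `beta` is a deliberate simplification; estimating hospital resource utilization would additionally require assumptions about what fraction of infectious people need care.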