Tools & Resources

Analytics Platform

The Precision Health Analytics Platform is a suite of tools, services, and datasets available to researchers across campus. View the complete Analytics Platform User Guide [pdf] and other resources on the PH Analytics Platform Documentation site (UMICH [Level-1] password required).

Schedule a virtual consultation with a Research Scientific Facilitator to learn more about how the Analytics Platform can enhance your research.

Resources include:

Each entry lists the resource and its type, a description, and a link to more information (umich password required).
De-identified electronic health record data through Precision Health DataDirect tool
  • This self-serve tool is de-identified, date-shifted, and updated quarterly.
  • It houses de-identified clinical data for more than 4M Michigan Medicine patients.
  • It includes laboratory results, ordered and administered medications, vital sign measurements, diagnoses, and other structured elements collected during inpatient and outpatient visits.
Analytics Platform User Guide
Increased download capability for patient results functionality
  • Depending on your cohort, you may download de-identified data from an unlimited number of patients (increased from the original 20K-patient limit on all cohorts).
Armis2 infrastructure
  • High-performance computing environment administered by Advanced Research Computing Technology Services (ARC-TS)
  • Linux-based
  • Used for analysis of sensitive data
Accessing Analytics Environments
High-performance compute platform infrastructure
  • Six compute nodes on Armis2 dedicated to Precision Health
  • Each node contains eight RTX 2080 Ti GPUs for researcher usage
Accessing Analytics Environments
Yottabyte Research Cloud infrastructure
  • Private cloud platform for virtual machines used for research
  • Enables analysis of sensitive datasets
  • Composable, software-defined infrastructure
Accessing Analytics Environments
Social determinants of health data
  • A geolocation overview displays selected socioeconomic characteristics of neighborhoods.
  • Addresses have been geocoded (assigned longitude and latitude coordinates) and mapped to US Census Tract data elements.
  • Social determinant indices available for download include affluence, disadvantage, ethnicity, and education.
Accessing Neighborhood-based Socioeconomic Status Data for Patient Cohorts using DataDirect
Michigan Medicine/Precision Health COVID surveys data
  • Aimed at obtaining important COVID-19 health-related information that can be linked to other health data (clinical, genetic, specimen, etc.).
  • Survey participants are enrolled in the U-M Central Biorepository.
  • Available for download in DataDirect.
About the COVID-19 survey
COVID starting population
(predefined and validated cohort)
  • Patients who have tested positive for SARS-CoV-2 at Michigan Medicine or who at any point carried a diagnosis of COVID-19
  • Complete through April 30, 2020, and updated quarterly
COVID-19 Data via DataDirect
Diabetes starting population data
  • Starting populations in DataDirect enable easier filtering of Michigan Medicine patients for specific diagnoses or groups. These are validated by experts in the field.
  • Comprises living Michigan Medicine patients with diabetes, the majority of whom are managed for diabetes by ambulatory care.
Chest X-ray data
  • More than 5,000 de-identified chest X-ray images taken of patients who were tested for COVID-19 during hospitalizations


Accessing the COVID-19 chest X-ray Dataset on Turbo
Michigan Genomics Initiative (MGI) data
  • Repository of DNA and genetic data linked to medical phenotype and electronic health record (EHR) information
MGI webpage
Preoperative MGI participant surveys data
  • These surveys leverage domains of patient-reported outcomes, including pain severity, physical function, depression, anxiety, and life stress scales.
  • Responses from more than 55K MGI participants.
  • Available for download in DataDirect.
Star allele calls service
  • Pharmacogenomics information for Michigan Genomics Initiative participants
  • Provides in-silico calls for star alleles and activity phenotypes
  • Translation of genetic data into star alleles allows research into genetic predictors of medication treatment outcomes
Michigan Genomics Initiative: Accessing Star Allele Calls



DataDirect is a self-serve software tool enabling researchers to access and explore clinical data from the Michigan Genomics Initiative cohort and the electronic health records (EHR) of more than 4 million unique patients. Researchers may use DataDirect to generate aggregate counts for cohort studies (“Cohort Discovery Mode”) or to analyze de-identified patient health data (“De-Identified Mode”). See the Analytics Platform User Guide for information on DataDirect modes. U-M VPN login is required to access Precision Health DataDirect.

DataDirect is managed by Michigan Medicine’s Data Office for Clinical and Translational Research (DOCTR), which oversees access to several institutionally supported tools and also provides customized datasets in consultation with researchers. The Data Office administers a secure and compliant process for researchers requiring Michigan Medicine data. All users of Precision Health DataDirect are required to complete robust human subjects research training and appropriate data use agreements.

Linked Data

The Precision Health Analytics Platform, using Michigan Medicine Data Office tools and resources, provides access to genetic and clinical data on approximately 80K patients. This includes the ability to link clinical phenotype data to genotype data and facilitation of GWAS analysis.
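As a minimal sketch of what linking phenotype and genotype data can look like once both extracts are in your analysis environment, the Python below joins two small in-memory tables on a shared de-identified patient key. The column names (`deid_patient_id`, `has_t2d`, `rs_example_dosage`) and the sample rows are hypothetical and do not reflect the actual DataDirect or MGI file layouts.

```python
import csv
import io

# Hypothetical extracts; real DataDirect/MGI files will have different columns.
PHENOTYPE_CSV = """deid_patient_id,has_t2d
P001,1
P002,0
P003,1
"""

GENOTYPE_CSV = """deid_patient_id,rs_example_dosage
P001,2
P003,0
P004,1
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def link_on_key(phenotypes, genotypes, key="deid_patient_id"):
    """Inner-join two row lists on a shared key, keeping only matched patients."""
    genotype_by_id = {row[key]: row for row in genotypes}
    linked = []
    for pheno in phenotypes:
        geno = genotype_by_id.get(pheno[key])
        if geno is not None:
            # Merge the two records; the key column is identical in both.
            linked.append({**pheno, **geno})
    return linked


linked_rows = link_on_key(read_rows(PHENOTYPE_CSV), read_rows(GENOTYPE_CSV))
for row in linked_rows:
    print(row)
```

Only patients present in both extracts survive the join, so comparing the linked row count with each input is a quick check of how much of a cohort has genotype data.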

Researchers can access their data in a secure, virtual, high-compute Linux- or Windows-based environment.


The Armis2 high-performance computing (HPC) environment is composed of task-managing administrative nodes and standard Linux-based two- and four-socket server-class hardware in a secure data center, connected by both high-speed Ethernet (1 Gbps) and InfiniBand (40/100 Gbps) networks, with a secure parallel file system for temporary data provided by HIPAA-aligned Turbo Research Storage. The two-socket nodes have up to 24 cores and 156 GB of memory. There are also 12 V100 GPUs currently on the cluster, but others can be moved on request.

If you are a new user of Armis2, you will need to create an account by submitting an application form [umich password required]; this form is also accessible via the Armis2 User Guide homepage. On the form, please specify a) the PH-based need for an Armis2 account, and b) the HUM#(s) associated with your data request(s) on DataDirect (without this information, ARC-TS won’t be able to create an Armis2 account). Please allow one business day for your application to be processed. If you already have an Armis2 account, you will need to send an email to specifying a) the PH-based need to use your Armis2 account, and b) the HUM#(s) associated with your data request(s) on DataDirect.

Precision Health also has a private set of six nodes on Armis2. Each node has eight (48 total) RTX2080Ti GPUs and large volumes of fast local storage, and can see all data and software provided on Armis2. These nodes are optimized for machine learning/AI, computer vision, molecular dynamics, and any other GPU-accelerated workload. Precision Health–affiliated researchers who have interest in using the condo nodes should contact


The Yottabyte Research Cloud (YBRC) is a private cloud environment that provides high-performance, secure, and flexible computing environments enabling the analysis of sensitive datasets restricted by federal privacy laws, proprietary access agreements, or confidentiality requirements. The system is built on Yottabyte’s composable, software-defined infrastructure platform and represents U-M’s first use of software-defined infrastructure for research, allowing on-the-fly personalized configuration of any-scale computing resources. This platform allows the creation of any combination of network, CPU, RAM, and storage components into resource groups that can be used to build multi-tenant, multi-site infrastructure as a service.

Please use these guides (umich password required) for accessing your data. For questions about Armis or YBRC, please email

Research Scientific Facilitators

Precision Health Research Scientific Facilitators are on hand to guide investigators across campus through processes that allow them to assemble datasets in a virtual, HIPAA-compliant server environment. Facilitators help researchers navigate self-serve tools such as DataDirect and EMERSE, find other ways of pulling clinical data (through DOCTR), submit biospecimen inquiries, assemble subject survey data, and more. Facilitators also strive to identify and integrate additional data lakes for centralized use.

Contact the Facilitators at


All IRB applications should go through IRBMED and not HSBS.

Type of IRB approvals needed by investigators for clinical and/or genetic data:

  • Cohort mode provides aggregate counts; no individual-level information is accessible. No IRB application is required.
  • De-identified mode, which includes data from more than 4M Michigan Medicine patients, allows users to access individual-level structured patient health data without any HIPAA identifiers. An IRBMED “not regulated” determination is required by the system.
  • PHI mode allows users to access and download structured patient health data including PHI. Only PHI types relevant to the specific research are accessible, consistent with an IRBMED approval or “exempt” determination; this only sometimes includes personal identifiers.
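The three modes above can be summarized as a small lookup, which may be handy in a lab checklist or onboarding script. This is an illustrative sketch only: the mode keys (`cohort`, `de-identified`, `phi`) and the function name are shorthand invented here, not DataDirect identifiers, and the requirement text is transcribed from the list above.

```python
# IRB requirement per DataDirect mode, transcribed from the list above.
# The dictionary keys are hypothetical shorthand, not official mode names.
IRB_REQUIREMENT_BY_MODE = {
    "cohort": "No IRB application is required",
    "de-identified": 'IRBMED "not regulated" determination',
    "phi": 'IRBMED approval or "exempt" determination',
}


def irb_requirement(mode):
    """Return the IRB requirement for a mode name (case-insensitive)."""
    key = mode.strip().lower()
    if key not in IRB_REQUIREMENT_BY_MODE:
        raise ValueError(f"Unknown DataDirect mode: {mode!r}")
    return IRB_REQUIREMENT_BY_MODE[key]


print(irb_requirement("Cohort"))
```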

For IRB applications, please reference MGI HUM00071298.

De-identified data and genomic data requests on their own are pre-approved by the MGI committee and do not need a specific letter or commitment to submit to the IRB. Biospecimen requests and re-contact of MGI patients require Precision Health MGI Access Committee approval.

Contact the Data Office for Clinical & Translational Research (DOCTR) with any IRB-related questions: