Data Access & Tools

Self-Serve Tools

Precision Health’s self-serve tools give researchers access to longitudinal, deidentified, date-shifted health data for more than 4 million Michigan Medicine patients, dating back to 2002. Our clinical data self-serve tools are updated quarterly. In addition, researchers can now perform some genetic data analysis in self-serve mode (a genetic data analyst is also available, free of charge, for more complex analyses). Completion of the prerequisites listed here is required for access. A U-M VPN login is required to access all components of the Precision Health secure enclave.

We encourage you to connect with us at PHDataHelp@umich.edu so we can facilitate your access.

Precision Health DataDirect

Through DataDirect, researchers can query deidentified, structured clinical data for 4M+ Michigan Medicine patients through an intuitive web interface. After logging in, users can work in two different modes:

  • Cohort discovery mode: enables researchers to check feasibility of a study as well as gather aggregate demographic counts of their cohort;
  • Deidentified download mode: enables researchers to download health data about individual patients in their cohort of interest.
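The difference between the two modes can be sketched in a few lines of Python. This is only an illustration of aggregate counts versus row-level output; the records, field names, and the `cohort_counts` helper are hypothetical and do not reflect DataDirect's actual data model or interface.

```python
from collections import Counter

# Hypothetical deidentified records; all field names are illustrative only.
patients = [
    {"patient_id": "A1", "sex": "F", "age_group": "40-59", "has_t2d": True},
    {"patient_id": "A2", "sex": "M", "age_group": "60+", "has_t2d": True},
    {"patient_id": "A3", "sex": "F", "age_group": "60+", "has_t2d": True},
    {"patient_id": "A4", "sex": "M", "age_group": "18-39", "has_t2d": False},
]


def cohort_counts(records, criterion, by):
    """Return the cohort size and a breakdown of the cohort by one field."""
    cohort = [r for r in records if criterion(r)]
    return len(cohort), Counter(r[by] for r in cohort)


# Cohort discovery: only aggregate counts, useful for a feasibility check.
total, by_sex = cohort_counts(patients, lambda r: r["has_t2d"], by="sex")
print(total, dict(by_sex))

# Deidentified download: one row per patient in the cohort of interest.
cohort_rows = [r for r in patients if r["has_t2d"]]
print([r["patient_id"] for r in cohort_rows])
```

In this sketch the first result carries no patient-level detail, while the second is the kind of per-patient output that would be analyzed inside the secure enclave.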

The data output generated by your DataDirect queries will be available through the Precision Health secure enclave, where it can be analyzed either through Yottabyte Research Cloud (YBRC) or Armis2. Precision Health GPUs are also available for researchers who are utilizing Precision Health data and computing environments.

Precision Health Deidentified RDW

Precision Health Deidentified RDW (Research Data Warehouse) offers researchers direct access to the database tables that house the structured clinical data of 4M+ Michigan Medicine patients. The Deidentified RDW can be queried in the Precision Health secure enclave using SQL, Python, or RStudio.
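As a rough idea of what querying database tables from Python looks like, here is a minimal sketch that uses an in-memory SQLite database as a stand-in. The table names (`patient`, `diagnosis`), column names, and sample rows are assumptions made for illustration; the actual Deidentified RDW schema and connection method are not described here and will differ.

```python
import sqlite3

# In-memory SQLite database as a stand-in; the real RDW connection differs.
conn = sqlite3.connect(":memory:")

# Hypothetical schema and sample rows, for illustration only.
conn.executescript("""
CREATE TABLE patient (patient_id TEXT PRIMARY KEY, sex TEXT);
CREATE TABLE diagnosis (patient_id TEXT, icd10_code TEXT, shifted_date TEXT);
INSERT INTO patient VALUES ('P1', 'F'), ('P2', 'M'), ('P3', 'F');
INSERT INTO diagnosis VALUES
  ('P1', 'E11.9',   '2015-03-02'),
  ('P1', 'I10',     '2016-07-19'),
  ('P2', 'E11.65',  '2018-01-11'),
  ('P3', 'J45.909', '2019-05-30');
""")

# Count distinct patients per sex with any diagnosis code matching a prefix.
query = """
SELECT p.sex, COUNT(DISTINCT p.patient_id) AS n_patients
FROM patient AS p
JOIN diagnosis AS d ON d.patient_id = p.patient_id
WHERE d.icd10_code LIKE ?
GROUP BY p.sex
ORDER BY p.sex
"""
rows = conn.execute(query, ("E11%",)).fetchall()
print(rows)

conn.close()
```

Passing the code prefix as a bound parameter rather than formatting it into the SQL string keeps the query reusable for other code families.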

| | Precision Health DataDirect | Precision Health Deidentified RDW | Michigan Medicine DataDirect |
|---|---|---|---|
| Data type | Deidentified, date-shifted | Deidentified, date-shifted | Direct identifiers, PHI |
| Required login | Level-1 password | Level-1 password | Level-2 password |
| Data refreshed | Once per quarter | Once per quarter | Up to date (1-month lag in billing data) |
| Recruitment capability | Only through Honest Broker services | None | Recruitment mode, can set up reports |
| Data storage | Precision Health Turbo | Precision Health Turbo | Investigator’s HIPAA-compliant server |
| Use case | Retrospective data research for a specific cohort of patients without needing exact dates | Access database tables directly and analyze large datasets, rather than using self-serve tools like DataDirect | Instances where MRNs are required for manual chart review, prospective research studies, subject recruitment |
| Limitations | Health services research where exact dates are required. | Need SQL experience to write code. Health services research where exact dates are required. | Need a HIPAA-compliant storage location. |
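To see why date-shifted data can still support retrospective research that does not need exact dates, consider one common deidentification approach: moving all of a patient's dates by the same offset. Whether Precision Health uses exactly this scheme is an assumption here, and the `shift_dates` helper, the offset, and the events below are hypothetical.

```python
from datetime import date, timedelta


def shift_dates(events, offset_days):
    """Shift every (label, date) pair of one patient by the same offset."""
    return [(label, d + timedelta(days=offset_days)) for label, d in events]


# Hypothetical events for a single patient.
events = [("diagnosis", date(2015, 3, 2)), ("follow_up", date(2015, 6, 10))]

# Assumed per-patient offset; real offsets would be secret and vary by patient.
shifted = shift_dates(events, offset_days=-47)

original_gap = (events[1][1] - events[0][1]).days
shifted_gap = (shifted[1][1] - shifted[0][1]).days

print(shifted[0][1], shifted[1][1])  # calendar dates no longer match the originals
print(original_gap, shifted_gap)     # the interval between events is unchanged
```

Under this assumption, time-to-event and sequence analyses within a patient still work, while anything tied to true calendar dates does not, which matches the limitation listed for the deidentified tools.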

[Figure: Summary of data types available for Precision Health DataDirect and Deidentified RDW]

Encore (coming soon!)

Encore is a self-serve, web-based analysis tool for running genome-wide association studies (GWAS). Researchers are able to run GWAS in a point-and-click interface without needing to directly manipulate or “touch” the genetic data. Researchers can also view interactive plots of the results and share analysis results with other Encore users.

Encore will be available broadly early in 2022. If you’d like to have a sneak peek and test the tool before the official rollout, send us an email at PHDataHelp@umich.edu.

Benefits of using Encore as your genetic analysis tool:

  • No coding or command-line/Linux knowledge is required to run GWAS in Encore. A basic knowledge of building statistical models and interpreting GWAS results is helpful, and statistical geneticists are available to answer questions and assist with running and interpreting jobs on Encore.
  • You do not need to request, handle, or store large genetic datasets. All the genetic data you may need for your analysis is available to Encore on the backend.
  • You do not need to have knowledge of batch job submission or scheduling, or have direct access to a high-performance computing cluster. Encore automatically prepares job submission scripts and submits the analysis to the Great Lakes cluster. Meanwhile, you will see output from your jobs on the Encore website.
  • Currently, you can submit unlimited GWAS jobs. You can also clone a previous job to tweak parameters.