DataDirect

DataDirect is a self-serve software tool for researchers, enabling them to access and explore clinical data from electronic health records (EHR), such as diagnoses, encounters, procedures, medications (ordered and administered), and labs (ordered and results) on more than 4 million unique patients from across Michigan Medicine.

Through the efforts of Precision Health’s Data Analytics & IT workgroup, DataDirect—a tool previously accessible only to Michigan Medicine faculty—is now open to U-M faculty and their researchers campus-wide. DataDirect’s user-friendly interface allows researchers to discover information quickly and easily.

DataDirect Deidentified Mode

DataDirect Cohort Discovery Mode


Deidentified Mode

Deidentified mode, used with appropriate oversight, provides researchers the ability to analyze deidentified patient health data. Resulting datasets will be loaded onto a HIPAA-compliant, secure virtual machine managed by Advanced Research Computing (ARC).

Prerequisites for accessing DataDirect deidentified mode are:

*If you already have a HUM# from an IRB-approved study, you may use it on DataDirect for the purposes of beta testing. If you do not have an IRB-approved study, please select the “Human Subjects Study Application” option on the linked webpage and fill out the form. For beta testing (and for any deidentified DataDirect data requests), an IRB self-determination for not regulated research is the correct option. At the end of the eResearch application, you will be issued a self-determination for not regulated research and a HUM#. You will need this HUM# to run any query on DataDirect and to perform beta testing.

After accessing DataDirect and submitting a query, researchers may wait up to two days while their datasets are reviewed. Then, they will be able to view and analyze these data on a secure virtual machine using Yottabyte or Armis.

For questions about DataDirect, please email


Process Flowchart



Yottabyte Research Cloud (YBRC)

YBRC is a private cloud environment that provides high-performance, secure, and flexible computing environments enabling the analysis of sensitive datasets restricted by federal privacy laws, proprietary access agreements, or confidentiality requirements. The system is built on Yottabyte’s composable, software-defined infrastructure platform and represents U-M’s first use of software-defined infrastructure for research, allowing on-the-fly personalized configuration of computing resources at any scale. This platform allows us to combine network, CPU, RAM, and storage components in any configuration into resource groups that can be used to build multi-tenant, multi-site infrastructure as a service.

YBRC Mac User Walkthrough

YBRC Windows User Walkthrough

YBRC Web-Based User Walkthrough (Mac and Windows)



The Armis HPC environment is composed of task-managing administrative nodes and standard Linux-based two- and four-socket server-class hardware in a secure data center, connected by both a high-speed Ethernet network (1 Gbps) and an InfiniBand network (40 Gbps), along with a secure parallel file system for temporary data, provided by HIPAA-aligned Turbo Research Storage. Armis is currently provided as a pilot service. The two-socket nodes have up to 24 cores and 156 GB of memory. There are also eight K20x GPUs currently on the cluster, but others can be moved on request.

How to use Armis with DataDirect
If you are a new user of Armis, you will need to create an account by submitting an application form (this form is accessible via the Armis User Guide homepage). On the form, please specify a) the Precision Health (PH)-based need for an Armis account, and b) the HUM#(s) associated with your data request(s) on DataDirect. Without this information, ARC-TS won’t be able to create an Armis account. Please allow one business day for your application to be processed. If you already have an Armis account, you will need to send an email specifying a) the PH-based need to use your Armis account, and b) the HUM#(s) associated with your data request(s) on DataDirect.

Armis User Guide

Armis Account Application

For questions about Armis or YBRC, please email

Cohort Discovery Mode

The simplest DataDirect mode provides aggregate counts for cohort discovery (i.e., assembling a group of individuals with parameters of interest). This simple form of the tool allows researchers to explore whether the data contained in DataDirect are sufficient to support their research.

DataDirect Cohort Mode User Guide

DataDirect Cohort Mode Video Tutorial

Prerequisites for accessing DataDirect cohort discovery mode are:

  • Level-1 password
  • Completion of HIPAA Training
  • Enrollment in DUO Authentication
  • U-M faculty position, or U-M staff/student with a faculty sponsor. Faculty are responsible for uploading uniqnames for their staff/students into DataDirect.



DataDirect is managed by Michigan Medicine’s Data Office for Clinical and Translational Research, which oversees access to several institutionally supported tools and also provides customized datasets in consultation with researchers. The Data Office administers a secure and compliant process for researchers requiring Michigan Medicine data.

Health data are organized according to the International Statistical Classification of Diseases and Related Health Problems (ICD). The ICD, published by the World Health Organization, is a global health information standard for mortality and morbidity statistics. Determining which codes meet your inclusion criteria is a clinical decision to be made by your research team. As several different codes can be associated with a diagnosis or procedure, consulting with a clinician who specializes in that area, or reviewing the charts of patients known to have the diagnosis or procedure, can assist in this determination.
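To illustrate why a single diagnosis may need several codes, here is a minimal Python sketch of an aggregate cohort count. It is not DataDirect's interface or query logic. The records, the function name count_cohort, and the code list are hypothetical and chosen only for illustration (E11 is the ICD-10 category for type 2 diabetes mellitus, and 250.00 is one related ICD-9 code). The actual choice of codes remains a clinical decision for your research team.

```python
# Hypothetical toy records of (patient_id, diagnosis_code); not real patient data.
RECORDS = [
    ("P1", "E11.9"),
    ("P1", "I10"),
    ("P2", "E11.65"),
    ("P3", "250.00"),
    ("P4", "I10"),
    ("P2", "E11.9"),
]

# Assumed inclusion criteria: code prefixes selected by the research team.
INCLUSION_PREFIXES = ("E11", "250.00")


def count_cohort(records, prefixes):
    """Return the number of unique patients with at least one matching code."""
    # str.startswith accepts a tuple, so one call checks every prefix.
    matched = {pid for pid, code in records if code.startswith(prefixes)}
    return len(matched)


cohort_size = count_cohort(RECORDS, INCLUSION_PREFIXES)
print(cohort_size)
```

Patients are counted once even when several of their records match, which is why the sketch collects patient identifiers in a set before counting.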