About the Michigan Genomics Initiative
The Michigan Genomics Initiative (MGI) is a collaborative research effort among physicians, researchers, and patients at the University of Michigan (U-M) with the goal of harmonizing patient electronic health record (EHR) data with genetic data to gain novel biomedical insights. Participants of MGI agree to provide the study team with access to EHR data for clinical information and biospecimens for genotyping. MGI enrollees may also be asked to answer survey questions depending on the clinic from which they are recruited. Each MGI participant also understands that they may be re-contacted in the future for follow-up studies if they have a genotype or clinical condition of interest to investigators across the U-M research enterprise.

Data collected through MGI are available upon request to U-M researchers with IRB-approved studies.**

Cohort Profile (Dec. 2020)
There are currently ~80K consented MGI participants, and MGI continues to add ~10K new participants annually. All MGI participants are 18 years of age or older and undergoing surgery at the University of Michigan Health System. To date, ~57K MGI participants have been genotyped on a genome-wide genotyping array. Both genotype-inferred and self-reported majority ancestry indicate that the cohort is predominantly of European ancestry (> 80%, Figure 1). The genotyped cohort self-reports as 47% male and 53% female. The median age of male participants is 61, compared to 56 for female participants (Figure 2). Linking MGI genetic data to the EHR gives researchers the opportunity to build cohorts for the analysis of a wide range of phenotypes, including many cancers (Figure 3).

Figure 1. Ancestry of genotyped participants of MGI. Plots for majority ancestry that was (A.) inferred from genetic data and (B.) self-reported by study participants.


Figure 2. Distribution of self-reported gender and age of genotyped MGI participants.


Figure 3. Prevalent phenotypes among genotyped MGI participants. Plots depicting the number of cases for the 20 most common (A.) ICD billing codes and (B.) PheWAS codes describing neoplasms.


Available Genetic Data (Dec. 2020)
Several genotype- and sequence-based datasets are available for request by U-M researchers who would like to perform their own analyses of MGI genetic data.

Data Type | Description | Number of Participants
Genome-wide genotypes | ~600K variants directly assayed with an Infinium CoreExome array and imputed to > 51M variants with the TOPMed reference panel or > 32M variants with the HRC reference panel | 56,984
Whole Exome Sequences | Sequence data covering protein-coding gene regions (~2% of genome) | 561
Targeted Sequences | Sequence data covering 151 targeted gene regions | 963
HLA gene allele and amino acid inferences | Inferences for human leukocyte antigen genes HLA-A, -B, -C, -DQA1, -DQB1, -DRB1, -DPA1, and -DPB1 | 56,984
Pharmacogenomic star allele inferences | Inferences for 49 distinct pharmacogenes with polymorphic alleles, including CYP2C9, CYP2B6, CYP2C19, CYP3A5, NUDT15, TPMT, SLCO1B1, UGT1A1, DPYD, and CYP2D6* | 56,984

* Pharmacogene alleles based on structural variation are not inferred.

Genetic Analysis Resources
MGI provides free access to several genetic data analysis platforms and services.

Resource | Description
MGI PheWeb | Online database of genome-wide associations for EHR-derived ICD billing codes from MGI participants
MGI [coming soon] | A web-based tool that provides researchers with the ability to perform customized genome-wide association studies using MGI genotype data
Custom genetic analysis | Skilled MGI analysts are available to perform custom genetic analyses such as genome-wide association or gene-based analyses

If you are interested in collaborating with MGI, contact PHDataHelp@umich.edu.


**All IRB applications should go through IRBMED and not IRB-HSBS.
Type of IRB approvals needed by investigators for clinical and/or genetic data:

  • Aggregate datasets: No IRB application required.
  • De-identified datasets: An IRB application is required and must, at a minimum, receive a “not-regulated” status.
  • Datasets with protected health information (PHI): Will require a full IRB review and approval.
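
For investigators who track data requests programmatically, the three tiers above can be sketched as a simple lookup. This is only an illustration under stated assumptions: the names `DatasetType`, `IRB_REQUIREMENT`, `irb_requirement`, and `needs_irb_application` are hypothetical and are not part of any MGI or IRBMED system, and the requirement strings merely paraphrase the list above.

```python
from enum import Enum


class DatasetType(Enum):
    """Hypothetical labels for the three dataset tiers listed above."""
    AGGREGATE = "aggregate"
    DE_IDENTIFIED = "de-identified"
    PHI = "phi"


# Requirement text paraphrased from the list above (illustrative only).
IRB_REQUIREMENT = {
    DatasetType.AGGREGATE: "No IRB application required",
    DatasetType.DE_IDENTIFIED: "IRB application required; at minimum a not-regulated status",
    DatasetType.PHI: "Full IRB review and approval required",
}


def irb_requirement(dataset_type: DatasetType) -> str:
    """Return the paraphrased IRB requirement for a dataset tier."""
    return IRB_REQUIREMENT[dataset_type]


def needs_irb_application(dataset_type: DatasetType) -> bool:
    """Only aggregate datasets are exempt from an IRB application."""
    return dataset_type is not DatasetType.AGGREGATE


if __name__ == "__main__":
    for tier in DatasetType:
        print(f"{tier.value}: {irb_requirement(tier)}")
```

The lookup is deliberately a plain dictionary, so a change in policy means editing one entry rather than branching logic. The IRB office remains the authority on what a given study actually needs.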

For IRB applications, please reference MGI HUM00071298.
Requests for de-identified data or genomic data alone are pre-approved by the MGI committee and do not need a specific letter or commitment for submission to the IRB. Biospecimen requests and re-contact of MGI patients require MGI committee approval.
Contact DataOffice@umich.edu with any IRB-related questions.