Democratizing genetic data for cancer research
An open-access, online repository, Cancer PRSWeb provides polygenic risk scores for 35 cancer traits to accelerate advancements in cancer prevention and prediction.
By distilling information from hundreds of data sources, including published genome-wide association studies (GWAS) and meta-analyses, University of Michigan Precision Health researchers generated polygenic risk scores (PRS) for 35 common types of cancer. They then evaluated the scores using electronic health record data from the Michigan Genomics Initiative and the UK Biobank. The result is a free, online repository accessible to researchers everywhere, and it’s the subject of a recent paper in the American Journal of Human Genetics.
“We built Cancer PRSweb to offer convenient and open access to PRS that might provide valuable directions for the translation of PRS from basic research toward potential clinical utility and accelerate scientific discoveries,” said lead author Lars Fritsche, PhD, an Associate Research Scientist in Biostatistics in the School of Public Health and the recipient of a 2018 Precision Health Investigators Award for this research. “We hope this platform will be useful to the cancer research community worldwide.”
Generating PRS is a complicated, time-consuming process. While some diseases, such as cystic fibrosis, can be traced to variants in a single gene, others, such as diabetes and cancer, result from many genetic variants in combination with how one’s environment and lifestyle influence gene expression. To more accurately determine who is at high risk of developing these polygenic diseases, researchers must look not at just one genetic variant but at many variants in concert.
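To make the idea of looking at many variants "in concert" concrete, a PRS is commonly computed as a weighted sum: each variant's risk-allele count is multiplied by an effect-size weight, and the products are added up. The sketch below illustrates only that general idea. The variant names, weights, and the choice to treat missing variants as zero are hypothetical assumptions for illustration, and the construction methods actually used for Cancer PRSWeb may differ.

```python
from typing import Mapping


def polygenic_risk_score(
    dosages: Mapping[str, float], weights: Mapping[str, float]
) -> float:
    """Return the weighted sum of risk-allele dosages.

    Variants missing from `dosages` contribute 0 (an assumption made
    here for simplicity; real pipelines handle missing data explicitly).
    """
    return sum(w * dosages.get(variant, 0.0) for variant, w in weights.items())


# Hypothetical variant IDs and effect-size weights (for example, log odds
# ratios reported by a GWAS). These are made-up values, not real data.
weights = {"variant_A": 0.10, "variant_B": 0.25, "variant_C": -0.05}

# Risk-allele counts (0, 1, or 2) for one hypothetical individual.
dosages = {"variant_A": 2, "variant_B": 1, "variant_C": 0}

score = polygenic_risk_score(dosages, weights)
print(round(score, 2))
```

A single score like this means little in isolation. It becomes informative when compared against the distribution of scores in a reference population, which is one reason evaluation in large cohorts matters.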
Developing and selecting PRS is challenging for researchers because of limited access to full summary statistics from genome-wide association studies, the lack of directly transferable PRS constructs, and the analytic expertise needed to compare different PRS and choose the most appropriate one for a given study. Cancer PRSWeb attempts to bridge that gap.
With access to Cancer PRSWeb, researchers save time and money, which expedites scientific and clinical discoveries. “The generation and exploration of PRS can be very time consuming and computationally expensive,” said author Bhramar Mukherjee, PhD, Chair of Biostatistics, Associate Workgroup Director for Cohort Development at Precision Health, and Associate Director for Quantitative Data Sciences at Rogel Cancer Center. “For cancer traits in PRSweb, we have already done the hard work and offer the ability to rank PRS based on various performance metrics that indicate their suitabilities for different purposes. For example, there might be one PRS that works better for prediction accuracy, while another one has an edge for risk stratification purposes. These distinctions, combined with the transparent construction, advance the push toward clinical utilization of PRS.”
PRS could also prove clinically useful because, when applied to the medical phenome, they “offer the opportunity to uncover secondary trait associations that share a genetic component with the primary trait of interest,” said Fritsche, which in turn could “uncover key pre-symptomatic diagnoses or biomarkers that themselves could be used as predictors.”
A 2018 Precision Health Investigators Award helped propel this important research. “It was essential to help seed our team effort, which required different skill sets in human genetics, statistics, computation, and visualization,” said Fritsche. “Precision Health events, furthermore, allowed us to discuss our platform with a growing multidisciplinary community and provided vital feedback for a possible integration of PRS into clinical applications.” A supplement to the cancer center core grant from the National Cancer Institute also enabled this research.
With an initial repository for cancer PRS set up, the research team is applying lessons learned to explore other areas, such as mental health and endocrine/metabolic disorders. “Our ultimate goal,” said Mukherjee, “is the creation of PRS for every complex trait, with a decent proportion of its heritability explained through common risk variants.” Also planned are PRS for common risk factors for chronic diseases, such as smoking, obesity, and physical activity, as well as for some blood and serum biomarkers.