Poster Presentation 41st Lorne Genome Conference 2020

Diversity inclusive use of the human reference genome in precision medicine (#220)

Hardip Patel 1
  1. Australian National University, Canberra, ACT, Australia

Genomics is transforming healthcare from being largely treatment-based to being more predictive and preventative. Multidimensional genomic and other biomedical data are combined to facilitate this transformation, and the human reference genome is at its heart. The reference serves two fundamental functions: (a) it is the substrate against which DNA sequence reads are aligned, and (b) it provides the standardised coordinate system that anchors information about the function of human DNA. However, the current reference genome and its use in genomics are far from perfect. A male reference genome with the Y chromosome included is almost always used for sequence-read mapping in the variant discovery process, for both male and female samples. Our analysis of 14 female samples in International Genome Sample Resource (IGSR) data shows that ~1.7 million reads are wrongly assigned to the Y chromosome, with ~10,000 variants affected in female samples. Similarly, novel and fix patches in the reference genome are almost never used in variant discovery and interpretation, to the disadvantage of ancestrally diverse populations. To this end, we have analysed whole-genome sequence data from 400 ancestrally diverse samples (Simons Genome Diversity Project, National Centre for Indigenous Genomics, and IGSR) to comprehensively evaluate the systematic bias introduced by incorrect usage of the human reference genome in variant discovery and annotation. Our results demonstrate the importance of standardising the use of a reference genome without the Y chromosome for female samples, and of including novel and fix patches in the variant discovery process, for diversity-inclusive genomics in precision medicine.
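
As a rough illustration of how Y-chromosome misassignment in a female sample might be checked, the sketch below tallies mapped reads per reference sequence from the tab-separated output of `samtools idxstats` (reference name, sequence length, mapped reads, unmapped reads). This is not the analysis pipeline used for the poster. The function name `y_mapped_fraction`, the accepted Y-chromosome names, and the read counts in the example are all assumptions made for illustration.

```python
def y_mapped_fraction(idxstats_text, y_names=("chrY", "Y")):
    """Return (reads mapped to Y, total mapped reads, Y fraction).

    `idxstats_text` is assumed to be the text output of `samtools idxstats`:
    one line per reference sequence with four tab-separated fields
    (name, length, mapped reads, unmapped reads).
    """
    y_reads = 0
    total_mapped = 0
    for line in idxstats_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, _length, mapped, _unmapped = line.split("\t")[:4]
        if name == "*":
            continue  # the "*" row holds reads with no reference assigned
        mapped = int(mapped)
        total_mapped += mapped
        if name in y_names:
            y_reads += mapped
    fraction = y_reads / total_mapped if total_mapped else 0.0
    return y_reads, total_mapped, fraction


# Illustrative input only: these counts are made up, not taken from the study.
example = (
    "chr1\t248956422\t1000\t5\n"
    "chrX\t156040895\t400\t2\n"
    "chrY\t57227415\t10\t0\n"
    "*\t0\t0\t50\n"
)
y_reads, total_mapped, fraction = y_mapped_fraction(example)
print(y_reads, total_mapped, fraction)
```

For a female sample aligned to a reference that includes the Y chromosome, a non-zero Y count from a check like this would flag reads that are candidates for misassignment. Aligning the same sample to a reference without the Y chromosome would be one way to see where those reads map instead.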