Poster Presentation 41st Lorne Genome Conference 2020

A comprehensive survey of Alu repeat diversity in humans (#108)

Renzo Balboa 1, Simon Easteal 1, Hardip Patel 1
  1. National Centre for Indigenous Genomics, Australian National University, Acton, ACT, Australia

More than half of the human genome consists of repeat sequences, a large proportion of which are annotated as transposable elements and long duplicated regions. Alu elements are primate-specific ~300 bp repeat sequences considered to be the most successful transposable elements in the human genome. Any individual is believed to carry ~1.1 million copies of Alu repeats, comprising ~11% of the human genome. Their evolutionary characteristics (high activity, with ~1 new insertion per 20 births, high polymorphism rates and neutral evolutionary mechanisms) make them useful for understanding human diversity. Alu elements have also been implicated in genome regulation; for example, Alus contribute up to ~30% of methylation sites in the human genome. A comprehensive understanding of Alu polymorphisms is therefore useful in population genetics and in understanding their contribution to health and disease. The 1000 Genomes Project revealed ~12,000 non-reference Alu insertions, but did not capture the extent of Alu polymorphisms at a global population scale. Here, we utilise an assembly-based approach on short-read data from 150 global populations and long-read data from 15 individuals to comprehensively annotate Alu variation at an individual and population level. We detect ~1.3 million Alu elements across all individuals, and find that shared elements carry a largely similar profile. We find that Alu elements contribute substantially to structural variation in the human genome: ~15%, or ~180,000 elements, are unaccounted for in the reference genome. We also show differences in Alu calls at both the population and subfamily level, and estimate that ~1000 elements differ between any two individuals. AluY elements, the youngest and most active Alu subfamily, are believed to be the most polymorphic between individuals.
To the best of our knowledge, this is the first comprehensive survey of Alu polymorphisms in humans, and it will pave the way for understanding their roles in DNA regulation and in health and disease.