Poster Presentation 41st Lorne Genome Conference 2020

Comapr: An R package for calculating crossover frequency and genetic map construction (#209)

Ruqian Lyu 1,2, Davis McCarthy 1,2
  1. St Vincent's Institute of Medical Research, Fitzroy, VIC, Australia
  2. Melbourne Integrative Genomics, University of Melbourne, Melbourne, VIC, Australia

Meiotic crossovers during gametogenesis generate genetic diversity, and crossover number and positioning are tightly regulated. Which factors regulate meiotic crossovers in different species is an active research area. The genetic distance between two markers, measured in centiMorgans, is derived from the observed crossover frequency between the two loci via a mapping function (e.g. Haldane or Kosambi). A genetic map shows the genetic distances between genomic markers across the genome. Crossover interference is a well-conserved phenomenon in which a crossover reduces the probability of another crossover at an adjacent locus in a distance-dependent manner. Populations with more meiotic crossovers have larger total genetic map lengths.
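As an illustration of the mapping functions named above, the following minimal sketch converts a recombination fraction r into a genetic distance in centiMorgans using the standard Haldane and Kosambi formulas. The function names are hypothetical and chosen for this example; they are not part of Comapr.

```python
import math


def haldane_cm(r):
    """Haldane mapping function (assumes no crossover interference).

    d = -50 * ln(1 - 2r), with d in centiMorgans.
    """
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r):
    """Kosambi mapping function (allows for some crossover interference).

    d = 25 * ln((1 + 2r) / (1 - 2r)), with d in centiMorgans.
    """
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


if __name__ == "__main__":
    # For a given r > 0, Kosambi yields a shorter distance than Haldane.
    for r in (0.01, 0.10, 0.30):
        print(f"r={r:.2f}  Haldane={haldane_cm(r):.2f} cM  Kosambi={kosambi_cm(r):.2f} cM")
```

For small r both functions give roughly 100 × r cM; they diverge as r approaches 0.5, where multiple crossovers in the same interval become more likely.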

A recombination event between two polymorphic markers can be detected by analysing marker segregation in the offspring. The genetic distance between two markers is then calculated by applying a mapping function to the fraction of observed recombinant offspring. Traditionally, an Excel macro tool (e.g. MapDisto) is used for estimating recombination frequencies and constructing genetic maps. This approach requires extensive manual input and is not designed to integrate with modern bioinformatics analysis tools.
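The detection step described above can be sketched as follows. This is a simplified illustration, not Comapr's implementation: it assumes each offspring's genotype has already been coded per marker by parental origin ("A" or "B", with None for a failed call), that markers are ordered along one chromosome, and that a change of code between adjacent markers marks a crossover in that interval. The function name and data layout are hypothetical.

```python
def recombination_fractions(genotypes):
    """Return the recombinant fraction for each adjacent marker interval.

    genotypes: list of per-offspring lists, one code per ordered marker
               ("A", "B", or None for a missing call).
    """
    n_markers = len(genotypes[0])
    fractions = []
    for i in range(n_markers - 1):
        informative = 0  # offspring with both flanking markers called
        recombinant = 0  # offspring whose code switches across the interval
        for calls in genotypes:
            left, right = calls[i], calls[i + 1]
            if left is None or right is None:
                continue
            informative += 1
            if left != right:
                recombinant += 1
        fractions.append(recombinant / informative if informative else float("nan"))
    return fractions


# Four offspring typed at three ordered markers (made-up example data).
offspring = [
    ["A", "A", "B"],  # crossover between markers 2 and 3
    ["A", "B", "B"],  # crossover between markers 1 and 2
    ["A", "A", "A"],  # non-recombinant
    ["B", "B", "B"],  # non-recombinant
]

if __name__ == "__main__":
    print(recombination_fractions(offspring))  # [0.25, 0.25]
```

Each fraction can then be passed to a mapping function such as Haldane or Kosambi to obtain a distance in centiMorgans.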

Comapr is an R/Bioconductor package that analyses genotyping test reports of SNP markers for groups of samples to detect crossovers, apply mapping functions and construct genetic maps. It includes functions for evaluating marker and sample quality metrics and for filtering on those metrics. It also includes a function that uses bootstrapping to test for differences in total genetic map length between experimental groups, as well as a function for estimating and comparing the strength of crossover interference by fitting a gamma distribution to genetic distances. Finally, Comapr includes functions for generating high-quality static and interactive visualisations that are directly interpretable. Comapr is a lightweight, easy-to-use package that integrates seamlessly with other modern genomic data analysis tools.
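To make the bootstrapping idea concrete, here is a minimal sketch of a percentile-bootstrap comparison of total map length between two groups. It is an illustration under stated assumptions, not Comapr's function: each sample is summarised by its genome-wide crossover count, total map length is approximated as 100 cM per expected crossover per gamete, and the function names, parameters and example counts are all hypothetical.

```python
import random


def map_length_cm(crossover_counts):
    """Approximate total map length: 100 cM per mean crossover per sample."""
    return 100.0 * sum(crossover_counts) / len(crossover_counts)


def bootstrap_diff_ci(group_a, group_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the map-length difference (A minus B).

    Samples are resampled with replacement within each group.
    Returns (observed_difference, ci_lower, ci_upper) in centiMorgans.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    diffs = []
    for _ in range(n_boot):
        resampled_a = rng.choices(group_a, k=len(group_a))
        resampled_b = rng.choices(group_b, k=len(group_b))
        diffs.append(map_length_cm(resampled_a) - map_length_cm(resampled_b))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    observed = map_length_cm(group_a) - map_length_cm(group_b)
    return observed, lower, upper


# Made-up crossover counts per sample for two experimental groups.
group_a = [12, 14, 13, 15, 14, 13, 16, 15]
group_b = [10, 9, 11, 10, 12, 9, 10, 11]

if __name__ == "__main__":
    observed, lower, upper = bootstrap_diff_ci(group_a, group_b)
    print(f"observed difference: {observed:.1f} cM, 95% CI: [{lower:.1f}, {upper:.1f}]")
```

If the interval excludes zero, the data are consistent with a difference in total map length between the groups at the chosen level; an interval that spans zero gives no such evidence.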