Poster Presentation 41st Lorne Genome Conference 2020

Identifying functional pseudogenes in the human transcriptome by long-read cDNA sequencing (#121)

Seth W Cheetham 1, Robin-Lee Troskie 1, Adam D Ewing 1, Geoffrey J Faulkner 1,2
  1. Mater Research-University of Queensland, Woolloongabba, QLD, Australia
  2. Queensland Brain Institute, University of Queensland, St Lucia, QLD, Australia

Pseudogenes are mutant copies of genes that have long been regarded as functionless relics of evolution. Several pseudogene-derived long noncoding RNAs (lncRNAs) have nonetheless been shown to regulate tumorigenesis (PTENP1, BRAFP1), inflammation (Lethe) and diabetes (HMGA1-p) through RNA-intrinsic functions. However, the extent, tissue-specificity and functional impact of the human pseudogene transcriptome are currently unclear. The impact of pseudogenes in human biology has been understudied due to technical limitations that preclude accurate quantification of pseudogene transcripts. Most short RNA-seq reads do not align uniquely to pseudogenes and cannot confidently distinguish highly similar pseudogene and parent gene transcripts. Furthermore, the 5’ and 3’ termini of pseudogene transcripts are poorly defined, hindering functional experiments such as overexpression and targeting of pseudogene transcriptional start sites with CRISPR interference. PacBio IsoSeq cDNA reads harbour enough sequence differences to accurately quantify pseudogene and parent gene transcription. Using deep full-length cDNA sequencing of normal human tissues and cancer cell lines, we determined that at least 32% of pseudogenes are transcribed in tissue-specific patterns. Most identified pseudogene transcripts incorporate novel unannotated exons, promoters and transcriptional termination sites. Many pseudogenes exhibit complex splicing patterns and form exons of protein-coding genes. Hundreds of pseudogene transcripts contain intact open reading frames that have the capacity to encode functional proteins. Overexpression and CRISPR-Cas9-mediated deletion screens targeting highly expressed pseudogenes revealed that transcribed pseudogenes play important functional roles in signalling pathways and cell-cycle control. This study identifies a complex, dynamic human pseudogene transcriptome that may have key impacts on human health and disease.