Poster Presentation 41st Lorne Genome Conference 2020

The cis-regulatory code of response to combined heat and drought stress in Arabidopsis thaliana (#107)

Christina B Azodi [1,2,3], John P Lloyd [4,5], Shin-Han Shiu [2,3,6]
  1. St. Vincent's Institute, Fitzroy, VIC, Australia
  2. Department of Plant Biology, Michigan State University, East Lansing, Michigan, USA
  3. The DOE Great Lakes Bioenergy Research Center, Michigan State University, East Lansing, Michigan, USA
  4. Department of Human Genetics, University of Michigan, Ann Arbor, Michigan, USA
  5. Translational Genomics Research Institute, Phoenix, Arizona, USA
  6. Department of Computational Mathematics, Science, and Engineering, Michigan State University, East Lansing, Michigan, USA

Plants must dynamically adapt to their environment to survive. Transcription factors (TFs) play a major role in regulating transcriptional response to environmental stress. Thus, a powerful approach for studying the regulatory mechanisms of stress response is to integrate information about cis-regulatory elements (CREs), non-coding sequences near genes where TFs bind, into models of the cis-regulatory code. However, regulatory mechanisms can conflict in unexpected ways when a plant is exposed to multiple stressors simultaneously.

Here we modeled the cis-regulatory code of transcriptional response to single and combined heat and drought stress in Arabidopsis thaliana. We grouped differentially expressed genes (n=3,218; microarray data) by their pattern of response (i.e., independent, antagonistic, or synergistic). For example, genes up-regulated under combined stress but under neither single stress were considered synergistic. We trained Random Forest models to classify genes by their response group using the presence/absence of putative CREs (pCREs) as features (median F-measure = 0.64). Finally, we developed a deep learning-based approach that uses convolutional layers to integrate additional types of omic data (e.g. chromatin accessibility, histone modification) into our models of the cis-regulatory code. This approach improved the median performance by 6.2%.
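As an illustration of the presence/absence features described above, the following minimal sketch encodes each gene as a 0/1 vector over a set of pCREs. The function names, the toy promoter sequences, and the choice to scan both strands by exact string matching are assumptions made for this example; they are not taken from the analysis pipeline used in this work.

```python
# Minimal sketch: binary pCRE presence/absence features per gene.
# All names and sequences here are hypothetical, for illustration only.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def pcre_features(promoters, pcres):
    """Map each gene to a 0/1 list: 1 if the pCRE occurs on either strand."""
    features = {}
    for gene, seq in promoters.items():
        seq = seq.upper()
        features[gene] = [
            int(motif in seq or reverse_complement(motif) in seq)
            for motif in pcres
        ]
    return features


# Toy promoter sequences (assumed; not real Arabidopsis promoters).
promoters = {
    "geneA": "TTGACCATGCACGTGTT",
    "geneB": "AAAAGGTCAAAATTTT",
    "geneC": "CCCCCCCCCCCC",
}
pcres = ["TTGACC", "CACGTG"]  # toy motifs

matrix = pcre_features(promoters, pcres)
for gene, row in matrix.items():
    print(gene, row)
```

A matrix of this form, with one row per gene and one column per pCRE, is the kind of input a Random Forest classifier could take when predicting response group.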

We used machine learning interpretation techniques to find the most important pCREs for each response group. For the independent and antagonistic responses, important pCREs tended to resemble binding motifs of TFs associated with heat and/or drought stress (e.g. WRKY46, DDF2, and REVEILLE8). However, the pCREs most important for predicting synergistic responses resembled binding motifs of TFs not known to be associated with heat or drought (e.g. bHLH104) or with stress at all (e.g. γMYB2). These findings demonstrate how in silico approaches can improve our understanding of the complex code regulating transcriptional response to combined stress and help us identify candidate CREs that are prime targets for future characterization.