Identification of Cis-Regulatory Sequences Controlling Pollen-Specific Expression of Hydroxyproline-Rich Glycoprotein Genes in Arabidopsis thaliana
Preprint
- 20 August 2020
- Published by Cold Spring Harbor Laboratory in bioRxiv
Abstract
Hydroxyproline-rich glycoproteins (HRGPs) are a superfamily of plant cell wall structural proteins that function in various aspects of plant growth and development, including pollen tube growth. We previously classified the HRGP superfamily into three families: the hyperglycosylated arabinogalactan-proteins, the moderately glycosylated extensins, and the lightly glycosylated proline-rich proteins. However, the mechanism of pollen-specific HRGP expression remains unexplored. To this end, we developed an integrative analysis pipeline combining RNA-seq gene expression data and promoter sequences, which identified 15 transcriptional cis-regulatory motifs responsible for pollen-specific expression of HRGP genes in Arabidopsis thaliana. Specifically, we mined public RNA-seq datasets and identified 13 pollen-specific HRGP genes. Ensemble motif discovery with various filters identified 15 promoter elements conserved between A. thaliana and A. lyrata. Known-motif analysis revealed the pollen-related transcription factor GATA12 and the brassinosteroid (BR) signaling pathway regulator BZR1. Lastly, we performed a machine learning regression analysis and demonstrated that the 15 identified motifs captured HRGP gene expression in pollen well (R = 0.61). In conclusion, this integrative analysis is the first study of its kind to identify cis-regulatory motifs in pollen-specific HRGP genes, and it sheds light on their transcriptional regulation in pollen.
All Related Versions
- Published version: Plants, 9 (12), 1751.
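The final step of the abstract, relating promoter motifs to pollen expression and reporting a correlation R, can be illustrated with a minimal sketch. The abstract does not specify the regression model, so this uses a single-feature ordinary least-squares fit as a stand-in; the motifs, promoter sequences, and expression values below are invented for illustration and are not data from the study.

```python
import math

# Hypothetical motifs, promoters, and expression values (illustration only).
MOTIFS = ["GATA", "CGTG"]
PROMOTERS = {
    "geneA": "TTGATAACGTGCC",
    "geneB": "AAAATTTTCCCC",
    "geneC": "GATAGATACGTGCACG",
}
EXPRESSION = {"geneA": 4.0, "geneB": 1.0, "geneC": 9.0}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def count_occurrences(seq, motif):
    """Count overlapping exact matches of motif in seq."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def count_motif(seq, motif):
    """Count motif matches on both strands; palindromes are counted once."""
    rc = reverse_complement(motif)
    total = count_occurrences(seq, motif)
    if rc != motif:
        total += count_occurrences(seq, rc)
    return total


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


if __name__ == "__main__":
    genes = sorted(PROMOTERS)
    # Feature: total motif matches per promoter (a deliberate simplification).
    features = [sum(count_motif(PROMOTERS[g], m) for m in MOTIFS) for g in genes]
    observed = [EXPRESSION[g] for g in genes]
    slope, intercept = fit_line(features, observed)
    predicted = [slope * x + intercept for x in features]
    print("R between predicted and observed:", pearson_r(predicted, observed))
```

Collapsing all motifs into one count keeps the sketch dependency-free; a realistic analysis would use one feature per motif, position weight matrices rather than exact string matches, and held-out data when reporting R.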