Predicting Human Nucleosome Occupancy from Primary Sequence

Open Access

22 August 2008

journal article
research article
Published by Public Library of Science (PLoS) in PLoS Computational Biology

Vol. 4 (8), e1000134
https://doi.org/10.1371/journal.pcbi.1000134

Abstract

Nucleosomes are the fundamental repeating unit of chromatin and comprise the structural building blocks of the living eukaryotic genome. Micrococcal nuclease (MNase) has long been used to delineate nucleosomal organization. Microarray-based nucleosome mapping experiments in yeast chromatin have revealed regularly spaced translational phasing of nucleosomes. These data have been used to train computational models of sequence-directed nucleosome positioning, which have identified ubiquitous strong intrinsic nucleosome positioning signals. Here, we successfully apply this approach to nucleosome positioning experiments from human chromatin. The predictions made by the human-trained and yeast-trained models are strongly correlated, suggesting a shared mechanism for sequence-based determination of nucleosome occupancy. In addition, we observed striking complementarity between classifiers trained on experimental data from weakly versus heavily digested MNase samples. In the former case, the resulting model accurately identifies nucleosome-forming sequences; in the latter, the classifier excels at identifying nucleosome-free regions. Using these models, we are able to identify several characteristics of nucleosome-forming and nucleosome-disfavoring sequences. First, combining results from each classifier applied de novo across the human ENCODE regions reveals distinct sequence composition and periodicity features of nucleosome-forming and nucleosome-disfavoring sequences. Short runs of dinucleotide repeats appear as a hallmark of nucleosome-disfavoring sequences, while nucleosome-forming sequences contain short periodic runs of GC base pairs. Second, we show that nucleosome phasing is most frequently predicted flanking nucleosome-free regions. The results suggest that the major mechanism of nucleosome positioning in vivo is boundary-event-driven and affirm the classical statistical positioning theory of nucleosome organization.

Author Summary

Inside the nucleus, DNA is wrapped into a complex molecular structure called chromatin, whose fundamental unit is ∼150 bp of DNA organized around the eight-histone protein complex known as the nucleosome. Understanding the local organization of nucleosomes is critical for understanding how chromatin impacts gene regulation. Here, we describe a computational model that predicts nucleosome placement from DNA sequence. We train the model using data derived from human cell lines, and we apply the model systematically to 1% of the human genome. We show that previously described models trained from yeast data correlate strongly with the human-trained model, suggesting a common mechanism for sequence-based determination of nucleosome occupancy. In addition, we observe a striking complementarity between models trained using data from weakly and strongly digested samples: one type of model recognizes nucleosome-free regions, whereas the other identifies well-positioned nucleosomes. Finally, our analysis of predicted nucleosome positions in the human genome allows us to identify common features of nucleosome-forming and inhibitory sequences. Overall, our results are consistent with the classical statistical positioning theory of nucleosome organization.
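
The two sequence hallmarks named in the abstract (short dinucleotide repeat runs in nucleosome-disfavoring sequences, periodic GC runs in nucleosome-forming sequences) can be illustrated with a small Python sketch. This is not the authors' classifier, whose features and training procedure are not described here. The function names, the definition of a "run", and the 10 bp spacing used in the toy input are all assumptions made for illustration only.

```python
def longest_dinucleotide_run(seq: str) -> int:
    """Length in bp of the longest stretch alternating two different bases.

    Illustrative definition (an assumption, not taken from the paper):
    'TATATA' counts as a 6 bp run; homopolymers such as 'AAAA' do not count.
    """
    seq = seq.upper()
    best = run = 0
    for i in range(2, len(seq)):
        # Position i extends a run if it repeats the base two steps back
        # and differs from its immediate neighbour.
        if seq[i] == seq[i - 2] and seq[i] != seq[i - 1]:
            run += 1
            best = max(best, run)
        else:
            run = 0
    # `run` counts extending positions, so the stretch spans run + 2 bases.
    return best + 2 if best else 0


def gc_lag_counts(seq: str, max_lag: int) -> dict[int, int]:
    """Count pairs of G/C-only dinucleotides separated by each lag (in bp).

    A peak at some lag hints at periodic GC placement; which lag matters
    is left to the reader, since the abstract does not state a period.
    """
    seq = seq.upper()
    hits = [i for i in range(len(seq) - 1)
            if seq[i] in "GC" and seq[i + 1] in "GC"]
    hit_set = set(hits)
    return {lag: sum((i + lag) in hit_set for i in hits)
            for lag in range(1, max_lag + 1)}


if __name__ == "__main__":
    # Toy inputs, hand-made for the example (not real genomic sequence).
    print(longest_dinucleotide_run("GGTATATACC"))   # 6
    periodic = ("GC" + "A" * 8) * 5                  # one GC step every 10 bp
    print(gc_lag_counts(periodic, 12)[10])           # 4
```

Features of this kind would be inputs to a classifier rather than predictions in themselves; the sketch only makes concrete what "dinucleotide repeat run" and "periodic GC" could mean at the sequence level.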

