Chromosomal periodicity of evolutionarily conserved gene pairs

Abstract
Chromosomes are compacted hundreds of times to fit in the cell, packaged into dynamic folds whose structures are largely unknown. Here, we examine patterns in gene locations to infer large-scale features of bacterial chromosomes. Specifically, we analyzed more than 100 genomes and identified thousands of gene pairs that display two types of evolutionary correlation: a tendency to co-occur and a tendency to be located close together in many genomes. We then analyzed the detailed distribution of these pairs in Escherichia coli and found that the genes in a pair tend to be separated by integral multiples of 117 kb along the genome and to be positioned on a 117-kb grid of genomic locations. In addition, the most pair-dense locations coincide with regions of intense transcriptional activity and with the positions of the most highly transcribed and most conserved genes. These patterns suggest that the E. coli chromosome may be organized into a helix-like topology with a 117-kb period that localizes a subset of the most essential and highly transcribed genes along a specific face of this structure. Our approach indicates an evolutionarily maintained preference in the spacing of genes along the chromosome and offers a general comparative genomics framework for studying chromosome structure that is broadly applicable to other organisms.
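To make the notion of "separated by integral multiples of 117 kb" concrete, the following is a minimal sketch of one way such a periodicity could be detected. It is not the method used in this work: the Rayleigh-style phase-coherence score is a swapped-in illustrative technique, the pair data are synthetic, the genome length is a rounded approximation, and all function names (`circular_separation`, `periodicity_score`, `best_period`) are hypothetical.

```python
import cmath
import math
import random

# Assumption: rounded, approximate E. coli genome length in bp (illustrative only).
GENOME_LENGTH = 4_600_000


def circular_separation(a, b, genome_length):
    """Shortest distance between two positions on a circular chromosome."""
    d = abs(a - b) % genome_length
    return min(d, genome_length - d)


def periodicity_score(separations, period):
    """Phase coherence (0..1) of separations taken modulo `period`.

    Each separation is mapped to a point on the unit circle; the score is the
    length of the mean vector. Values near 1 mean the separations cluster at
    integral multiples of `period`.
    """
    total = sum(cmath.exp(2j * math.pi * d / period) for d in separations)
    return abs(total) / len(separations)


def best_period(separations, candidates):
    """Return the candidate period with the highest coherence score."""
    return max(candidates, key=lambda p: periodicity_score(separations, p))


# Synthetic gene pairs (assumption: not real data). Each pair is separated by
# k * 117 kb plus Gaussian positional noise with a 3-kb standard deviation.
rng = random.Random(0)
TRUE_PERIOD = 117_000
pairs = []
for _ in range(400):
    a = rng.uniform(0, GENOME_LENGTH)
    k = rng.randint(1, 10)
    b = (a + k * TRUE_PERIOD + rng.gauss(0, 3_000)) % GENOME_LENGTH
    pairs.append((a, b))

separations = [circular_separation(a, b, GENOME_LENGTH) for a, b in pairs]

# Scan candidate periods from 80 kb to 160 kb in 1-kb steps.
candidates = range(80_000, 160_001, 1_000)
best = best_period(separations, candidates)
print(f"best period: {best} bp, score: {periodicity_score(separations, best):.3f}")
```

The scan range is deliberately restricted to 80-160 kb so that harmonics of the period (for example, one half or twice its length) do not compete with the fundamental. With real data, the score at each candidate period would also need to be compared against a null distribution, such as one obtained from randomly shuffled gene positions.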