Abstract
Recent studies of genome-wide transcriptional regulatory networks (TRNs) have revealed several intriguing structural and dynamic features of gene expression at the system level. Unfortunately, the network under study is often far from complete. A critical question is thus how incomplete the network is and to what extent this incompleteness affects the results of analysis. Here we compare the Escherichia coli TRN built by Shen-Orr et al. (Nature Genet., 31, 64-68) with two TRNs reconstructed from RegulonDB and EcoCyc, respectively, and present an extended E. coli TRN that integrates information from these databases and the literature. The extended TRN is about twice as large as the previous ones. The new network preserves the multi-layer hierarchical structure that we recently reported but has more layers, and more global regulators are inferred. While the feed-forward loop (FFL) is confirmed to be highly represented in the network, the distribution of the different FFL types differs from that obtained from the incomplete network. In contrast to the notion of motif aggregation and the formation of homologous motif clusters, we found that most FFLs interact and form a giant motif cluster. Furthermore, we show that only a small portion of the genes are regulated by a single FFL alone. Many genes are regulated by two or more interacting FFLs, or by other more complicated network motifs, together with transcription factors that do not belong to any network motif, thereby forming complex regulatory circuits. Overall, the extended TRN provides a more solid basis for structural and functional analysis of genome-wide gene regulation in E. coli.
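
To make the FFL terminology concrete, the following minimal Python sketch enumerates FFLs in a small directed graph and groups them into clusters. It is an illustration only, not the procedure used in this study. It assumes that an FFL is a triple (x, y, z) with regulatory edges x→y, y→z and x→z, and that two FFLs "interact" when they share at least one gene; the second assumption is ours and may be looser or stricter than the criterion applied in the analysis. The gene names A to R and the function names `find_ffls` and `ffl_clusters` are hypothetical.

```python
from collections import Counter, defaultdict


def find_ffls(edges):
    """Return all triples (x, y, z) with edges x->y, y->z and x->z."""
    targets = defaultdict(set)
    for regulator, target in edges:
        if regulator != target:  # autoregulation is ignored in this sketch
            targets[regulator].add(target)
    ffls = []
    for x in sorted(targets):
        for y in sorted(targets[x]):
            # z must be a direct target of both x and y
            for z in sorted(targets[x] & targets.get(y, set())):
                ffls.append((x, y, z))
    return ffls


def ffl_clusters(ffls):
    """Group FFLs that share at least one gene (assumed meaning of 'interact')."""
    parent = {}

    def find(gene):
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]  # path halving
            gene = parent[gene]
        return gene

    # Union the three genes of every FFL into one set.
    for x, y, z in ffls:
        parent[find(y)] = find(x)
        parent[find(z)] = find(x)

    clusters = defaultdict(list)
    for ffl in ffls:
        clusters[find(ffl[0])].append(ffl)
    return list(clusters.values())


# Toy network with hypothetical genes: (regulator, target) pairs.
toy_edges = [
    ("A", "B"), ("B", "C"), ("A", "C"),  # FFL A -> B -> C
    ("A", "D"), ("B", "D"), ("E", "D"),  # FFL A -> B -> D, plus E outside any FFL
    ("P", "Q"), ("Q", "R"), ("P", "R"),  # a separate FFL P -> Q -> R
]

toy_ffls = find_ffls(toy_edges)
toy_clusters = ffl_clusters(toy_ffls)
# Number of FFLs in which each gene is the final target z.
ffls_per_target = Counter(z for _, _, z in toy_ffls)
```

In this toy network, gene D is the target of one FFL and is additionally regulated by E, which belongs to no FFL. If x and y regulate each other, the sketch reports both (x, y, z) and (y, x, z) as separate FFLs.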