An encyclopedia of mouse genes

Abstract
The laboratory mouse is the premier model system for studies of mammalian development due to the powerful classical genetic analysis1 possible (see also the Jackson Laboratory web site, http://www.jax.org/) and the ever-expanding collection of molecular tools2,3. To enhance the utility of the mouse system, we initiated a program to generate a large database of expressed sequence tags (ESTs) that can provide rapid access to genes4,5,6,7,8,9,10,11,12,13,14,15,16. Of particular significance was the possibility that cDNA libraries could be prepared from very early stages of development, a situation unrealized in human EST projects7,12. We report here the development of a comprehensive database of ESTs for the mouse. The project, initiated in March 1996, has focused on 5′ end sequences from directionally cloned, oligo-dT primed cDNA libraries. As of 23 October 1998, 352,040 sequences had been generated, annotated and deposited in dbEST, where they comprised 93% of the total ESTs available for mouse. EST data are versatile and have been applied to gene identification17, comparative sequence analysis18,19, comparative gene mapping and candidate disease gene identification20, genome sequence annotation21,22, microarray development23 and the development of gene-based map resources24.