The Complete DNA Sequence of the Long Unique Region in the Genome of Herpes Simplex Virus Type 1

Abstract
We have determined the DNA sequence of the long unique region (UL) in the genome of herpes simplex virus type 1 (HSV-1) strain 17. The UL sequence contained 107943 residues and had a base composition of 66.9% G+C. Together with our previous work, this completes the sequence of HSV-1 DNA, giving a total genome length of 152260 residues of base composition 68.3% G+C. Genes in the UL region were located by the use of published mapping analyses, transcript structures and sequence data, and by examination of DNA sequence characteristics. Fifty-six genes were identified, accounting for most of the sequence. Some 28 of these are at present of unknown function. The gene layout for UL was found to be very similar to that for the corresponding part of the genome of varicella-zoster virus, the only other completely sequenced alphaherpesvirus, and the amino acid sequences of equivalent proteins showed a range of similarities. In the whole genome of HSV-1 we now recognize 72 genes which encode 70 distinct proteins.