Full-length sequence and mosaic structure of a human immunodeficiency virus type 1 isolate from Thailand

Abstract
Human immunodeficiency virus type 1 isolates of envelope genotype E are contributing substantially to the global pandemic. These strains appear to be mosaics, with the gag gene from clade A and the envelope from clade E; the parental clade E strain has not been found. Here we report the first full genomic sequence of one such mosaic virus, isolate CM240 from Thailand. Multiple breakpoints between the two parental genotypes were found in the CM240 genome. The entire gag-pol region and most, if not all, of the accessory genes vif, vpr, tat, rev, and vpu appear to derive from clade A. The genotype switches to clade E shortly after the signal peptide of the envelope and back to clade A near the middle of gp41; thus, the portion of the envelope that lies on the cytoplasmic side of the membrane appears to be principally derived not from clade E, as previously thought, but from clade A. Another small segment, not belonging to any recognized clade and presumably also contributed by the parental clade E strain, was found in the long terminal repeat. It may be significant that the implied virion structure resembles a pseudotype virus with the matrix and core from one clade and the outer envelope from another. In the long terminal repeat, differences were observed between CM240 and other clades in the number of NF-kappa B binding sites, the sequence of the TATA box, and the putative secondary structure of the transactivation response region stem-loop. The mosaic structure of the CM240 virion suggests phenotypic differences that might have contributed to the emergence of this variant.