Sotos syndrome common deletion is mediated by directly oriented subunits within inverted Sos-REP low-copy repeats

Abstract
Sotos syndrome (Sos) is an overgrowth disorder also characterized clinically by mental retardation, specific craniofacial features and advanced bone age. Since NSD1 haploinsufficiency was identified in 2002 as the major cause of Sos, many intragenic mutations and chromosomal microdeletions involving the entire NSD1 gene have been described. In the Japanese population, half of the cases analyzed appear to have a common microdeletion; in the European population, however, deletion cases account for only 9%. BLAST analysis of the Sos genomic region on 5q35 revealed two complex mosaic low-copy repeats (LCRs) that are centromeric and telomeric to NSD1. We termed these proximal Sos-REP (Sos-PREP, ∼390 kb) and distal Sos-REP (Sos-DREP, ∼429 kb), respectively. On the basis of DNA sequence analysis, we determined the size, structure, orientation and extent of sequence identity of these LCRs. We found that Sos-PREP and Sos-DREP are composed of six subunits termed A–F. Each of the homologous subunits, with the exception of one, is located in an inverted orientation, and the order of subunits differs between the two Sos-REPs. Only the subunit C′ in Sos-DREP is oriented directly with respect to the subunit C in Sos-PREP. The C and C′ subunits are greater than 99% identical. Using pulsed-field gel electrophoresis analysis in eight Sos patients with a common deletion, we detected an ∼550 kb junction fragment, as predicted by the non-allelic homologous recombination (NAHR) mechanism with the directly oriented Sos-PREP C and Sos-DREP C′ subunits as substrates. This patient-specific junction fragment was not present in 51 Japanese and non-Japanese controls. Subsequently, using long-range PCR with restriction enzyme digestion and DNA sequencing, we identified a 2.5 kb unequal crossover hotspot region in six out of nine analyzed Sos patients with the common deletion. Our data are consistent with an NAHR mechanism for generation of the Sos common deletion.
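
The orientation argument can be illustrated with a minimal Python sketch. It is not the analysis pipeline used in this study. It assumes the general NAHR model in which recombination between directly oriented repeats on the same chromosome deletes (or duplicates) the intervening segment, whereas recombination between inverted repeats inverts it. The names `SubunitPair`, `predicted_nahr_outcome` and `deletion_substrates` are hypothetical, and only the orientations stated above are encoded; subunit order and coordinates are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubunitPair:
    """A homologous subunit present in both Sos-PREP and Sos-DREP (hypothetical model)."""
    name: str
    direct: bool  # True if the Sos-DREP copy is directly oriented relative to the Sos-PREP copy


def predicted_nahr_outcome(pair: SubunitPair) -> str:
    """Return the rearrangement expected under the general NAHR model (an assumption)."""
    # Direct repeats: crossover removes or duplicates the segment between them.
    # Inverted repeats: crossover flips the segment between them.
    return "deletion/duplication" if pair.direct else "inversion"


def deletion_substrates(pairs: list[SubunitPair]) -> list[str]:
    """List the subunits that could act as substrates for an NAHR-mediated deletion."""
    return [p.name for p in pairs if predicted_nahr_outcome(p) == "deletion/duplication"]


# Orientations as stated in the abstract: subunits A-F, with only C/C' directly oriented.
PAIRS = [SubunitPair(name, direct=(name == "C")) for name in "ABCDEF"]

if __name__ == "__main__":
    for pair in PAIRS:
        print(pair.name, predicted_nahr_outcome(pair))
    print("Candidate deletion substrates:", deletion_substrates(PAIRS))
```

Under these assumptions, only the C/C′ pair qualifies as a deletion substrate, which matches the prediction tested by the junction-fragment analysis.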