Identification of alternatively spliced mRNA variants related to cancers by genome-wide ESTs alignment

Abstract
Several databases predict alternative splicing of mRNAs by aligning expressed sequence tags (ESTs) to the genome sequence and analysing the exon linkage relationships; however, little effort has been made to investigate the relationship between cancers and alternative splicing. We developed a program, Alternative Splicing Assembler (ASA), to identify splicing variants of human gene transcripts by genome-wide EST alignment. Using ASA, we constructed the biosino alternative splicing database (BASD), which predicts splicing variants for reference sequences from the reference sequence database (RefSeq) and presents them in both graph and text formats. EST clusters that differ from the reference sequence in at least one splicing site were counted as splicing variants. Of 4322 genes screened, 3490 (81%) had at least one alternative splicing variant. To discover variants associated with cancers, the tissue sources of the EST sequences were extracted from the UniLib database, and the ESTs from each tissue type were counted; these counts were taken as indicators of gene expression level. Using Fisher's exact test, we identified alternative splicing variants whose EST counts differed significantly between cancer tissues and their normal counterparts. Of 26 812 variants, 2149 (383 after Bonferroni correction) were predicted to be likely tumor-associated. By reverse transcription–PCR, 11 of 13 novel alternative splicing variants were confirmed, and the tissue specificity of eight of nine variants was confirmed, in hepatocellular carcinoma and in lung cancer. The possible involvement of alternative splicing in cancer is discussed.
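
The statistical step described above, comparing EST counts between a cancer tissue and its normal counterpart, can be illustrated with a minimal sketch. This is not the ASA or BASD implementation. The layout of the 2x2 table (ESTs supporting a variant versus all other ESTs from the same tissue), the function names, and the significance level of 0.05 are assumptions made for illustration only; the abstract does not state them.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Assumed layout (hypothetical, not taken from the paper):
        a = ESTs supporting the variant in cancer tissue
        b = all other ESTs from the cancer tissue
        c = ESTs supporting the variant in normal tissue
        d = all other ESTs from the normal tissue
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def passes_bonferroni(p_value, n_tests, alpha=0.05):
    """Return True if p_value is significant after Bonferroni correction.

    alpha = 0.05 is an assumed significance level.
    """
    return p_value < alpha / n_tests
```

Under these assumptions, a variant would be flagged as likely tumor-associated when `fisher_exact_two_sided(...)` falls below the chosen level, and would survive correction when `passes_bonferroni(p, 26812)` is true, that is, when the p-value is below roughly 1.9e-6 for the 26 812 variants tested.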