Mining the National Cancer Institute Anticancer Drug Discovery Database: Cluster Analysis of Ellipticine Analogs with p53-Inverse and Central Nervous System-Selective Patterns of Activity

Abstract
The United States National Cancer Institute conducts an anticancer drug discovery program in which ∼10,000 compounds are screened every year in vitro against a panel of 60 human cancer cell lines from different organs. To date, ∼62,000 compounds have been tested in the program, and a large amount of information on their activity patterns has been accumulated. For the current study, anticancer activity patterns of 112 ellipticine analogs were analyzed with the use of a hierarchical clustering algorithm. A dramatic coherence between molecular structures and their activity patterns could be seen from the cluster tree: the first subgroup (compounds 1–66) consisted principally of normal ellipticines, whereas the second subgroup (compounds 67–112) consisted principally of N2-alkyl-substituted ellipticiniums. Almost all apparent discrepancies in this clustering were explainable on the basis of chemical transformation to active forms under cell culture conditions. Correlations of activity with p53 status and selective activity against cells of central nervous system origin made this data set of special interest to us. The ellipticiniums, but not the ellipticines, were more potent on average against p53 mutant cells than against p53 wild-type ones (i.e., they seemed to be “p53-inverse”) in this short-term assay. This study strongly supports the hypothesis that “fingerprint” patterns of activity in the National Cancer Institute in vitro cell screening program encode incisive information on the mechanisms of action and other biological behaviors of tested compounds. Insights gained by mining the activity patterns could contribute to our understanding of anticancer drugs and the molecular pharmacology of cancer.
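As a rough illustration of the two analyses summarized above, the following sketch clusters compounds by the similarity of their activity patterns and computes a mean potency gap between p53 mutant and p53 wild-type cell lines. It is not the procedure used in the study: the abstract does not state the distance metric or linkage rule, so correlation distance and average linkage are assumptions here. The compound labels, the six-line panel, the potency values (taken to be scores in which higher means more potent), and the function names are all hypothetical and synthetic.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length activity patterns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def average_linkage_clusters(patterns, n_clusters):
    """Agglomerative clustering (assumed: correlation distance, average linkage).

    patterns maps a compound label to its list of potency scores across
    the cell-line panel. The closest pair of clusters is merged repeatedly
    until n_clusters remain.
    """
    clusters = [[name] for name in patterns]

    def dist(c1, c2):
        # Average of (1 - r) over all cross-cluster compound pairs.
        pairs = [(a, b) for a in c1 for b in c2]
        total = sum(1.0 - pearson(patterns[a], patterns[b]) for a, b in pairs)
        return total / len(pairs)

    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


def mean_potency_gap(pattern, p53_mutant):
    """Mean potency in p53 mutant lines minus mean potency in wild-type lines.

    A positive value corresponds to the "p53-inverse" behavior described in
    the text (more potent on average against p53 mutant cells).
    """
    mut = [v for v, m in zip(pattern, p53_mutant) if m]
    wt = [v for v, m in zip(pattern, p53_mutant) if not m]
    return sum(mut) / len(mut) - sum(wt) / len(wt)


# Synthetic data: four hypothetical compounds on a six-line panel.
p53_mutant = [True, True, True, False, False, False]
patterns = {
    "A1": [5.0, 5.2, 4.9, 5.8, 6.0, 5.9],
    "A2": [5.1, 5.3, 5.0, 5.7, 6.1, 5.8],
    "B1": [6.2, 6.0, 6.3, 5.1, 5.0, 5.2],
    "B2": [6.1, 6.1, 6.4, 5.0, 5.1, 5.3],
}

groups = average_linkage_clusters(patterns, n_clusters=2)
gaps = {name: mean_potency_gap(p, p53_mutant) for name, p in patterns.items()}
print(groups)
print(gaps)
```

On this toy panel the two hypothetical series separate into their own clusters, and only the B series shows a positive mutant-minus-wild-type gap. With the real 60-line data, a library implementation of hierarchical clustering would be a more practical choice than this hand-written loop.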