Combination of Fingerprint-Based Similarity Coefficients Using Data Fusion

Abstract
Many different types of similarity coefficients have been described in the literature. Since different coefficients take into account different characteristics when assessing the degree of similarity between molecules, it is reasonable to combine them to further optimize the measures of similarity between molecules. This paper describes experiments in which data fusion is used to combine several binary similarity coefficients to get an overall estimate of similarity for searching databases of bioactive molecules. The results show that search performances can be improved by combining coefficients with little extra computational cost. However, there is no single combination which gives a consistently high performance for all search types.
Keywords