Comparison of Similarity Method to Improve Retrieval Performance for Chemical Data

Suhaila Zainudin, Nevy Rahmi Nurjana


Drug discovery is the process through which new drugs are identified. One of its most common techniques is similarity searching, a virtual screening approach that compares molecular structures in a chemical database using established similarity methods. The objectives of this study are to measure the structural similarity within a chemical dataset using the Mean Pairwise Similarity (MPS) calculation and to determine the best coefficient for similarity searching. The study uses the ECFP2 fingerprint as the molecular descriptor and three similarity coefficients: Tanimoto, Soergel and Euclidean. The results indicate that the Tanimoto and Soergel coefficients perform better than the Euclidean coefficient. In future work, other combinations of fingerprints, such as Daylight, BCI, Unity MDL, and similarity coefficients can be studied further.


mean pairwise similarity; virtual screening; similarity searching; retrieval; chemoinformatics
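
As an illustration of the quantities named in the abstract, the following is a minimal sketch rather than the authors' implementation. It assumes each binary fingerprint is represented as a set of on-bit indices; the function names and the toy fingerprints are hypothetical, and generating real ECFP2 fingerprints is outside its scope. For binary fingerprints the Soergel distance is the complement of the Tanimoto similarity, so the two coefficients rank molecules identically.

```python
from itertools import combinations
from math import sqrt


def tanimoto(x, y):
    """Tanimoto similarity c / (a + b - c) for two sets of on-bit indices."""
    common = len(x & y)
    union = len(x) + len(y) - common
    # Two empty fingerprints are treated as identical (an assumption of this sketch).
    return common / union if union else 1.0


def soergel(x, y):
    """Soergel distance; for binary fingerprints it equals 1 - Tanimoto."""
    return 1.0 - tanimoto(x, y)


def euclidean(x, y):
    """Euclidean distance between binary fingerprints: sqrt of differing bits."""
    return sqrt(len(x ^ y))


def mean_pairwise_similarity(fingerprints, similarity=tanimoto):
    """Average the similarity over every distinct pair of fingerprints."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        raise ValueError("at least two fingerprints are required")
    return sum(similarity(a, b) for a, b in pairs) / len(pairs)


# Toy fingerprints (hypothetical on-bit indices, not real ECFP2 output).
fingerprints = [
    frozenset({1, 2, 3, 4}),
    frozenset({2, 3, 4, 5}),
    frozenset({1, 2, 7, 8}),
]

mps = mean_pairwise_similarity(fingerprints)
print(f"MPS (Tanimoto): {mps:.4f}")
```

A low MPS suggests a structurally diverse dataset and a high MPS a homogeneous one. Note that `soergel` and `euclidean` are distances, so they should be converted to similarities before being averaged alongside Tanimoto values.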


e-ISSN : 2289-2192
