The evaluation description seems vague. Could you please provide the evaluation code for computing the chemical clusters?

Created by Yifan Jiang (@yfjiang)
Hello @yfjiang, this is Luca, one of the authors of the notebook.

1. We have released the basic code that will be used for evaluation on GitHub. The code only describes the method used to calculate the metrics; to obtain the results you also need the reference file containing the true labels of the molecules in the test set and their corresponding clusters. Clusters were calculated based only on the active molecules, not on the entire test set.
2. The example notebook available in the repository should not be considered an exact guideline for submission, but rather a collection of useful code snippets that could be used for the challenge. The code was released to give a baseline for people who might not be familiar with one or more of the described steps (parquet file reading, model training, cross-validation, clustering based on chemical similarity). In the notebook we suggest running clustering on the top N predicted molecules, since this is the typical approach used in virtual screening campaigns. By clustering the molecules and selecting the best-ranking one from each cluster, it is possible to account for chemical diversity in the selection, but this is not guaranteed to improve the final results. For the challenge one might want to use a different approach for clustering (different algorithm, threshold, top N clustered molecules, or similarity/distance measure), or skip clustering and just select the top N ranking molecules. In the end it is up to you to test and identify the best approach to select a group of diverse and active molecules.
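(For readers unfamiliar with this workflow, below is a minimal sketch of the "predict, cluster the top N, keep the best per cluster" selection described above. It is not the authors' code: the DataFrame column names `smiles` and `score`, the fingerprint settings, and the N values are all assumptions; the repository notebook remains the reference.)

```python
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem, DataStructs
from sklearn.cluster import AgglomerativeClustering

def select_diverse_hits(df, n_top=1000, n_pick=200, cutoff=0.32):
    """Rank by predicted score, cluster the top N by Tanimoto distance,
    then keep the best-scoring molecule from each cluster.
    Assumes df has 'smiles' and 'score' columns with valid SMILES."""
    top = df.nlargest(n_top, "score").reset_index(drop=True)

    # Binary ECFP4 (Morgan, radius 2) fingerprints.
    mols = [Chem.MolFromSmiles(s) for s in top["smiles"]]
    fps = [AllChem.GetMorganFingerprintAsBitVect(m, 2, nBits=2048) for m in mols]

    # Pairwise Tanimoto distance matrix (1 - similarity).
    n = len(fps)
    dist = np.zeros((n, n))
    for i in range(n):
        sims = DataStructs.BulkTanimotoSimilarity(fps[i], fps)
        dist[i] = 1.0 - np.asarray(sims)

    # Complete-linkage agglomerative clustering at a distance cutoff
    # (scikit-learn >= 1.2; older versions use affinity= instead of metric=).
    labels = AgglomerativeClustering(
        n_clusters=None, metric="precomputed",
        linkage="complete", distance_threshold=cutoff,
    ).fit_predict(dist)

    top["cluster"] = labels
    # Best-ranked molecule per cluster, then the n_pick highest scores overall.
    picks = top.sort_values("score", ascending=False).drop_duplicates("cluster")
    return picks.head(n_pick)
```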
@mschachter Hi, I have several questions:

1. The link you provided above is just example code for calculating the clusters, not the entire evaluation code. Do you plan to release the complete evaluation code? I believe it belongs at https://github.com/StructuralGenomicsConsortium/Target2035_Aircheck_Utils/tree/main/EvaluationCode ?
2. What is the order of prediction and clustering? Your example code does prediction -> clustering -> filtering, which makes no sense to me. Isn't clustering -> prediction -> filtering more reasonable?

I think an open-source, transparent evaluation process is very important, and I would appreciate it if you could provide the evaluation code at your earliest convenience.
Correct.
So is the answer yes, it's the number of clusters that takes precedence, not the number of actives recovered?
Yes, this is in principle a valid point. In practice, given the distribution and composition of the cluster population, a problematic case like the one you presented is highly unlikely to happen. This is why we prefer to evaluate predictions by the number of chemically diverse clusters recovered, which, from a medicinal chemistry standpoint, is a good metric.
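(Illustrative only, not the official evaluation code: a minimal sketch of how a "diverse clusters recovered" score could be computed, assuming a reference mapping from each active molecule's ID to its cluster label. The function and variable names are hypothetical.)

```python
def diverse_cluster_score(selected_ids, active_to_cluster):
    """Count the distinct chemical clusters among the true actives recovered.
    selected_ids: molecule IDs chosen by the participant.
    active_to_cluster: dict mapping each active molecule's ID to its cluster
    label (clusters were computed on the actives only)."""
    recovered = {active_to_cluster[i] for i in selected_ids if i in active_to_cluster}
    return len(recovered)

# Toy example: 15 actives from 15 clusters outscore 50 actives from 14 clusters.
ref = {f"mol{i}": i % 15 for i in range(60)}          # 60 actives in 15 clusters
print(diverse_cluster_score([f"mol{i}" for i in range(15)], ref))  # -> 15
```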
So that describes the clustering, but not the evaluation of the submissions. Is it correct that finding 15 actives from 15 different clusters is evaluated higher than finding 50 actives from 14 different clusters?
Chemical diversity was measured using Tanimoto similarity between ECFP4 binary molecular fingerprints extracted from the provided ECFP4 count fingerprints. Clustering was performed using the agglomerative clustering algorithm with complete linkage. The clustering cut-off distance of 0.32 was selected based on an analysis of the average silhouette score and the distribution of silhouette scores, as well as visual inspection of the compounds forming each cluster. The example notebook in the provided repository shows how to run clustering using chemical fingerprints: https://github.com/StructuralGenomicsConsortium/Target2035_Aircheck_Utils/tree/main/Notebooks
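(A minimal sketch of the setup described above, assuming the count fingerprints for the active molecules are available as a NumPy array; the input file name and variable names are hypothetical, and the linked notebook remains the authoritative example.)

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import silhouette_score

def tanimoto_distance_matrix(bits):
    """Pairwise Tanimoto distance (1 - similarity) for binary fingerprints."""
    inter = bits @ bits.T                        # |A intersect B|
    pop = bits.sum(axis=1)                       # per-molecule bit counts
    union = pop[:, None] + pop[None, :] - inter  # |A union B|
    return 1.0 - inter / np.maximum(union, 1)    # guard against empty fps

counts = np.load("active_ecfp4_counts.npy")      # hypothetical input file
bits = (counts > 0).astype(float)                # binarize count fingerprints
dist = tanimoto_distance_matrix(bits)

# Complete-linkage agglomerative clustering at the 0.32 distance cutoff.
labels = AgglomerativeClustering(
    n_clusters=None, metric="precomputed",
    linkage="complete", distance_threshold=0.32,
).fit_predict(dist)

# Silhouette score for the chosen cutoff (the kind of analysis used to tune it);
# requires at least two clusters.
print(silhouette_score(dist, labels, metric="precomputed"))
```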
