Hi @trberg, Thanks for the reply! I wonder if it's possible to release just the evaluation script (or function) to us, since we would like to make sure that we optimize the model consistently with the official evaluation. It could be a function that takes two arguments (i.e., **the prediction likelihood dataframe and the ground-truth outcome labels**) and produces a series of evaluation scores. This is **really important and helpful** for us, as some of our methods may use reward-based optimization (e.g., minimum Bayes risk). ``` def evaluation(outcome_likelihood, outcome_truth): f_beta = ..... print('F-beta', f_beta) # and more metrics ``` Thanks, Junjie

Hi @junjiehu, We're not putting any weight on F-beta for this challenge. However, for any threshold-dependent metric we do look at, I'll be calculating your score along a range of thresholds to find the maximum score possible, and will use that in our evaluation. Hope this helps! Tim
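For anyone wanting to mirror this locally, the threshold sweep Tim describes can be sketched as follows. This is a minimal sketch, not the official evaluation code: the function names (`fbeta`, `max_score_over_thresholds`), the beta value, and the threshold grid are all illustrative assumptions.

```
import numpy as np

def fbeta(y_true, y_pred, beta=1.0):
    # Plain F-beta from binary predictions; beta=1.0 is an assumption.
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

def max_score_over_thresholds(y_true, likelihoods, beta=1.0, n_thresholds=101):
    # Binarize the predicted likelihoods at each threshold and keep the
    # best score, as described in the reply above.
    thresholds = np.linspace(0.0, 1.0, n_thresholds)
    scores = [fbeta(y_true, (likelihoods >= t).astype(int), beta)
              for t in thresholds]
    best = int(np.argmax(scores))
    return scores[best], thresholds[best]
```

A usage example: with `y_true = np.array([1, 1, 0, 0])` and `likelihoods = np.array([0.9, 0.8, 0.2, 0.1])`, the sweep finds a threshold that separates the classes perfectly and reports the score there.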
