Hi, we are implementing the scoring function to test our different approaches against the data_true.txt of subchallenge 1: $$\(NRMSE = \frac{\sqrt{\frac{1}{n_{missing}}\sum_{i=1}^{n_{missing}} (y_i - x_i)^2}}{y_{max}-y_{min}}\)$$ Since almost every protein in data_true.txt has at least one zero among its samples, and the function is applied to each protein separately, $$\(y_{min}\)$$ will almost always be 0, which is equivalent to dividing by $$\(y_{max}\)$$ alone. In your scoring code, do you exclude those zeros? Thank you

Created by Mickael Leclercq mickael
You need to average the squared prediction error over the missing spots only, including both the non-zero and the zero true values.
Hello all, from the above I understand how the y_max and y_min values are defined for each protein. Suppose I want to calculate the NRMSE of protein_1 for data_obs_1.txt, where y_i (i = 1, ..., 80) are the true values from data_true.txt and x_i (i = 1, ..., 80) are the computed values for data_obs_1.txt. Do I need to include the imputed values whose corresponding entries in data_true.txt are zero when calculating the NRMSE, or only the imputed values whose corresponding true values are nonzero?
Yes, you are right.
For one protein, the y~max~ and y~min~ should be based on true values (positive) from **all samples**, instead of just true values (positive) from **samples with NAs**. In other words, the y in the denominator represents the full set of values, and y in the numerator is a subset. Is my understanding correct?
You can find the definitions of all these terms in the scoring matrix document. I have copied the relevant sentence here in case you missed it: 'ymax and ymin are maximum and minimum of the same protein among all samples which the true intensities have **positive** value.'
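For anyone who wants to check their own implementation, here is a minimal sketch of how the replies above could be read for a single protein. It is not the organizers' scoring code. The function name `nrmse_one_protein` and the boolean `missing_mask` argument are hypothetical, and it assumes you already know which samples were masked for that protein.

```python
import numpy as np

def nrmse_one_protein(y_true, x_imputed, missing_mask):
    """NRMSE for one protein (hypothetical helper, not the official scorer).

    y_true       : true intensities for all samples (from data_true.txt)
    x_imputed    : submitted values for the same samples
    missing_mask : True where the value was missing in the observed data
                   (assumed to be known by the caller)
    """
    y_true = np.asarray(y_true, dtype=float)
    x_imputed = np.asarray(x_imputed, dtype=float)
    missing_mask = np.asarray(missing_mask, dtype=bool)

    if not missing_mask.any():
        raise ValueError("no missing spots for this protein")

    # Numerator: RMSE over the missing spots only.
    # True zeros at missing spots are kept, as stated in the thread.
    err = y_true[missing_mask] - x_imputed[missing_mask]
    rmse = np.sqrt(np.mean(err ** 2))

    # Denominator: range of the positive true values over ALL samples,
    # not only the samples that had missing values.
    positive = y_true[y_true > 0]
    if positive.size == 0 or positive.max() == positive.min():
        raise ValueError("range of positive true values is undefined or zero")

    return rmse / (positive.max() - positive.min())


# Toy example: samples 0 and 3 were missing; sample 0 has a true zero.
score = nrmse_one_protein(
    y_true=[0.0, 2.0, 4.0, 6.0],
    x_imputed=[1.0, 2.0, 4.0, 4.0],
    missing_mask=[True, False, False, True],
)
```

In the toy example the true zero contributes to the numerator, but the denominator is 6 - 2 = 4 rather than 6 - 0 = 6, because only positive true values enter y~max~ and y~min~.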

Scoring function with zeros values as normalization