Dear @scRNAseqandscATACseqDataAnalysisDREAMChallengeParticipants,
The scoring scripts for this challenge can be found [here](https://github.com/Sage-Bionetworks-Challenges/Multi-seq-Data-Analysis-DREAM-Challenge-Infra/tree/main/Docker), and the link has also been added to the ["Tasks"](https://www.synapse.org/#!Synapse:syn26720920/wiki/620122) wiki page.
As a reminder: Scores will be returned for each valid submission during the Leaderboard Round (ends 2/1 at 23:59 PST). We encourage you to use this opportunity to evaluate your model’s performance in preparation for the Final Round (begins 2/2).
If you have any questions, please do not hesitate to use the Discussion forum to ask them.
Best of luck,
@scRNAseqandscATACseqDataAnalysisDREAMChallengeOrganizers
Hi @chentsaimin,
Thanks for the question.
The data for Task 1 is downsampled by two different methods (by reads vs. by cells), which are described on the [Data > Task 1: scRNA-seq > Data preparation for challenge testing and validation](https://www.synapse.org/#!Synapse:syn26720920/wiki/620137) wiki page.
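In rough terms, the two schemes can be sketched like this (a toy illustration only; the exact procedure and parameters are those described on the wiki page):
```
# Toy gene x cell count matrix; the real data and parameters differ
counts <- matrix(rpois(30, lambda = 10), nrow = 5)  # 5 genes x 6 cells

# Downsampling by reads: thin every count, keeping all cells
reads_ds <- apply(counts, 2, function(x) rbinom(length(x), size = x, prob = 0.2))

# Downsampling by cells: keep the reads but only a subset of the cells
cells_ds <- counts[, sample(ncol(counts), size = 3)]
```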
For the cell-downsampled case, due to the imbalanced cell numbers between the imputed and ground truth data, the _pseudobulk_ approach is used to aggregate the counts to the gene level.
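For instance, here is a minimal sketch of the aggregation step, using made-up `truth` and `imputed` gene-by-cell matrices:
```
# Toy matrices with unequal cell numbers (names and sizes are made up)
imputed <- matrix(rpois(20, lambda = 5), nrow = 4)  # 4 genes x 5 cells
truth   <- matrix(rpois(24, lambda = 5), nrow = 4)  # 4 genes x 6 cells

# Pseudobulk: summing across cells yields one value per gene,
# so the unequal cell numbers no longer matter
imputed_pb <- rowSums(imputed)
truth_pb   <- rowSums(truth)
sqrt(mean((truth_pb - imputed_pb)**2))  # gene-level RMSE
```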
Hope this helps.
Thank you.
@rchai Thanks for your reply.
I found that the evaluation metric for the two downsampling methods, i.e., by cells and by reads, is computed conditionally in the `metrics.R` script, as shown below:
```
if (pseudobulk) {
  # Pseudobulk case: aggregate counts across cells to one value per gene
  true_rs <- rowSums(true)
  pred_rs <- rowSums(pred)
  # Single RMSE over the gene-level pseudobulk sums
  rmse <- sqrt(mean((true_rs - pred_rs)**2))
  # Penalty of 1 for each missing gene; inflate for missing cells
  if (n_na_genes > 0) rmse <- c(rmse, rep(1, n_na_genes))
  if (n_na_cells > 0) rmse <- rmse / (1 - n_na_cells / total_cells)
  # Normalize by the range of the true pseudobulk values
  nrmse <- rmse / (max(true_rs) - min(true_rs))
} else {
  # Read-downsampled case: one RMSE per gene across individual cells
  rmse <- sqrt(rowMeans((true - pred)**2))
  if (n_na_genes > 0) rmse <- c(rmse, rep(1, n_na_genes))
  if (n_na_cells > 0) rmse <- rmse / (1 - n_na_cells / total_cells)
  # Normalize each gene's RMSE by that gene's range across cells
  range_rr <- matrixStats::rowMaxs(true) - matrixStats::rowMins(true)
  nrmse <- rmse / range_rr
}
```
These two branches do not compute the same kind of score: the per-gene branch is an RMSE over individual values, while the pseudobulk branch is an RMSE over sums of values.
I would like to know why you score the two cases differently.
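For example, with a small toy matrix (made up here just for illustration), the two branches behave very differently:
```
true <- matrix(c(1, 2, 3, 4), nrow = 2)  # gene 1: (1, 3); gene 2: (2, 4)
pred <- matrix(c(2, 3, 2, 3), nrow = 2)  # gene 1: (2, 2); gene 2: (3, 3)

# Per-gene branch: RMSE over individual cell values
sqrt(rowMeans((true - pred)**2))                # 1 1

# Pseudobulk branch: per-cell errors cancel inside each row sum
sqrt(mean((rowSums(true) - rowSums(pred))**2))  # 0
```
In particular, per-cell errors that cancel within a gene's row sum are invisible to the pseudobulk score.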
Sincerely yours,
Tsai-Min