##### I'd like to know whether my understanding about Subchallenge 1 is correct.

##### About Data

* In the training step, about Protein_{1..7927} of data_obs_{1..10}.txt and data_true.txt
  - Each of them refers to the same protein {1..7972} across the training files.
* In the prediction step, about Protein_{1..????} of data_obs_{1..100}.txt
  - Each of them refers to the same protein {1..????} across the prediction files.
  - But they do not correspond to the proteins in the training step.

######

* In the training step, about Sample_{1..80} of data_obs_{1..10}.txt and data_true.txt
  - These refer to the same biological samples across the training files.
* In the prediction step, about Sample_{1..??} of data_obs_{1..100}.txt
  - These refer to the same biological samples across the prediction files.
  - But they do not correspond to the samples in the training step.

######

* In the training step, about data_obs_{1..10}.txt
  - These files were generated from data_true.txt, which is a single file.
  - These files imitate technical replicates.
* In the prediction step, about data_obs_{1..100}.txt
  - These files were generated from a data_true.txt, which is a single file and, of course, not the same one as in the training step.
  - We can't access data_true.txt in the prediction step.

######

##### About Submit

* In the prediction step, we can only access data_obs_{1..100}.txt.
* We will predict data_true.txt using all these files.
* Will we make a single file like data_predict.txt, or impute the blanks and make data_predict_{1..100}.txt?

######

##### About Score

* If we make a single file like data_predict.txt, is each protein weighted by the number of originally blank values in data_predict_{1..100}.txt?
* If we make data_predict_{1..100}.txt, are only the originally blank values used to compute the score?

Created by Yasuhiro Kambara kambarakun
To Avichai Tendler (Avichai): Yes, you are right.
Your understanding of the data is right. For Submit, you don't have direct access to the testing observed data sets, but the function you submit will be applied to those data sets and the corresponding score will be computed automatically. For Scoring, each protein will be weighted equally, and only predictions for blank values (NAs) will be used to compute the score.
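To make the scoring rule above concrete, here is a minimal sketch of how "only blank values count, each protein weighted equally" could be computed. The thread does not name the metric, so per-protein RMSE is an assumption, as are the function name `score_masked` and the rows-as-proteins layout; this is not the challenge's official scoring code.

```python
import math


def score_masked(truth, observed, predicted):
    """Mean of per-protein RMSE over entries that were blank (NaN) in `observed`.

    Rows are proteins, columns are samples (assumed layout).
    RMSE is an assumed metric; the thread does not specify one.
    """
    per_protein = []
    for t_row, o_row, p_row in zip(truth, observed, predicted):
        # Keep only the positions that were missing in the observed data.
        sq_errs = [
            (p - t) ** 2
            for t, o, p in zip(t_row, o_row, p_row)
            if math.isnan(o)
        ]
        if sq_errs:  # proteins without any blanks contribute nothing
            per_protein.append(math.sqrt(sum(sq_errs) / len(sq_errs)))
    if not per_protein:
        return math.nan  # nothing was blank, so there is nothing to score
    # Equal weight per protein, regardless of how many blanks each one had.
    return sum(per_protein) / len(per_protein)


nan = math.nan
truth = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
observed = [[1.0, nan, nan], [nan, 5.0, 6.0]]
predicted = [[9.0, 2.0, 5.0], [5.0, 0.0, 0.0]]
score = score_masked(truth, observed, predicted)
```

In this toy example the first protein has two blanks and the second has one, yet both contribute equally to the final average; predictions at positions that were already observed are ignored entirely.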
Thanks, Thomas, for answering. I hadn't read 3.5 (How to Submit). I now understand which files to submit. As Avichai said (thanks for raising it), I want to know which data we are allowed to use.
When we generate predictions of one of the test files data_obs{i}, I assume we are not allowed to use information from other test files such as data_obs{j}. Am I right?
Dear Yasuhiro, Apologies for the delay in response. I can only respond to the About Submit and About Score questions. For sc1 you have access to data_obs_{1..100}.txt and you write out a predictions_*.tsv for each file. More information about the prediction file format of each subchallenge can be found [here](https://www.synapse.org/#!Synapse:syn8228304/wiki/448381). You are not allowed to submit just a single file; you have to create a prediction file for each obs file. Best, Tom
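As a rough illustration of "one prediction per observed file", the sketch below imputes each matrix on its own and returns one result per input. Everything here is an assumption made for illustration: the row-mean imputation, the helper names `impute_row_mean` and `predict_each`, and the in-memory dictionary standing in for the real files. Whether information from other test files may be used is not settled in this thread, so the sketch takes the conservative reading and uses each file alone.

```python
import math


def impute_row_mean(observed):
    """Fill NaNs in each protein row with the mean of that row's observed values."""
    filled = []
    for row in observed:
        seen = [v for v in row if not math.isnan(v)]
        # Hypothetical fallback of 0.0 when a protein has no observed values.
        fallback = sum(seen) / len(seen) if seen else 0.0
        filled.append([fallback if math.isnan(v) else v for v in row])
    return filled


def predict_each(obs_files):
    """Return one imputed matrix per observed matrix, computed from that matrix alone."""
    return {name: impute_row_mean(matrix) for name, matrix in obs_files.items()}


nan = math.nan
obs_files = {
    "data_obs_1.txt": [[1.0, nan, 3.0]],
    "data_obs_2.txt": [[nan, nan]],
}
predictions = predict_each(obs_files)
```

Each entry of `predictions` would then be written to its own prediction file in the format described on the linked wiki page.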
