Hi,
I have two questions about Sub challenge 2.
1. We don't have the true values (abundance) of the missing proteins.
How can we train and validate without the true values?
2. Some proteins have no missing values.
For example, the A1BG and A2M proteins in the breast cancer data have no missing values in any sample.
Should we just skip these genes at training time?
Thanks.
Created by sungsoo park deepimagine1
Hi,
The testing data will contain a certain amount of missing data as well. It is your choice whether to restrict your modeling to CNA and RNA data with a low missing ratio. The model will be evaluated on a subset of testing proteins with high coverage and a low missing rate.
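For anyone who wants to try the "low missing ratio" selection mentioned above, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the gene names, the toy values, the function names, and the 0.2 threshold are made up and are not part of the challenge data or rules.

```python
import math


def is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def missing_ratio(values):
    """Fraction of samples with a missing value (1.0 for an empty row)."""
    if not values:
        return 1.0
    return sum(1 for v in values if is_missing(v)) / len(values)


def select_low_missing(table, max_missing=0.2):
    """Keep genes whose missing ratio across samples is at most max_missing.

    `table` maps a gene ID to its list of per-sample values.
    """
    return {
        gene: values
        for gene, values in table.items()
        if missing_ratio(values) <= max_missing
    }


nan = float("nan")

# Toy RNA table: three hypothetical genes measured in five samples.
rna = {
    "GENE_A": [1.2, 0.4, -0.3, 0.9, 1.1],  # no missing values
    "GENE_B": [nan, nan, 0.5, nan, 0.2],   # 3 of 5 missing
    "GENE_C": [0.1, nan, 0.3, 0.2, 0.0],   # 1 of 5 missing
}

kept = select_low_missing(rna, max_missing=0.2)
print(sorted(kept))  # ['GENE_A', 'GENE_C']
```

The threshold is a modeling choice, not a requirement. A stricter value keeps fewer but more complete predictors, and the same filter could be applied to a CNA table.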
Hi, thanks for the helpful question and answer.
I'd like to ask one more question.
If we are given only CNA and RNA data for evaluation, is it right that there is "no missing data (CNA and RNA of any geneID)" in the test set?
Thanks. :)
Thanks, Mi.
It's clear.
I was unsure how to approach the sub-challenge 2 task because there are many missing and unmatched values across the CNA, RNA, and protein data.
So the sub-challenge 2 test set will have only CNA and RNA data, right?
Best, Sungsoo
Hi,
There seems to be a misunderstanding.
1. The goal of sub challenge 2 is not to predict missing values, but to predict the protein abundance of new patient samples. Just focus on the observed cases.
2. Same as before: if a protein has no missing values, that is good for your model, since you have more data to train on.
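To make "focus on the observed cases" concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the single-predictor least-squares line (RNA level predicting the abundance of one protein), the function names, and the toy numbers are all made up here and are not the challenge's method or data. The idea is simply to drop samples whose protein value is missing before fitting, then predict for a new sample.

```python
import math


def is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def observed_pairs(predictor, target):
    """Keep only samples where both predictor and protein target are observed."""
    return [
        (x, y)
        for x, y in zip(predictor, target)
        if not (is_missing(x) or is_missing(y))
    ]


def fit_line(pairs):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


nan = float("nan")

# Toy training data for one hypothetical protein across five samples.
rna = [1.0, 2.0, 3.0, 4.0, 5.0]
protein = [2.0, nan, 6.0, 8.0, nan]  # two samples lack a protein measurement

pairs = observed_pairs(rna, protein)  # 3 samples remain
slope, intercept = fit_line(pairs)

# Predict abundance for a new patient sample from its RNA value.
new_rna = 2.5
prediction = slope * new_rna + intercept
print(len(pairs), round(prediction, 3))
```

A protein with no missing values (such as the A1BG and A2M cases raised in the question) simply passes through `observed_pairs` unchanged and contributes all of its samples to the fit.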
Hope this helps,
Best,
Mi
Sub Task 2: about missing protein values