One participant asked questions about sub-challenge 1. I would like to share the conversation here with everyone participating in this sub-challenge.
######
Hi, XXX,
First, thank you for your interest in the challenge. Although I am not entirely clear about your concern with sub-challenge 1, I can clarify a few issues here. I agree that it would not be fair to other participants if we held this discussion “secretly”, so I will post this email to the forum as well.
1. Sub-challenge 1 undoubtedly addresses a VERY important and pressing problem in current proteomics research. In my opinion, only the results of sub-challenge 1 may have an immediate impact on the ongoing CPTAC projects. So, as a member of CPTAC, I care about this sub-challenge even more than the other two.
2. The simulation model used in sub-challenge 1 is meant to mimic the complex missingness observed in the real data. While I cannot discuss the detailed simulation settings, I think I can share the following general “principles”:
a. The level of “biological missing” varies across different tissue/disease samples.
b. The “biological missing” events are correlated among proteins.
c. The “biological missing” events are mixed with “technical missing” events in the final data output.
d. The “technical missing” events are not MAR (missing at random) either. They depend on the intensities to be measured.
3. I don’t think imputing a constant value for each protein would be an optimal strategy. We are very happy to see that some teams performed quite well on the round-one leaderboard.
I will be happy to discuss further if you have specific questions.
Best,
-Pei
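######
For anyone who wants a concrete picture of principles a to d in point 2, here is a toy sketch in Python. It is NOT the challenge's simulation model, whose settings were not shared. The function name simulate_missing, the group rates, the module structure, the correlation rho, and the logistic dropout curve are all made-up assumptions for illustration. In particular, the sketch assumes that measurement-related dropout is more likely at low intensity; the email only says that it depends on intensity.

```python
import numpy as np
from statistics import NormalDist


def simulate_missing(n_proteins=200, n_samples=60, n_groups=3,
                     n_modules=10, rho=0.5, seed=0):
    """Toy missing-data generator; every parameter here is hypothetical."""
    rng = np.random.default_rng(seed)

    # Hypothetical "true" log-intensities: per-protein mean plus noise.
    protein_mean = rng.normal(20.0, 2.0, size=(n_proteins, 1))
    intensity = protein_mean + rng.normal(0.0, 1.0, size=(n_proteins, n_samples))

    # (a) The biological missing rate differs between sample groups
    #     (stand-ins for tissue/disease types). Rates are assumed values.
    group = np.arange(n_samples) % n_groups
    group_rate = np.linspace(0.05, 0.25, n_groups)
    threshold = np.array([NormalDist().inv_cdf(r) for r in group_rate])[group]

    # (b) Proteins in the same module share a latent factor per sample,
    #     so their biological missing events are correlated.
    module = np.arange(n_proteins) % n_modules
    shared = rng.normal(size=(n_modules, n_samples))[module]
    own = rng.normal(size=(n_proteins, n_samples))
    latent = np.sqrt(rho) * shared + np.sqrt(1.0 - rho) * own
    bio = latent < threshold  # marginal rate equals the group's rate

    # (d) Measurement-related dropout depends on intensity. Assumption:
    #     a logistic curve that drops low intensities more often.
    p_drop = 1.0 / (1.0 + np.exp(1.5 * (intensity - 18.0)))
    tech = rng.random(size=intensity.shape) < p_drop

    # (c) Both kinds of missingness are mixed in the final output.
    observed = intensity.copy()
    observed[bio | tech] = np.nan

    return {"observed": observed, "intensity": intensity,
            "bio": bio, "tech": tech, "group": group}


if __name__ == "__main__":
    sim = simulate_missing()
    for g in np.unique(sim["group"]):
        cols = sim["group"] == g
        print("group", g, "missing fraction:",
              round(float(np.isnan(sim["observed"][:, cols]).mean()), 3))
```

In data generated this way, the fraction of missing values differs by sample group and by intensity, which gives one intuition for why a single constant per protein (point 3) is unlikely to be the best imputation.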