**[Full text of the proposal](https://www.synapse.org/Portal/filehandle?ownerId=syn5659209&ownerType=ENTITY&xsrfToken=1EA1466FCA55F7EAE33833333900F1BC&fileName=Idea1.pdf&preview=false&wikiId=414654)**

The authors wish to thank the reviewers for the insightful comments.

###Anonymous Review 1 and Authors Response

_**Impact:** The proposed model of differentially private regression is an area that is rapidly developing in machine learning, and the measured data has the potential to attract machine learning researchers to problems in oncology. The demonstration of a shared predictor without sharing data would have large impact in the way medical studies are done._

_**Feasibility:** I like the idea of obtaining more information from an already collected set of samples. This reduces the overall study risk, and the proposed measurement of new features (gene expression) and new labels (drug sensitivity) nicely benefits from already existing data. In this sense, the study seems very feasible._

_**Overall evaluation:** On 300 biobanked samples of patients with Acute Myeloid Leukemia (AML), measure:_

_- drug sensitivity to 525 small molecule inhibitors_

_- RNA sequencing to obtain gene expression._

_The proposed measurements complement other measurements on the same sample, e.g. exome sequencing and clinical data. Public data is also available on AML on other samples._

_The goal is to demonstrate that a differentially private predictor can be used for drug sensitivity._

_A couple of issues:_

_- The proposal did not provide evidence that gene expression is predictive of drug sensitivity for AML in the non-private setting. I am unfamiliar with the literature, and was wondering whether this is the task with the best signal-to-noise ratio. Since privacy-preserving computation may potentially involve a loss in predictive performance, choosing the task carefully seems prudent._

**Response:** This is a very good observation.
Some risk-taking is necessary, since it is not possible to test the data before collecting them. We highly recommend taking the risk because of the high expected impact upon success. A feasible contingency plan for the challenge is to run the competition with existing public cell-line data, where the possibility of success has already been shown (Honkela et al., 2016) and the task would be to maximally improve the predictions.

Some background for choosing AML as the case study: while past efforts in predicting treatment response and outcome for AML patients have primarily focused on cytogenetics, results from recent large-scale genomic studies have shown AML to be a complex disease with several hundred genes potentially impacted by mutation. The use of genomic profiles to predict drug sensitivity may therefore be hampered by the diverse mutational spectrum of AML. However, the mutated genes may have redundant roles, affecting the same signalling pathways, and these pathways may be more easily identified by common gene expression patterns. Thus, AML patients with different mutational backgrounds may share common gene expression profiles, reducing the overall heterogeneity between patients and potentially enabling better prediction of drug response and outcome. Gene expression signatures have only recently been applied to AML patient prognostication (Ng et al., 2016; Li et al., 2013). However, there are few studies using gene expression to predict AML drug sensitivity, possibly due to the lack of functional drug sensitivity data matched with gene expression profiles. New data sets, however, are becoming available, which show the utility of gene expression profiles in predicting drug response in AML (Kontro et al., 2016).

_- It is unclear how the project aims to conduct challenges while maintaining privacy of the data. Such an architecture for secure multi-party computation is not widely available, and potential participants would not be able to attempt the challenge without the significant undertaking of constructing the infrastructure._

**Response:** Again, a good point. There was not enough space to explain all the details in the proposal, and some details will need more planning. Here is a brief outline:

- The challenge will be run over a few iterations, so that participants can learn from one another while developing their solutions.
- The participants will be given direct access only to mock data (simulated and public data) to test their systems.
- Access to the real data will be provided only through provably differentially private interfaces. We will provide some standard interfaces, but the participants may also submit new ones, which will be peer-reviewed before their use is allowed.
- The final predictions will be evaluated under suitable metrics (using ones from previous DREAM challenges) under varying levels of privacy. Only the final assessment results will be released, to prevent leaking private data. (Controlling the amount of leaked information from repeated releases from multiple mechanisms under different levels of privacy would be essentially impossible.)
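
To make the idea of a "differentially private interface" more concrete, here is a minimal sketch, not the interface the challenge would actually provide. It assumes the simplest possible query (a mean over values clipped to a known range), uses the standard Laplace mechanism, and tracks a privacy budget under basic sequential composition, where the epsilons of successive queries add up. The class name `PrivateMeanInterface` and all of its parameters are hypothetical and chosen only for illustration; a real deployment would also need a hardened noise sampler.

```python
import random


class PrivateMeanInterface:
    """Hypothetical differentially private query interface (illustration only)."""

    def __init__(self, data, lower, upper, total_epsilon, seed=None):
        if not data:
            raise ValueError("data must be non-empty")
        # Clip values to a known range so the sensitivity of the mean is bounded.
        self._data = [min(max(float(x), lower), upper) for x in data]
        self._lower = lower
        self._upper = upper
        self._remaining = float(total_epsilon)
        self._rng = random.Random(seed)

    @property
    def remaining_budget(self):
        """Privacy budget (epsilon) still available for further queries."""
        return self._remaining

    def _laplace(self, scale):
        # The difference of two i.i.d. exponentials with rate 1/scale
        # is Laplace-distributed with location 0 and the given scale.
        rate = 1.0 / scale
        return self._rng.expovariate(rate) - self._rng.expovariate(rate)

    def mean(self, epsilon):
        """Return an epsilon-DP estimate of the mean via the Laplace mechanism."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self._remaining:
            raise RuntimeError("privacy budget exhausted")
        # Basic sequential composition: each query spends its epsilon.
        self._remaining -= epsilon
        n = len(self._data)
        # Replacing one record changes the mean by at most (upper - lower) / n.
        sensitivity = (self._upper - self._lower) / n
        return sum(self._data) / n + self._laplace(sensitivity / epsilon)
```

A participant would call, for example, `PrivateMeanInterface(values, 0.0, 1.0, total_epsilon=1.0).mean(0.5)` and only ever see the noisy answer. Once the budget is spent, further queries are refused, which is one simple way to keep repeated releases under control.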

Created by Chloé-Agathe Azencott caz
Let's see if I understood this challenge right. The challenge is to find a function that checks the drugs against a bunch of different parameters, and then to optimize that function so that the most effective drug can be found for every single RNA mutation.
###Anonymous Review 2 and Authors Response

_**Impact:** The topic of this Idea is highly relevant to the field of precision medicine. As the authors clearly state, one will probably never be able to gather enough patient data in a centralized database to tackle all diseases, especially rare diseases. Sadly, one of the major hurdles lies in the complexity of sharing private patient data across hospitals and research institutions. Although the technologies are available, the politics usually get in the way, making centralized databases a dream more than a reality._

_The concept of data sharing based on differential privacy is now emerging as a promising approach to train models on data that never leave their private silo. However, current machine learning methods must be adapted to be applicable in this setting._

_The authors propose to leverage the framework of differential privacy for drug response prediction in AML. Should this proof of concept be successful, this project would offer new opportunities for future DREAM challenges._

_**Feasibility:** Samples are secured and budget is reasonable. However, the authors' preliminary results do not convincingly demonstrate that 300 patient-derived cell lines tested with ~500 drugs will be enough to develop efficient predictive models. The noise in the drug sensitivity assays will also be an issue. Intra-tumor heterogeneity (ITH) should be discussed (how much ITH is to be expected in AML? Is it well recapitulated in freshly derived cell lines?). Which drugs were used to treat the 300 AML patients, and are the outcomes recorded for validation?_

**Response:** With differential privacy, the effect of requiring privacy will decrease as the data set size increases, so the most important question is how small a sample can still be useful, and how efficiently the small data can be used. That is what we intend to test.
Another related question is how much performance can at best be improved given more data, but testing that would require a very large budget.

While ITH is a challenge, its impact in this study can be reduced by focusing on AML samples with high tumor content (e.g. leukemic blast cell content > 50%), which is recorded at the time of sampling. This will reduce the noise introduced by normal cell content in the drug sensitivity and gene expression data.

The 525-drug panel comprises both approved oncology drugs (30%) and investigational drugs (70%). The approved drugs include the standard of care for AML patients (cytarabine plus an anthracycline) as well as drugs that may be available to the patients through clinical investigations. Clinical response to these drugs is recorded in the patient registry, data from which will be available for validation.

_**Overall evaluation:** This is a very exciting proposal, although important aspects of the feasibility should be discussed (see comments above)._
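
As a hedged illustration of the point in the response above that the effect of requiring privacy decreases as the data set grows, consider a standard textbook case rather than the method of the proposal: releasing the mean of $n$ values $x_i$ bounded in $[0, 1]$ with the Laplace mechanism at privacy level $\epsilon$. The symbols $n$, $x_i$, $\epsilon$ and $\eta$ are introduced here only for this example.

```latex
\[
\tilde{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i + \eta,
\qquad
\eta \sim \mathrm{Laplace}\!\left(0,\ \frac{1}{n\epsilon}\right),
\qquad
\operatorname{sd}(\eta) = \frac{\sqrt{2}}{n\epsilon}.
\]
```

Under these assumptions, with $\epsilon = 1$ the added noise has standard deviation of about 0.047 for $n = 30$ and about 0.0047 for $n = 300$. The privacy noise shrinks like $1/n$, faster than the ordinary sampling error of a mean, which shrinks like $1/\sqrt{n}$, so the relative cost of privacy falls as more samples are collected. For more complex models such as regression the dependence on $n$ may differ, which is part of what the challenge would test.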
