Submitting Predictions

Created by Lara Mangravite (LaraMangravite)
Final submissions will be accepted for both Challenges from September 3rd through September 15th. To qualify for scoring, your final Challenge submissions must be submitted by 11:59pm Pacific on September 15, 2013. Each team may submit a maximum of five submissions to the final round for each Subchallenge. You may participate on multiple teams, but each team must have a unique grouping of participants.

Final submissions must include a prediction matrix, a written description of your methodological approach, and the source code used to produce the final submission. Prediction matrices that are submitted without the other two components will be scored but are not eligible to win. For each Subchallenge, winners will be selected based on the accuracy of submitted predictions. Submissions will be accepted through Synapse and can be submitted using the web interface or the R or python clients.

Directions on how to make final submissions

Predictions can be submitted to Synapse through the web client (here), the R client, or the python client. In all cases, the submission process follows these steps:

1. Create a project in Synapse where you can store your prediction matrices, code, and methodological summary. See this example.
2. Place your files into your Synapse project. If you are submitting more than one prediction matrix for evaluation, you must make clear which code and methodological summary belong to each prediction matrix. This can be done in one of two ways: (1) use provenance to link each prediction matrix to the appropriate code script, or (2) place each set of predictions/code/summary into a separate Synapse project or folder.
3. Share your project publicly so that we and other participants can access your code and methodological summary.
4. Submit your prediction file for evaluation.
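The eligibility rules stated above (deadline, five scored submissions per team per Subchallenge, and the write-up and source code requirement for winning) can be sketched in Python as a local pre-submission check. This is only an illustration, not the organizers' scoring code. The `FinalSubmission` record and both function names are hypothetical. The sketch assumes that "Pacific" in mid-September means Pacific Daylight Time (UTC-7), that 11:59pm itself still counts as on time, and that when a team submits more than five times the most recent five are the ones scored.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Assumption: Pacific time in September is Pacific Daylight Time (UTC-7).
PDT = timezone(timedelta(hours=-7))
# Deadline stated on this page: 11:59pm Pacific on September 15, 2013.
DEADLINE = datetime(2013, 9, 15, 23, 59, tzinfo=PDT)
MAX_SCORED = 5  # maximum final submissions per team, per Subchallenge


@dataclass
class FinalSubmission:
    """Hypothetical record of one submitted prediction matrix (not a Synapse type)."""
    team: str
    subchallenge: int
    submitted_at: datetime  # must be timezone-aware
    has_writeup: bool       # methodological summary is shared
    has_source_code: bool   # source code is shared


def scored_submissions(submissions):
    """Group on-time submissions by (team, subchallenge), keeping the last five.

    Keeping the most recent five is an assumption of this sketch.
    """
    groups = {}
    for sub in sorted(submissions, key=lambda s: s.submitted_at):
        if sub.submitted_at <= DEADLINE:
            groups.setdefault((sub.team, sub.subchallenge), []).append(sub)
    return {key: subs[-MAX_SCORED:] for key, subs in groups.items()}


def eligible_to_win(sub):
    """On-time submissions are scored; winning also needs write-up and code."""
    return sub.submitted_at <= DEADLINE and sub.has_writeup and sub.has_source_code
```

A team could run such a check over its own planned submissions to confirm that each prediction matrix it cares about is on time and accompanied by both a write-up and source code.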
To submit a prediction through the website

1. Use the final template files for either subchallenge 1 or subchallenge 2 to make your prediction and save it on your local computer. Similarly, create a file containing your modeling code and one containing a brief methodological summary.
2. Go to the Synapse homepage (synapse.org), making sure that you are logged in to Synapse.
3. Create a new project (e.g. one called "Challenge submission 1").
4. Once your project is created, upload the prediction file, code file, and summary file into your project by clicking the "upload files" button, locating the file on your local computer, selecting "Any Use" and clicking the "upload" button. By selecting "Any Use", you are indicating that your file does not contain data that is governed by data use restrictions. These restrictions are usually applied to data files containing sensitive human data or data that is being shared with the public under specific terms. The data that you downloaded is an example of data that carries data use terms. Prediction matrices, code and methodological summaries do not constitute human data and do not require data use restrictions.
5. The methodological summary can be shared in one of two ways: either upload the summary as a file into your project, or enter the method summary directly into the wiki for your project. For an example, please see the write-up template.
6. If you have stored more than one prediction matrix in the same Synapse project, you must indicate which code file was used to create each prediction matrix. This can be accomplished using the Synapse provenance function. At this time, provenance can only be set through the R or python clients. Please see the Synapse User Guide for more instruction. Alternatively, you can store each prediction/code/summary in a separate project or folder.
7. Share your project with the public. To do this, click on the "Share" button in the upper right-hand corner of your project webpage. Click on the "Make Public" button and then click "Save". Your project is now publicly viewable.
8. Return to the Challenge website. Click on the submit button.
9. Select the file that contains your prediction.
10. Select the subchallenge of interest.
11. Provide the name of your team. The final five submissions per team will be scored.
12. Click OK to submit your prediction file. You should see a pop-up telling you that your submission was received.
13. Check your email.

To submit a prediction using the R or python clients

An example project, with a sample script for uploading your files and inserting provenance, is available here.

Within 5 minutes after submitting a prediction using any of these methods, you will receive an email either confirming your submission or indicating that the format of your file does not comply with the expected format for submissions and, as such, could not be scored. If you receive no emails within 30 minutes, please contact us at synapseInfo@sagebase.org.

Submissions will not be scored until the Challenge is closed. Final scores will be provided to all participants in a final leaderboard once scoring is completed and a winner is declared.

Scoring criteria

The exact details of the scoring criteria are described on the final leaderboards (please see table of contents above). Although our initial expectation was to use the root mean square error to rank teams during scoring, our experience during the active-phase leaderboard portion of the challenge demonstrated that this was not a meaningful statistic for comparing submissions. As such, and as a result of discussions with the participant community through the August webinar, we have adjusted the scoring metrics to more appropriately distinguish the predictive power of these models.

Submitting predictions for scoring

For subchallenge 1: Submit predicted EC10 values across all compounds and samples using the file template attached to this page.
Separate templates are provided for submissions to the leaderboard and for final submission. The leaderboard opens for submissions on July 25, 2013. Final submissions will be accepted through Synapse starting in late August.

For subchallenge 2: Submit predicted median EC10, 5th percentile EC10, and 95th percentile EC10 for each unknown compound using the file template attached to this page. Submissions to Subchallenge 2 will also be accepted through Synapse starting in late August.

Final Submissions

Final submissions will be accepted starting in late August for both Subchallenges. To qualify for scoring, your final Challenge submissions must be submitted by 11:59pm Pacific on September 15, 2013. Each team may submit a maximum of five submissions to the final round for each Subchallenge. You may participate on multiple teams, but each team must have a unique grouping of participants. Final submissions must include a prediction matrix, a written description of your methodological approach, and the source code used to produce the final submission. Prediction matrices that are submitted without the other two components will be scored but are not eligible to win.

For subchallenge 1, we will be releasing an extended version of the cytotoxicity training dataset in late August. This will contain the original training data (N=487) plus the data used to test predictions submitted to the leaderboard (N=133), for a final training set containing information from 620 samples.

Score your predictions for Subchallenge 1 during the training phase using the Leaderboard!

For subchallenge 1, participants have the opportunity to test predictions against a test data set throughout the duration of the Challenge, and to view the scores of predictions submitted by other teams. Participants can submit any number of submissions to the leaderboard for Subchallenge 1 on behalf of their team.
Please take every precaution to avoid over-fitting models to the leaderboard. The submissions to the leaderboard will be scored and ranked solely based on RMSE.

Submissions can be made in any of three formats: tab-delimited text, CSV, or RData. In all cases, a 133x106 numeric matrix is required, with row labels (cell lines) and column headers (NCGC compound identifiers) as indicated in the template file. The values in the matrix represent predicted EC10 values, as described in the Data Description. The scoring function will accept NAs within a matrix but will score any column containing NAs as the lowest ranked prediction for that compound. Scores will be made publicly available on the subchallenge 1 leaderboard through this link.

The data used to score predictions for the leaderboard are distinct from the data that will be used to score final predictions and to select the winner. The data used to score the leaderboard will be released in late August for use by participants in developing their final predictions. There is no leaderboard for Subchallenge 2.

Directions on how to submit predictions to the leaderboard for Subchallenge 1

Predictions can be submitted to Synapse through the web client (here), the R client, or the python client. In all cases, the submission process follows these steps:

1. Create a project in Synapse where you can store your prediction matrices and any other information you might like to save. This project will remain private (viewable only by you) until you decide to share it with other Synapse users.
2. Place your prediction file into your Synapse project.
3. Submit that file to the subchallenge 1 leaderboard.

To submit a prediction through the website

After creating your model using the template file and saving it on your computer:

1. Go to the Synapse homepage (synapse.org), making sure that you are logged in to Synapse.
2. Create a new project (e.g. one called "Challenge submissions").
3. Once your project is created, upload the prediction file into your project by clicking the "upload files" button, locating the file on your local computer, selecting "Any Use" and clicking the "upload" button. By selecting "Any Use", you are indicating that your file does not contain data that is governed by data use restrictions. These restrictions are usually applied to data files containing sensitive human data or data that is being shared with the public under specific terms. The data that you downloaded is an example of data that carries data use terms. Prediction matrices do not constitute human data and do not require data use restrictions.
4. Return to the Challenge website. Click on the submit button.
5. Select the file that contains your prediction.
6. Select Subchallenge 1. There is no leaderboard for Subchallenge 2 and we are not currently accepting final submissions to either Subchallenge.
7. Provide the name of your team. The team name will appear on the leaderboard beside your score.
8. Click OK to submit your prediction file. You should see a pop-up telling you that your submission was received.
9. Check your email. Within 5 minutes, you will receive an email either confirming your submission or indicating that the format of your file does not comply with the expected format for submissions and, as such, could not be scored. If you receive no emails within 30 minutes, please contact us through the Community Forum or at synapseInfo@sagebase.org.
10. You can view your score here, on the leaderboard.

To submit a prediction using the R client

A sample script has been provided to assist you in submitting your prediction using the R client. In-depth directions on how to use this approach are provided on the webpage from which you can download the sample R script.

To submit a prediction using the python client

A sample script has been provided to assist you in submitting your prediction using the python client.
In-depth directions on how to use this approach are provided on the webpage from which you can download the sample python script.

Within 5 minutes after submitting a prediction using any of these methods, you will receive an email either confirming your submission or indicating that the format of your file does not comply with the expected format for submissions and, as such, could not be scored. If you receive no emails within 30 minutes, please contact us through the Community Forum or at synapseInfo@sagebase.org. Following email confirmation, you can view your score here, on the Subchallenge 1 leaderboard.
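Because a malformed file cannot be scored, it may help to check a leaderboard prediction locally before submitting. The sketch below illustrates the format rules stated on this page: a 133x106 numeric matrix, with any column containing NAs ranked lowest for that compound, and scoring based on RMSE. It is not the organizers' scoring function. All function names are hypothetical, the matrix is assumed to be already parsed into a list of rows of floats (NA represented as `None` or NaN), and computing RMSE separately for each compound column is an assumption, since the page does not say how RMSE is aggregated.

```python
import math


def is_missing(value):
    """True for None or a float NaN (how an NA is assumed to arrive here)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def check_shape(matrix, n_rows=133, n_cols=106):
    """Raise ValueError unless the matrix is n_rows x n_cols (133x106 by default)."""
    if len(matrix) != n_rows:
        raise ValueError(f"expected {n_rows} rows (cell lines), got {len(matrix)}")
    for i, row in enumerate(matrix):
        if len(row) != n_cols:
            raise ValueError(f"row {i}: expected {n_cols} columns, got {len(row)}")


def columns_with_missing(matrix):
    """Indices of compound columns that contain at least one NA."""
    return sorted({j for row in matrix for j, v in enumerate(row) if is_missing(v)})


def rmse_per_compound(predicted, observed):
    """Per-column RMSE; None marks a column with NAs (ranked lowest per this page).

    Assumes `observed` has the same shape as `predicted` and no missing values.
    """
    flagged = set(columns_with_missing(predicted))
    scores = []
    for j in range(len(predicted[0])):
        if j in flagged:
            scores.append(None)
            continue
        squared = [(p[j] - o[j]) ** 2 for p, o in zip(predicted, observed)]
        scores.append(math.sqrt(sum(squared) / len(squared)))
    return scores
```

Calling `check_shape(matrix)` followed by `columns_with_missing(matrix)` gives a quick local check; any column index it returns would be ranked lowest for that compound under the NA rule above.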

syn1917704
syn1917707
syn1917708