Whole-cell parameter estimation DREAM challenge

Description of the 2013 whole-cell parameter estimation DREAM challenge including a synopsis of the challenge, a summary of the whole-cell model, and participant instructions.
Created By Brian Bot BrianMBot
Sponsored by:
- Dialogue for Reverse Engineering Assessments and Methods (DREAM)
- Sage Bionetworks
- IBM Research
- Covert Lab, Stanford University
- Numerate
- Mathworks

News
- Oct 15: Best performers announced.
- Sep 16: Deadline for final submissions is Sep 23. Sub-challenge #3 results posted to the leader board. Congrats to team The ICM Poland!
- Sep 4: Sub-challenge #2 results posted to the leader board. Congrats to team The Whole-Sale Modelers!
- Aug 20: Sub-challenge #1 results posted to the leader board. Congrats to teams crux, The Whole-Sale Modelers, and newDream!
- Aug 20: BitMill limit increased to five (5) simultaneous simulations.
- Aug 1: Intermediate sub-challenges and prizes announced! See Section 2.6 for more information about submitting.
    - Aug 9: Milestone 1 -- Best prediction score. 1st place: $200; 2nd place: PLoS gear, MATLAB student license
    - Aug 23: Milestone 2 -- Most creative method. 1st place: $300; 2nd place: PLoS gear, MATLAB student license
    - Sep 6: Milestone 3 -- Best parameter score. 1st place: $400; 2nd place: PLoS gear, MATLAB student license
    - Sep 23: Final -- Best combined score. 1st place: ISCB/RECOMB, PLoS invitations
- July 11: Added support for HTTP proxies. Pull the latest code and edit getConfig.m to set up a proxy for BitMill.
- July 10: Feedback survey posted. Please give us feedback on how the challenge is going!
- July 10: Webinar slides and video posted.
- July 9: Added metabolic sub-model linear program files for use with programs like libSBML, openCOBRA, lpsolve, and gurobi.
- July 9: Added instructions on how to set enzyme copy numbers in the metabolic sub-model (see Section 3.4.4).
- July 3: The first webinar will be held July 9 at 11am PDT.
- June 21: PLoS Computational Biology will publish manuscripts from the winning participants.

1 Synopsis

Recently while tinkering in the lab, we (the organizers) made an exciting breakthrough! Incredibly, we identified a mutant in silico strain of the Gram-positive bacterium Mycoplasma genitalium which grows 33% slower than wild-type!
We challenge you, as a participant, to determine how the mutant strain differs from the wild-type and why it grows more slowly. Specifically, the organizers have changed the values of 15 parameters of a recent whole-cell model of M. genitalium (Karr et al., 2012) compared to those of the wild-type strain. These 15 parameters along with 15 unmodified parameters are listed in Table 1 (see Section 2.3). Your goal is to identify which 15 parameters we modified as well as their new values, given the model's structure, the wild-type parameter values which are distributed with the model code, and data obtained from in silico experiments on the mutant strain.

The challenge mimics a common scenario in scientific research where researchers need to tune a model's parameters to match experimental data in order to discover new biology. The goal of this challenge is to explore and compare innovative approaches to parameter estimation of large, heterogeneous computational models. Participants are encouraged to develop and/or apply optimization methods, including the selection of the most informative experiments. The organizers encourage participants to form teams to collaboratively solve the challenge.

1.1 Background

A central challenge in biology is to understand how phenotype arises from genotype. Despite decades of research which have produced vast amounts of biological data, a complete, predictive understanding of biological behavior remains elusive. Computational techniques are needed to assemble the rapidly growing amount of biological data into a unified understanding. Recently, researchers at Stanford University developed the first comprehensive dynamical "whole-cell" model of a living organism (Karr et al., 2012). The model broadly predicts the cell cycle dynamics of the Gram-positive bacterium M. genitalium from the level of individual molecules and their interactions, including its metabolism, transcription, translation, and replication.
The model is composed of 28 sub-models of distinct cellular processes which were independently modeled at short time scales and integrated together at longer time scales (Figure 1). The model was validated by broadly comparing its predictions to a wide range of experimental data across several biological processes and scales.

Figure 1. M. genitalium whole-cell model. Diagram schematically depicts the 28 sub-models (colored words) in the context of a single M. genitalium cell with its characteristic flask-like shape. Sub-models are connected through common metabolites, RNA, protein, and the chromosome, which are depicted as orange, green, blue, and red arrows, respectively. Reprinted from Karr et al., 2012.

In total the M. genitalium whole-cell model contains 1,468 quantitative parameters. Accurately identifying these parameters is essential to whole-cell modeling. Furthermore, identifying these parameters is challenging because the model takes approximately 24 core-hr to simulate one cell cycle and because the model is stochastic. Karr and his colleagues identified the model parameters by first assembling a training set of over 1,836 experimental observations from over 900 publications, and second heuristically tuning the parameter values to match the training data. More rigorous approaches to parameter estimation are critically needed to improve the accuracy of whole-cell models and to enable researchers to continue to develop increasingly complex models.

1.2 Free cloud computational resources

Participants can simulate the whole-cell model with candidate parameter values for free in the cloud using BitMill. BitMill was generously adapted and donated to run the DREAM challenge by Numerate. See Section 3.3.3 for instructions.

1.3 Prizes

1.3.1 Final challenge (due Sep 23, 2013)

The team with the highest overall scoring solution will be invited to present their approach at the 6th Annual RECOMB/ISCB conference on Regulatory and Systems Genomics in Toronto, Canada.
The winning team will also be invited to submit a manuscript describing their methodology to PLoS Computational Biology, the leading computational biology journal. The manuscript will be invited, but will still be subject to the same rigorous peer review as all PLoS Computational Biology articles. See Sections 2.6 and 2.7 for information about submission and scoring.

1.3.2 Intermediate sub-challenges (due Aug 9, Aug 23, Sept 6)

We will award prizes for three intermediate sub-challenges:
- Aug 9: Milestone 1 -- Best prediction score
    - 1st place: $200
    - 2nd place: PLoS gear, MATLAB student license
- Aug 23: Milestone 2 -- Most creative method. Challenge organizers will judge the submitted write-ups and code.
    - 1st place: $300
    - 2nd place: PLoS gear, MATLAB student license
- Sep 6: Milestone 3 -- Best parameter score
    - 1st place: $400
    - 2nd place: PLoS gear, MATLAB student license

See Sections 2.6 and 2.7 for information about submission and scoring.

1.4 Timeline

- Jun 10, 2013: Challenge launched
- July 9, 11am PDT / 2pm EDT: Live webinar with challenge organizers. View slides. Video will be posted soon!
- Intermediate submission deadlines
    - Aug 9 11:59pm PST: Milestone 1 -- Best prediction score. 1st place: $200; 2nd place: PLoS gear
    - Aug 23 11:59pm PST: Milestone 2 -- Most creative method. 1st place: $300; 2nd place: PLoS gear
    - Sep 6 11:59pm PST: Milestone 3 -- Best parameter score. 1st place: $400; 2nd place: PLoS gear
- Sep 23, 2013 11:59pm PST: Final solution submission deadline. 1st place: ISCB/RECOMB, PLoS invitations
- Late Sep, 2013: Winners announced
- Nov 8-12, 2013: Winning team presents their solutions at the 6th Annual RECOMB/ISCB conference on Regulatory and Systems Genomics in Toronto, Canada
- Spring, 2014: Winning solution and meta-analysis published

2 The challenge

Participants are challenged to estimate the values of 15 unknown parameters from a set of 30 (10 promoter affinities, 10 RNA half-lives, and 10 metabolic reaction kcats) of a recently published whole-cell model of M. genitalium (Karr et al., 2012) given the model's structure and simulated data. The 30 parameters are associated with 10 mRNA-coding genes whose gene products catalyze 10 metabolic reactions. Specifically, the organizers have modified the values of 15 of these parameters compared to their values published in Karr et al., 2012 and distributed to participants through GitHub. Together the modified parameter values increase the average in silico cell cycle length by 33% from 9 to 12 h. The organizers have not modified the values of the other 15 parameters.

Participants will not be told which 15 parameters have been modified. Rather, participants are challenged to learn this information. Participants are encouraged to develop and/or apply optimization methods, including the selection of the most informative experiments. Participants will be scored based on the distance between their estimated parameter values and the true parameter values, and on the distance between the in silico measurements obtained from their estimated parameter values and those obtained from the true parameter values.

Below we describe the 30 parameters, the in silico data, the submission system, and the scoring algorithm.

2.1 The whole-cell model

The whole-cell model is composed of 28 sub-models (also referred to as modules or processes), each of which was modeled independently at short time scales using different mathematical representations (e.g. ODEs, Boolean, probabilistic, constraint-based, etc.). As illustrated in Figure 2, the model integrates the sub-models in three steps. First, the sub-models are structurally integrated by linking their common inputs and outputs through 16 common cell state variables which together represent the in silico cell's instantaneous configuration:
- Metabolite, RNA, and protein copy numbers;
- Metabolic reaction fluxes;
- Nascent DNA, RNA, and protein polymers;
- Molecular machines;
- Cell mass, volume, and shape;
- The external environment, including the host urogenital epithelium; and
- Time.
Second, the common inputs to the sub-models were computationally allocated at the beginning of each time step. Third, values of the sub-model parameters were semi-automatically tuned to match experimental data.

The whole-cell model is extensively described in Data S1 of Karr et al., 2012. Chapter 1 summarizes the model. Chapters 2 and 3 describe the mathematical and computational implementation of each cell state variable and process sub-model.

Figure 2. Whole-cell model simulation algorithm. The model integrates cellular function sub-models through 16 cell variables. First, simulations are randomly initialized to the beginning of the cell cycle (left gray arrow). Next, for each 1 s time step (dark black arrows), the sub-models retrieve the current values of the cellular variables, calculate their contributions to the temporal evolution of the cell variables, and update the values of the cellular variables. This is repeated thousands of times during the course of each simulation. For clarity, cell functions and variables are grouped into five physiologic categories: DNA (red), RNA (green), protein (blue), metabolite (orange), and other (black). Colored lines between the variables and sub-models indicate the cell variables predicted by each sub-model. The number of genes associated with each sub-model is indicated in parentheses. Finally, simulations are terminated upon cell division when the septum diameter equals zero (right gray arrow). Reprinted from Karr et al., 2012.

2.1.1 Metabolic sub-model

We encourage participants to solve the challenge by using the metabolic sub-model as a simplified surrogate of the entire whole-cell model. The metabolic sub-model (Figure 3) includes the 10 reactions associated with the 30 parameters, including the 15 unknown ones. The metabolic sub-model was implemented using flux-balance analysis. See Chapter 3 of Data S1 of Karr et al., 2012 for further information about the metabolic sub-model.
See Section 3.4.1 for instructions on how to simulate the metabolic sub-model.

Figure 3. M. genitalium metabolic network. Reprinted from Karr et al., 2012.

2.2 Model parameters

The model contains a large number of quantitative and structural parameters. However, participants are only asked to estimate the values of 15 of these parameters from the subset of 30 parameters indicated above: 10 promoter affinities, 10 RNA half-lives, and 10 metabolic reaction kcats. The next section contains more information about the unknown parameters including how to set their values.

A table of all of the model's parameters including their biological meaning, value, and units is available here. Additionally, participants can use WholeCellKB to inspect the experimental data used to train the base value of each of the model's parameters. Figure 4 displays a screen shot of a representative WholeCellKB page of the thiamine kinase reaction. Participants can click on the "View in model" buttons highlighted in red to view the corresponding model properties highlighted in the MATLAB simulation code (Figure 5).

Figure 4. Screen shot of WholeCellKB highlighting the "View in model" button. Participants can use this button to inspect how the base values of the model's parameters (highlighted in MATLAB code in Figure 5) are trained using the experimental data organized in WholeCellKB including the reported reaction stoichiometry listed in this screen shot.

Figure 5. Screen shot of a WholeCellKB "View in model" page. The page highlights the model property or properties associated with the experimental data named at the top of the page (e.g. Reaction: Stoichiometry).

2.3 Model parameters to be estimated

Participants are asked to estimate 30 unknown parameters of 3 types (Table 1): 10 promoter affinities, 10 RNA half-lives, and 10 metabolic reaction kinetics (kcats).

Table 1. Unknown RNA half-lives and reaction kcats to be estimated.
Each row lists a gene, the operon the gene is transcribed with and its RNA half-life in seconds, and a reaction catalyzed by the gene's protein product and its forward kcat. The organizers have modified the values of 15 of the 30 quantitative parameters (operon affinity, RNA half-life, reaction kcat) in this table. Participants are challenged to determine which parameters have been modified and to estimate their new values. The values listed in the table are the base values of the parameters published in Karr et al., 2012.

Gene ID  Operon ID  Operon half life (s)  Enzyme ID                  Reaction ID  Reaction kcat (1/s)
MG_006   TU_003     209                   MG_006_DIMER               Tmk          0.07
MG_023   TU_011     245                   MG_023_DIMER               Fba          23.34
MG_047   TU_027     170                   MG_047_TETRAMER            MetK         0.11
MG_111   TU_069     187                   MG_111_DIMER               Pgi          1218.79
MG_272   TU_180     401                   MG_271_272_273_274_192MER  AceE         1128.31
MG_299   TU_203     174                   MG_299_DIMER               Pta          1620.91
MG_330   TU_233     253                   MG_330_MONOMER             CmkA2        103.25
MG_357   TU_260     216                   MG_357_DIMER               AckA         100.59
MG_407   TU_294     282                   MG_407_DIMER               Eno          300.87
MG_431   TU_307     219                   MG_431_DIMER               TpiA         816.67

Table 2. Unknown operon affinity/RNA polymerase binding probability to be estimated. The perturbation labels were corrected on August 28. This update does not change the perturbations that were made; it only corrects the labels. Previously the modified parameters were misreported due to an error in the code. The first column of the table below indicates the modified RNA polymerase affinity/promoter affinities. The second and third columns indicate the prior (incorrect) label for each perturbation.

Note: The purchasable perturbation data still uses the incorrect labels for the purchasing options and file names. Please use the table below to interpret the true perturbations.
Operon ID  Original perturbation file label (TU)  Original perturbation file label (Gene)
TU_003     TU_003                                 MG_006
TU_012     TU_011                                 MG_023
TU_028     TU_027                                 MG_047
TU_070     TU_069                                 MG_111
TU_184     TU_180                                 MG_272
TU_209     TU_203                                 MG_299
TU_245     TU_233                                 MG_330
TU_272     TU_260                                 MG_357
TU_306     TU_294                                 MG_407
TU_319     TU_307                                 MG_431

The 30 unknown parameters are associated with 10 mRNA-coding genes (3 parameters per gene). Of these 30 parameters, the values of 15 have been changed compared to the distributed base parameter values. Together the modified parameter values increase the average in silico cell cycle length by 33% from 9 to 12 h. The values of the other 15 parameters have not been modified.

Participants will not be told which 15 parameters have been modified. Rather, participants are challenged to learn this information. In case this turns out to be too difficult, the organizers will reveal the identity of the 15 modified parameters.

In summary, 15 parameters were modified:
- 3 promoter affinities
- 3 RNA half-lives
- 9 kcats

The values of 13 of the 15 modified parameters were decreased. The values of 2 of the 15 modified parameters were increased. The decreases range from 2.8-93.4%. The increases range from 11.7-90.6%.

2.3.1 Promoter affinities

The promoter affinities are represented by the transcriptionUnitBindingProbabilities property of the edu.stanford.covert.cell.sim.process.Transcription class. This property is a 335×1 numeric array. Because the entries represent probabilities, their values are dimensionless and sum to 1. Each row corresponds to a transcription unit (also known as nascent RNA, operon, polycistronic RNA).
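Because the binding probabilities must sum to 1, changing a single promoter affinity implicitly changes all of the others. The following is a minimal sketch of that effect on a toy three-element array. It assumes that renormalization is a plain division by the sum, which is an assumption for illustration rather than a confirmed description of the model's implementation, and the variable names (probs, scaled, renormalized, foldChange) are hypothetical.

```matlab
% Toy stand-in for transcriptionUnitBindingProbabilities (the real array is 335x1).
probs = [0.5; 0.3; 0.2];

% Double the affinity of entry 2.
scaled = probs;
scaled(2) = 2 * scaled(2);

% Assumed renormalization: divide by the sum so the entries sum to 1 again.
renormalized = scaled / sum(scaled);

% Effective fold change of every entry relative to its starting value.
foldChange = renormalized ./ probs;
disp(foldChange');
```

In this toy case the doubled entry ends up only 2/1.3 ≈ 1.54-fold higher, while every other entry drops to 1/1.3 ≈ 0.77 of its starting value. For an entry that is small relative to 1, the effective change stays close to the nominal factor.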
The row labels and probabilities can be retrieved by evaluating

    rna = sim.state('Rna');
    trn = sim.process('Transcription');
    ids = rna.wholeCellModelIDs(rna.nascentIndexs);          % row labels
    probsArr = trn.transcriptionUnitBindingProbabilities;    % 335x1 array
    probsStruct = sim.getRnaPolTuBindingProbs();             % struct keyed by transcription unit ID
    sim.applyRnaPolTuBindingProbs(probsStruct);

Note: getRnaPolTuBindingProbs will automatically renormalize transcriptionUnitBindingProbabilities.

2.3.2 RNA half lives

The RNA half lives (s) are represented by the halfLives property of the edu.stanford.covert.cell.sim.state.Rna class. The property is a 2428×1 numeric array. Each row corresponds to a distinct RNA species. Although there are only 335 transcription units which are cleaved (also known as processed) into 347 mature RNA species, the property has length 2428 because it represents all of the forms (nascent, processed, mature, bound, misfolded, damaged, aminoacylated, intergenic) of each RNA gene product. The unknown RNA half lives all correspond to mature forms. The mature RNA half lives and row labels can be retrieved and modified by evaluating

    rna = sim.state('Rna');
    ids = rna.wholeCellModelIDs(rna.matureIndexs);     % row labels
    halfLivesArr = rna.halfLives(rna.matureIndexs);    % mature half lives (s)
    halfLivesStruct = sim.getRnaHalfLives();           % struct keyed by ID
    sim.applyRnaHalfLives(halfLivesStruct);

Note: The half lives of the processed, mature, and aminoacylated forms of each RNA species are constrained to be equal. The nascent half lives are equal to the average half lives of the component mature RNA species. The bound forms are constrained to have infinite half lives. The misfolded, damaged, and intergenic forms are constrained to have half lives equal to zero. The applyRnaHalfLives method automatically satisfies these constraints by updating the half lives of the processed, aminoacylated, and nascent forms in addition to the mature form.
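The half-life constraints in the note above can be illustrated on a toy transcription unit. This is only a sketch using plain arrays rather than the Simulation object: the two-species transcription unit and all variable names are hypothetical, and the half-life values are illustrative.

```matlab
% Hypothetical transcription unit that is cleaved into two mature RNA species.
matureHalfLives = [209; 245];   % s, illustrative values

% Processed and aminoacylated forms are constrained to equal the mature form.
processedHalfLives     = matureHalfLives;
aminoacylatedHalfLives = matureHalfLives;

% The nascent form takes the average of its component mature species.
nascentHalfLife = mean(matureHalfLives);

% Bound forms have infinite half lives; misfolded, damaged, and intergenic
% forms have half lives of zero.
boundHalfLives     = Inf(size(matureHalfLives));
misfoldedHalfLives = zeros(size(matureHalfLives));

fprintf('Nascent half-life: %g s\n', nascentHalfLife);
```

In practice these derived values do not need to be computed by hand, since applyRnaHalfLives is described as enforcing the constraints automatically; the sketch is only meant to make the relationships concrete.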
2.3.3 Metabolic reaction kcats

The metabolic reaction kcats (reactions/enzyme/s) are represented by the enzymeBounds property of the edu.stanford.covert.cell.sim.process.Metabolism class. The property is a 645×2 numeric array. Each row corresponds to a reaction. The first column corresponds to the reverse kcats. The second column represents the forward kcats. The row labels and forward kcats can be retrieved by evaluating

    met = sim.process('Metabolism');
    ids = met.reactionWholeCellModelIDs;                  % row labels
    kcatsArr = met.enzymeBounds;                          % 645x2 array: [reverse, forward]
    kcatsStruct = sim.getMetabolicReactionKinetics();     % struct keyed by reaction ID
    sim.applyMetabolicReactionKinetics(kcatsStruct);

Note: The unknown kcat parameter values all represent forward reactions.

Note: The kcats are redundantly represented by the fbaEnzymeBounds property of the same class. Use the applyMetabolicReactionKinetics method of the edu.stanford.covert.cell.sim.Simulation class to set kcat values. Do not edit the enzymeBounds or fbaEnzymeBounds properties directly.

2.4 In silico "experimental" data for parameter estimation

Participants can use the eight data types below to estimate the 15 modified parameters of the 30 parameters listed above:

- Single-cell data
    - Dynamics: rows correspond to individual cells, columns correspond to time points 0..N (s).
        - Growth (g/s)
        - Mass (g)
        - Volume (L)
      Note: the arrays which store the growth, mass, and volume data are NaN padded in the following way. Let Aij be the single-cell measurement of cell i at time point j. Then Aij is NaN when in silico cell i divided before time point j.
    - Event times: rows correspond to individual cells.
        - Replication initiation time (s)
        - Replication termination time (s)
        - Cell cycle length (s)
      Note: NaN values indicate that the in silico cell didn't complete the event. For example, the cell cycle length data for cell i will be NaN if cell i didn't divide within the 65,000 s simulation.
- Metabolite concentrations (M): Time and population average concentrations. Rows correspond to metabolite species.
  Rows are labeled by sim.state('Metabolite').wholeCellModelIDs.
- DNA-seq (DNA molecules/nt): Time and population average DNA copy number of each 100 nt region of the chromosome. Row 1 corresponds to bases 1..100, row 2 corresponds to bases 101..200, etc.
- RNA-seq (transcripts/nt): Time and population average number of mapped RNA transcripts of each 100 nt region of the chromosome. Row 1 corresponds to bases 1..100, row 2 corresponds to bases 101..200, etc.
- ChIP-seq (protein molecules/nt): Time and population average DNA-bound protein density of each 100 nt region of the chromosome. Row 1 corresponds to bases 1..100, row 2 corresponds to bases 101..200, etc. Columns correspond to mRNA-coding genes and are labeled by sim.gene.wholeCellModelIDs(sim.gene.mRNAIndexs).
- RNA expression array (M): Time and population average RNA concentrations by gene. Rows correspond to genes. Rows are labeled by sim.gene.wholeCellModelIDs.
- Protein expression array (M): Time and population average protein concentrations. Rows correspond to protein-coding genes. Rows are labeled by sim.gene.wholeCellModelIDs(sim.gene.mRNAIndexs).
- Metabolic reaction fluxes (rxn/s/gDCW): Time and population average reaction fluxes. Rows are labeled by sim.state('MetabolicReaction').reactionWholeCellModelIDs.

2.4.1 Initial wild-type data provided "free" to participants

The organizers have performed the eight in silico experiments described above on a population of 32 in silico cells using the 15 modified parameter values (compared to the base parameter values). Participants can download this data for "free" here.

2.4.2 Perturbation data available for "purchase"

To reflect conditions that exist in actual scientific practice, each individual participant will also receive 5,000 credits which they can use to "buy" additional "experimental" data measured from perturbed in silico cells.
The organizers created the perturbation data sets by individually increasing and decreasing the values of the 30 unknown parameters (see Table 1) by a factor of 2 compared to the unknown values. Participants can purchase the eight data types described above for each perturbation. In total 480 data sets are available for purchase (30 parameters × 2 perturbations × 8 data types). Each data set represents the average of eight in silico cells. Each data set costs 100 credits. Participants can form teams to pool perturbation data. Participants must use the "purchase" form to obtain additional data. Participants will purchase in silico data individually, not as teams.

Perturbation parameter sets were calculated using the following code:

    % Gene, transcription unit, and reaction IDs of the 10 genes (one per row)
    genesTusRxns = {
        'MG_006' 'TU_003' 'Tmk'
        'MG_023' 'TU_011' 'Fba'
        'MG_047' 'TU_027' 'MetK'
        'MG_111' 'TU_069' 'Pgi'
        'MG_272' 'TU_180' 'AceE'
        'MG_299' 'TU_203' 'Pta'
        'MG_330' 'TU_233' 'CmkA2'
        'MG_357' 'TU_260' 'AckA'
        'MG_407' 'TU_294' 'Eno'
        'MG_431' 'TU_307' 'TpiA'
        };
    parameterTypes = {
        'PromAffinity'
        'HalfLife'
        'RxnKcat'
        };
    % File name label and multiplicative factor of each perturbation
    parameterVals = {
        '05X' 0.5
        '2X'  2.0
        };

    % Load the simulation and apply the true (unknown) parameter values
    sim = edu.stanford.covert.cell.sim.util.CachedSimulationObjectUtil.load();
    goldParameters = load(fullfile(baseDir, 'gold-parameters.mat'));
    sim.applyAllParameters(goldParameters);
    rnaPolTuBindingProbs = sim.getRnaPolTuBindingProbs();
    rnaHalfLives = sim.getRnaHalfLives();
    rxnKinetics = sim.getMetabolicReactionKinetics();

    for i = 1:size(genesTusRxns, 1)
        for j = 1:numel(parameterTypes)
            for k = 1:size(parameterVals, 1)
                % Reset to the true values, then perturb a single parameter
                sim.applyAllParameters(goldParameters);
                switch parameterTypes{j}
                    case 'PromAffinity'
                        tuId = genesTusRxns{i, 2};
                        sim.applyRnaPolTuBindingProbs(struct(tuId, parameterVals{k, 2} * rnaPolTuBindingProbs.(tuId)));
                    case 'HalfLife'
                        tuId = genesTusRxns{i, 2};
                        sim.applyRnaHalfLives(struct(tuId, parameterVals{k, 2} * rnaHalfLives.(tuId)));
                    case 'RxnKcat'
                        rxnId = genesTusRxns{i, 3};
                        sim.applyMetabolicReactionKinetics(struct(rxnId, struct('for', parameterVals{k, 2} * rxnKinetics.(rxnId).for)));
                end

                % Save the perturbed parameter set
                parameters = sim.getAllParameters();
                paramFileName = fullfile(baseDir, sprintf('parameters_%s_%s_%s.mat', genesTusRxns{i, 1}, parameterTypes{j}, parameterVals{k, 1}));
                save(paramFileName, '-struct', 'parameters');
            end
        end
    end

Participants can simulate these same perturbations themselves (see Section 3.3). However, participants will not know the true parameter values on which to base the perturbations.

2.4.3 Data access agreement

Participants will be required to accept the Data access agreement to obtain the challenge data.

2.5 Registering for the challenge

Participants must join to obtain free access to the cloud computing resources and to submit solutions. Participants will be required to create or join a team to complete the registration. After registering, participants can change their team affiliation at any time.

Note: creating multiple accounts or teams solely to circumvent limits on the "purchased" in silico data is grounds for disqualification.

2.5.1 Teams

Participants are encouraged to solve the challenge in teams. Teams can be of any size, and teams are responsible for managing their membership and distributing any prizes among members. Participants can use the forum to recruit team members. Parameter sets can be submitted by any team member. Only one participant per team needs to submit their write-up and code.

2.6 Submitting solutions, code, & write-ups

After registering, teams will be able to submit their solution in two parts: (1) the estimated parameter values and (2) all code used to solve the challenge together with a 1-2 page write-up describing the methods used. When teams submit their write-up they will also be required to accept a statement of participation acknowledging that their submission represents their own work. The code and write-up should be submitted by one participant per team.

Participants must submit their estimated parameter values using the MATLAB script postCloudSimulation.
This script will (1) simulate eight in silico cells in the cloud using BitMill, (2) return to participants their average in silico measurements, (3) return to participants the distance between the true and estimated parameter values and the distance between the in silico data from the true and estimated parameter values, and (4) for debugging purposes, return to participants the standard outputs and errors of their simulation job concatenated into two files (output and error).

Weekly the organizers will separately rank participants by these two distances, and post rankings on the leader board. The two distances will be combined to form an overall score which will be used to award prizes. Prizes will be awarded to teams based on the highest overall scoring set of parameter values from among all team members. See Sections 2.4, 2.7, and 3.3.3 for more information about how the in silico "experimental" data is calculated, how the distances are computed and scored, and how to run simulations in the cloud.

One participant per team must submit their write-up and code using the procedure described in the "submission tutorial" above. It is up to teams to coordinate their code and write-ups.
- Create a new Synapse project for each submission.
- Upload a brief write-up (1-2 pages in text, Word, or PDF format) to the project. Write-ups can be informal and may contain pseudo-code describing the algorithm(s) used, work flows, etc.
- Upload any code used to solve the challenge to the project.
- Submit your project to the challenge.

To encourage participants to build on the best performing methods, the organizers will post the leading write-ups and code after each milestone. The organizers will also use the write-ups and code to report the most successful parameter estimation strategies in a summary paper to be published in Spring 2014.

Note: Sage Bionetworks reserves the right to disqualify submissions from any participant or team at its sole discretion.
2.7 Scoring

Submissions will be scored according to two criteria: (1) the distance between the true and estimated parameter values and (2) the distance between the in silico data from the true and estimated parameter values. These criteria will be combined to form a single overall score. The distance and scoring calculations are implemented by the MATLAB class edu.stanford.covert.cell.sim.util.DreamScoring. The test_calcParameterAndPredictionScoring method of the edu.stanford.covert.cell.sim.DreamCompetitionTest class illustrates exactly how simulations will be run and scored. See below for information about how the individual distances and overall score are calculated. See Section 2.4 for more information about how the in silico "experimental" data is calculated.

2.7.1 Parameter distance

The parameter distance will be calculated as follows. Let v_i^{true} and v_i^{est} be the true and estimated values of parameter i, where i runs over all of the unknown parameters (promoter affinities, RNA half-lives, and metabolic reaction kinetics). Then the parameter distance is given by [equation missing], where N is the number of parameters.

2.7.2 Prediction distance

The prediction distance will be calculated as follows. Let v_i^{true} and v_i^{est} be the in silico measurements obtained using the true and estimated parameter values, including the in silico single-cell, DNA-seq, RNA-seq, ChIP-seq, metabolomics, transcriptomics, and proteomics data. Then the prediction distance is given by [equation missing], where N is the number of in silico measurements and σ_i^{true} is the standard deviation of each in silico measurement under the true parameter values.

2.7.3 Overall score

The prediction and parameter distances will be combined as follows. Let p_{param} and p_{pred} be the p-values of the parameter and prediction distances obtained by empirically sampling all of the submissions.
Specifically, we form an empirical distribution of the parameter distance by (1) sampling the value of each parameter separately from all of the submissions with uniform probability and (2) calculating the distance between the true and randomly chosen parameter values. Similarly, we form an empirical distribution of the prediction distance by (1) sampling the value of each in silico measurement separately from all of the submissions with uniform probability and (2) calculating the distance between the true and randomly chosen in silico measurements. Then the overall score is given by [equation missing].

2.7.4 Most creative method

The organizers will award the prize for the most creative method for the second milestone based on judging the submitted write-ups and code.

2.8 Leader board

Weekly the organizers will separately rank individual participants by the two distances described above, and post rankings on the leader board. The leader board will not rank teams; however, prizes will be awarded based on teams.

3 Using the whole-cell model

Here we describe how to install and run the whole-cell model. The whole-cell model is described extensively in Data S1 of Karr et al., 2012. See Section 3.5 for additional information about the whole-cell model and its computational implementation.

3.1 Installing the whole-cell model and required software

The following sections describe how to install the whole-cell model software on Linux, Mac, and/or Windows. Alternatively, participants can use the whole-cell virtual machine which already contains all of the required software. See Section 3.1.4 for more information.

Note: participants must join the challenge to obtain the accounts, passwords, and keys needed to install the whole-cell model and cloud computing software. After creating a Synapse account and joining the challenge you will receive an email from bitmill-support@numerate.com notifying you that we have created an Amazon IAM account, an Amazon S3 bucket, and a BitMill account for you.
At this point you can complete the installation instructions below.

3.1.1 Linux/Unix
Note: the following instructions were developed using Linux Mint 14.

Install MATLAB ≥ 2009 with the following toolboxes: Bioinformatics, Curve fitting, Image processing, Optimization, Signal processing, Statistics.
Note: The Statistics toolbox is the only toolbox required to simulate the model. The other toolboxes are needed to construct the Simulation object and run some of the model analysis.

Install packages:
    sudo apt-get install git
    sudo apt-get install curl
    sudo apt-get install python python-setuptools python-pip
    sudo pip install python-magic

Install s3cmd (the URL is quoted so that the shell does not interpret the "&" characters):
    wget -O s3cmd-1.5.0-alpha1.tar.gz "http://downloads.sourceforge.net/project/s3tools/s3cmd/1.5.0-alpha1/s3cmd-1.5.0-alpha1.tar.gz?r=&ts=1369441316&use_mirror=superb-dca2"
    tar -xvvf s3cmd-1.5.0-alpha1.tar.gz
    cd s3cmd-1.5.0-alpha1
    sudo python setup.py install
    cd ..
    rm -rf s3cmd-1.5.0-alpha1
    rm s3cmd-1.5.0-alpha1.tar.gz

Configure s3cmd by executing:
    python s3cmd --configure s3://<your_bucket_name>
  - Download your access and secret keys here
  - Enter the access and secret keys provided at registration (you will receive an email from bitmill-support@numerate.com; it might take some time)
  - Leave "encryption password" and "Path to GPG program" blank
  - Set "Use HTTPS protocol" to "Yes"
  - Leave "HTTP Proxy server name" blank
  - Yes, test access. This should result in a message "... Success"
  - Yes, save settings

Install bitmill-bash:
    git clone https://github.com/Numerate/bitmill-bash.git ~/bitmill-bash
    cd ~/bitmill-bash
    rm bitmill.conf
    s3cmd get s3://<your_bucket_name>/bitmill.conf bitmill.conf   # replace <your_bucket_name> with your bucket's name
    ./gen_all_scripts.sh

Install and configure the whole-cell model code:
    git clone -b parameter-estimation-DREAM-challenge-2013 https://github.com/CovertLab/WholeCell.git ~/WholeCell
    matlab >> cd /path/to/WholeCell
    matlab >> install();

Configure path:
  Bash shells: append to ~/.bashrc. Create file if necessary.
    export PATH=$PATH:~/bitmill-bash:~/bitmill-bash/dream
  csh shells: append to ~/.cshrc. Create file if necessary.
    set path = ($path ~/bitmill-bash ~/bitmill-bash/dream)
  tcsh shells: append to ~/.tcshrc. Create file if necessary.
    set path = ($path ~/bitmill-bash ~/bitmill-bash/dream)

3.1.2 Mac
Install MATLAB ≥ 2009 with the following toolboxes: Bioinformatics, Curve fitting, Image processing, Optimization, Signal processing, Statistics.
Note: The Statistics toolbox is the only toolbox required to simulate the model. The other toolboxes are needed to construct the Simulation object and run some of the model analysis.

Download git client and install.

Install python-setuptools:
    curl -L -o setuptools-0.6c11-py2.7.egg "https://pypi.python.org/packages/2.7/s/setuptools/setuptools-0.6c11-py2.7.egg#md5=fe1f997bc722265116870bc7919059ea"
    sudo sh setuptools-0.6c11-py2.7.egg --prefix=~
    rm setuptools-0.6c11-py2.7.egg

Install pip:
    curl -O http://pypi.python.org/packages/source/p/pip/pip-1.3.1.tar.gz
    tar xzf pip-1.3.1.tar.gz
    cd pip-1.3.1
    python setup.py install
    cd ..
    rm -rf pip-1.3.1
    rm -rf pip-1.3.1.tar.gz

Install python-magic:
    sudo pip install python-magic

Install s3cmd (if curl does not work, try wget, or download the file directly by pasting the link into your browser):
    curl -L -o s3cmd-1.5.0-alpha1.tar.gz "http://downloads.sourceforge.net/project/s3tools/s3cmd/1.5.0-alpha1/s3cmd-1.5.0-alpha1.tar.gz?r=&ts=1369441316&use_mirror=superb-dca2"
    tar -xvvf s3cmd-1.5.0-alpha1.tar.gz
    cd s3cmd-1.5.0-alpha1
    sudo python setup.py install
    cd ..
    rm -rf s3cmd-1.5.0-alpha1
    rm s3cmd-1.5.0-alpha1.tar.gz

Configure s3cmd by executing:
    python s3cmd --configure s3://<your_bucket_name>
  - Download your access and secret keys here
  - Enter the access and secret keys provided at registration (you will receive an email from bitmill-support@numerate.com; it might take some time)
  - Leave "encryption password" and "Path to GPG program" blank
  - Set "Use HTTPS protocol" to "Yes"
  - Leave "HTTP Proxy server name" blank
  - Yes, test access. This should result in a message "... Success"
  - Yes, save settings

Install bitmill-bash:
  - Visit https://github.com/Numerate/bitmill-bash
  - Click "Clone in Mac"
  - After the repository downloads:
    cd ~/bitmill-bash
    rm bitmill.conf
    s3cmd get s3://<your_bucket_name>/bitmill.conf bitmill.conf   # replace <your_bucket_name> with your bucket's name
    ./gen_all_scripts.sh

Download whole-cell model code and install:
  - Visit https://github.com/CovertLab/WholeCell
  - Click "Clone in Mac"
  - After the repository downloads, click on the repository in the GitHub client
  - Switch branch to parameter-estimation-DREAM-challenge-2013
  - Configure the whole-cell software by following the on-screen instructions:
    matlab >> cd /path/to/WholeCell
    matlab >> install();

Configure path:
  Bash shells: using an editor (Emacs or vi), append to ~/.bash_profile. Create file if necessary.
    export PATH=$PATH:~/bitmill-bash:~/bitmill-bash/dream
  csh shells: using an editor, append to ~/.cshrc. Create file if necessary.
    set path = ($path ~/bitmill-bash ~/bitmill-bash/dream)
  tcsh shells: using an editor, append to ~/.tcshrc. Create file if necessary.
    set path = ($path ~/bitmill-bash ~/bitmill-bash/dream)

3.1.3 Windows
Install MATLAB ≥ 2009 with the following toolboxes: Bioinformatics, Curve fitting, Image processing, Optimization, Signal processing, Statistics.
Note: The Statistics toolbox is the only toolbox required to simulate the model. The other toolboxes are needed to construct the Simulation object and run some of the model analysis.
Download git client and install.

Download Cygwin and install, following the on-screen instructions. When prompted, select the packages below. Note: ignore error cygutils.sh exit code 127.
  - Devel: git (1.7.9-1)
  - Editors: nano (2.2.6-1)
  - Net: curl (7.29.0-1)
  - Python: python (2.7.3-1)
  - Python: python-setuptools (0.6.34-1)
  - Web: wget (1.13.4-1)

Install s3cmd. Open a Cygwin shell and execute (the path and URL are quoted because they contain a space and "&" characters):
    cd "/cygdrive/c/Program Files/"
    wget -O s3cmd-1.5.0-alpha1.tar.gz "http://downloads.sourceforge.net/project/s3tools/s3cmd/1.5.0-alpha1/s3cmd-1.5.0-alpha1.tar.gz?r=&ts=1369895321&use_mirror=superb-dca2"
    tar -xvvf s3cmd-1.5.0-alpha1.tar.gz
    rm s3cmd-1.5.0-alpha1.tar.gz
    cd "/cygdrive/c/Program Files/s3cmd-1.5.0-alpha1/"
    python setup.py install

Configure s3cmd. Open a Cygwin shell and execute:
    python s3cmd --configure s3://<your_bucket_name>
  - Download your access and secret keys here
  - Enter the access and secret keys provided at registration (you will receive an email from bitmill-support@numerate.com; it might take some time)
  - Leave "encryption password" and "Path to GPG program" blank
  - Set "Use HTTPS protocol" to "Yes"
  - Leave "HTTP Proxy server name" blank
  - Yes, test access. This should result in a message "... Success"
  - Yes, save settings

Install bitmill-bash. Open a Cygwin shell and execute:
    git clone https://github.com/Numerate/bitmill-bash.git ~/bitmill-bash
    cd ~/bitmill-bash
    rm bitmill.conf
    s3cmd get s3://<your_bucket_name>/bitmill.conf bitmill.conf   # replace <your_bucket_name> with your bucket's name
    ./gen_all_scripts.sh

Download whole-cell model code and install:
  - Visit https://github.com/CovertLab/WholeCell
  - Click "Clone in Windows"
  - After the repository downloads, click on the repository in the GitHub client
  - Switch branch to parameter-estimation-DREAM-challenge-2013
  - Configure the whole-cell software by following the on-screen instructions:
    matlab >> cd c:\path\to\WholeCell
    matlab >> install();

Configure Windows environment variables:
  - Open Control Panel → System → Advanced system settings → Environment variables
  - Select "PATH" in the "System variables" section and click "Edit...". Then append to the value
    ;c:\cygwin\bin
  - Click "New..." and set Variable name = "CYGWIN", Variable value = "nodosfilewarning"
  - Open
    export PATH=$PATH:~/bitmill-bash:~/bitmill-bash/dream

3.1.4 Whole-cell virtual machine
Install VirtualBox.

Download, import, and run the whole-cell virtual machine. See instructions for more information.

Configure s3cmd by executing:
    python s3cmd --configure s3://<your_bucket_name>
  - Download your access and secret keys here
  - Enter the access and secret keys provided at registration (you will receive an email from bitmill-support@numerate.com; it might take some time)
  - Leave "encryption password" and "Path to GPG program" blank
  - Set "Use HTTPS protocol" to "Yes"
  - Leave "HTTP Proxy server name" blank
  - Yes, test access. This should result in a message "... Success"
  - Yes, save settings

Configure bitmill-bash:
    cd ~/bitmill-bash
    git pull
    rm bitmill.conf
    s3cmd get s3://<your_bucket_name>/bitmill.conf bitmill.conf   # replace <your_bucket_name> with your bucket's name
    ./gen_all_scripts.sh

Update whole-cell software:
    cd ~/WholeCell
    git pull

3.2 Instantiating simulations and setting parameter values
Follow the four steps below to instantiate the Simulation class and set the values of the model's parameters.

Set up MATLAB warnings and path:
    setWarnings();
    setPath();

Instantiate the Simulation class with the base parameter values:
    sim = edu.stanford.covert.cell.sim.util.CachedSimulationObjectUtil.load();

Optionally, set simulation options such as the simulation length (s) and the random number generator's seed.
  Get option values:
    sim.getOptions();
  Set option values:
    sim.applyOptions('lengthSec', 10, 'seed', 1);

Optionally, modify the model's parameter values. See Section 2.2 for more information about the model's parameters.
  Get current parameter values:
    sim.getRnaPolTuBindingProbs();
    sim.getRnaHalfLives();
    sim.getMetabolicReactionKinetics();
  Modify parameter values:
    sim.applyRnaPolTuBindingProbs(struct(...
        'TU_001', 0.0015, ...
        'TU_002', 0.0025 ...
        ));
    sim.applyRnaHalfLives(struct(...
        'TU_001', 146.9388, ...
        'TU_002', 152.9412 ...
        ));
    sim.applyMetabolicReactionKinetics(struct(...
        'AtpA', struct(...
            'for', 1, ...
            'rev', -1 ...
            ) ...
        ));

Get a struct containing the values of all of the simulation's parameters:
    parameterVals = sim.getAllParameters();

3.3 Simulating the model
After instantiating the model and setting the desired option and parameter values, the model can be run either locally on your own machine (using either MATLAB or the free MATLAB Component Runtime) or remotely on the cloud using BitMill, provided by Numerate. All three methods will execute the same code and conduct the same in silico "experiments". See Section 2.4 for more information about the in silico "experimental" data.
The in silico perturbation data available for "purchase" was generated using the same scripts outlined below. The only difference between the perturbation simulations and the simulations that participants will run is that the perturbation experiments used the true parameter values (and perturbations) chosen by the organizers, which are unknown to the participants.

Note: Simulations run on BitMill will be 65,000 s long (Simulation.lengthSec = 65000).

3.3.1 Simulating the model locally using MATLAB
Execute the following code to (1) simulate and measure individual in silico cells and (2) average the in silico "experimental" measurements over a population of individual cells.

Note: parameterVals = sim.getAllParameters() is a struct created by the final step of the previous section (Section 3.2).

Simulate and measure individual in silico cells:
    simulateHighthroughputExperiments(...
        'seed', 1, ...
        'parameterVals', parameterVals, ...
        'simPath', 'output/sim-1.mat' ...
        );

Calculate population averages:
    averageHighthroughputExperiments(...
        'simPathPattern', 'output/sim-*.mat' ...
        );

Note: Each simulation will require approximately 24-48 core-hours.

3.3.2 Simulating the model locally using the free MATLAB Component Runtime (MCR)
Download and install MCR.

Compile the code (preferred method) or download the 2012b MCR Linux binaries:
    cd /path/to/WholeCell/
    ./build.sh simulateHighthroughputExperiments
    ./build.sh averageHighthroughputExperiments

Edit the MCR shell script (bin/averageHighthroughputExperiments/run_averageHighthroughputExperiments.sh) to prevent globbing:
  - Add set -f after echo LD_LIBRARY_PATH is ${LD_LIBRARY_PATH};
  - Add set +f at the end of the file

Simulate individual cells. The parameterValsPath argument can point to either a .mat or an .xml file:
    for i in 1 2; do
        bin/simulateHighthroughputExperiments/run_simulateHighthroughputExperiments.sh /path/to/runtime seed $i parameterValsPath /path/to/parameterValsPath.mat simPath output/sim-$i.mat
    done
  An example XML file is available here.
Average in silico experiments from multiple individual cells:
    bin/averageHighthroughputExperiments/run_averageHighthroughputExperiments.sh /path/to/runtime simPathPattern 'output/sim-*.mat' avgValsPath output/sim-average.mat

3.3.3 Simulating the model remotely on the cloud using BitMill
After registering for the competition (see Section 2.5), participants will receive an email from bitmill-support@numerate.com with the login information for their BitMill account. After installing and configuring the bitmill-bash software (see Section 3.1), participants can use the commands below to submit candidate solutions to BitMill. These commands will trigger BitMill to execute the same code outlined in the previous section (3.3.2) in the cloud, and return to participants the same in silico "experiments" from a population of eight in silico cells. Users will receive an email from BitMill when the simulation results are available. Results will be stored in the participant's Amazon S3 bucket. The commands also enable participants to download from their S3 bucket the parameter and prediction distances between their parameter values and the true parameter values.

Note: Participants will only be able to perform five in silico experiments at a time in the cloud. Because simulations take approximately 1-2 days, we anticipate that participants will be able to perform approximately 3 in silico experiments per week.

Note: parameterVals = sim.getAllParameters() is a struct created by the final step of Section 3.2.

3.3.3.1 Running simulations
    simName = '<choose a short simulation name>';
    bucketUrl = 's3://<your bucket>';
    [jobId, status, errMsg] = postCloudSimulation(...
        'simName', simName, ...
        'bucketUrl', bucketUrl, ...
        'parameterVals', parameterVals ...
        );

Note: Simulations run on BitMill will be 65,000 s long (Simulation.lengthSec = 65000).
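Because at most five cloud simulations may run at once, it can help to submit candidate parameter sets as a small batch and keep the returned job identifiers. The following is only a sketch under stated assumptions: it uses the postCloudSimulation call exactly as shown above, while the candidates cell array, the maxConcurrent variable, and the 'cand-N' naming scheme are hypothetical names introduced here for illustration. It assumes the whole-cell code is installed and that parameterVals was created as in Section 3.2.

```matlab
% Sketch only: submit a small batch of candidate parameter sets to BitMill.
% Assumptions (not part of the challenge code): `candidates` is a cell array
% of structs shaped like the output of sim.getAllParameters(), and the
% simulation names 'cand-1', 'cand-2', ... are arbitrary labels.
bucketUrl = 's3://<your bucket>';
maxConcurrent = 5;                  % simultaneous-simulation limit stated above
candidates = {parameterVals};       % append further candidate structs here

% Never submit more jobs than the concurrency limit allows
nSubmit = min(numel(candidates), maxConcurrent);
jobIds = cell(nSubmit, 1);

for k = 1:nSubmit
    simName = sprintf('cand-%d', k);
    [jobId, status, errMsg] = postCloudSimulation(...
        'simName', simName, ...
        'bucketUrl', bucketUrl, ...
        'parameterVals', candidates{k} ...
        );
    jobIds{k} = jobId;              % keep for later status checks
    fprintf('%s submitted\n', simName);
end
```

The stored jobIds{k} values can then be passed one at a time to getCloudSimulationStatus(jobId) or cancelCloudSimulation(jobId), and each simName can be reused with downloadCloudSimulationResults once BitMill reports that the results are available.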
3.3.3.2 Checking simulation statuses
    getCloudSimulationStatus()
    getCloudSimulationStatus(jobId)

3.3.3.3 Canceling simulations
    cancelCloudSimulation(jobId)

3.3.3.4 Retrieving simulation results stored in Amazon S3
    downloadCloudSimulationResults(...
        'simName', simName, ...
        'bucketUrl', bucketUrl, ...
        'localFolder', 'output' ...
        );

This will download four files:
  - <simName>.predictions.mat: Struct containing the average in silico experimental observations from a population of eight cells
  - <simName>.distances.mat: Struct containing the distances from the gold-standard parameter values and predicted in silico experimental data
  - <simName>.out: Concatenation of the standard output of the individual simulations
  - <simName>.err: Concatenation of the standard error of the individual simulations

3.4 Simulating individual sub-models and subsets of sub-models
In addition to simulating the entire model, participants can simulate individual sub-models or groups of sub-models. Below we provide several illustrative examples.

3.4.1 Simulating the metabolic sub-model
    %set warnings and MATLAB path
    setWarnings();
    setPath();

    %import classes
    import edu.stanford.covert.cell.sim.util.CachedSimulationObjectUtil;

    %load simulation object
    sim = CachedSimulationObjectUtil.load();

    %optionally, set simulation options
    sim.applyOptions('seed', 1);

    %optionally, set simulation parameter
    sim.applyMetabolicReactionKinetics(struct(...
        'AtpA', struct(...
            'for', 1, ...
            'rev', -1 ...
            ) ...
        ));

    %get handle to metabolism sub-model
    met = sim.process('Metabolism');

    %optionally, sample initial conditions
    sim.initializeState();

    %simulate dynamics for 100s
    lengthSec = 100;
    for i = 1:lengthSec
        met.evolveState();
    end

See Section 3.2 for more information about how to set the values of the model's parameters.
3.4.2 Simulating the metabolism and transcription sub-models
    %import classes
    import edu.stanford.covert.cell.sim.util.CachedSimulationObjectUtil;

    %load simulation object
    sim = CachedSimulationObjectUtil.load();

    %get state and sub-model handles
    time = sim.state('Time');
    met = sim.process('Metabolism');
    transcription = sim.process('Transcription');

    %simulate
    lengthSec = 100;
    for i = 1:lengthSec
        time.values = i;

        met.copyFromState();
        met.evolveState();
        met.copyToState();

        transcription.copyFromState();
        transcription.evolveState();
        transcription.copyToState();
    end

3.4.3 Simulating the metabolism sub-model and logging predictions
    %import classes
    import edu.stanford.covert.cell.sim.util.CachedSimulationObjectUtil;

    %load simulation object
    sim = CachedSimulationObjectUtil.load();

    %get handle to metabolism sub-model and metabolic reaction state
    met = sim.process('Metabolism');
    mr = sim.state('MetabolicReaction');

    %simulate dynamics for 100s, recording the growth rate at each step
    lengthSec = 100;
    growth = zeros(lengthSec, 1);
    for i = 1:lengthSec
        met.evolveState();
        growth(i) = mr.growth;
    end

3.4.4 Simulating the metabolic sub-model and setting enzyme copy numbers
Note: in general, protein copy numbers are proportional to the promoter affinity times the RNA half-life.

    %set warnings and MATLAB path
    setWarnings();
    setPath();

    %import classes
    import edu.stanford.covert.cell.sim.util.CachedSimulationObjectUtil;

    %load simulation object
    sim = CachedSimulationObjectUtil.load();

    %optionally, set simulation options
    sim.applyOptions('seed', 1);

    %optionally, set simulation parameter
    sim.applyMetabolicReactionKinetics(struct(...
        'AtpA', struct(...
            'for', 1, ...
            'rev', -1 ...
            ) ...
        ));

    %get handle to metabolism sub-model
    met = sim.process('Metabolism');

    %optionally, sample initial conditions
    sim.initializeState();

    %set enzyme copy numbers (in general, protein copy numbers are
    %proportional to the promoter affinity times the RNA half-life)
    met.enzymes(strcmp(met.enzymeWholeCellModelIDs, 'MG_006_DIMER')) = 10;

    %simulate dynamics for 100s
    lengthSec = 100;
    for i = 1:lengthSec
        met.evolveState();
    end

3.4.5 Simulating the metabolism sub-model and recording all predictions
    %import classes
    import edu.stanford.covert.cell.sim.util.CachedSimulationObjectUtil;
    import edu.stanford.covert.cell.sim.util.DiskLogger;

    %load simulation object
    sim = CachedSimulationObjectUtil.load();

    %get handles to time state and metabolism sub-model
    time = sim.state('Time');
    met = sim.process('Metabolism');

    %set parameters
    sim.applyOptions('lengthSec', 100);

    %initialize
    sim.initializeState();

    %initialize logger
    outPath = 'output/ht-data-test';
    logFreqSec = 10;
    logger = DiskLogger(outPath, logFreqSec);
    logger.addMetadata(struct(...
        'shortDescription', '', ...
        'longDescription', '', ...
        'email', '', ...
        'firstName', '', ...
        'lastName', '', ...
        'affiliation', '', ...
        'knowledgeBaseWID', '', ...
        'revision', '', ...
        'differencesFromRevision', '', ...
        'userName', '', ...
        'hostName', '', ...
        'ipAddress', '' ...
        ));
    logger.initialize(sim);

    %simulate dynamics
    for t = 1:sim.lengthSec
        %set time
        time.values = t;

        %calculate metabolism
        met.evolveState();

        %log predictions
        logger.append(sim);
    end

    %finalize logger
    logger.finalize(sim);

3.4.6 Simulating the metabolism sub-model and recording high-throughput in silico data
    %import classes
    import edu.stanford.covert.cell.sim.util.CachedSimulationObjectUtil;
    import edu.stanford.covert.cell.sim.util.HighthroughputExperimentsLogger;

    %load simulation object
    sim = CachedSimulationObjectUtil.load();

    %get handles to time state and metabolism sub-model
    time = sim.state('Time');
    met = sim.process('Metabolism');

    %set parameters
    sim.applyOptions('lengthSec', 100);

    %initialize
    sim.initializeState();

    %initialize logger
    outPath = 'output/ht-data-test.mat';
    logger = HighthroughputExperimentsLogger(outPath);
    logger.initialize(sim);

    %simulate dynamics
    for t = 1:sim.lengthSec
        %set time
        time.values = t;

        %calculate metabolism
        met.evolveState();

        %log predictions
        logger.append(sim);
    end

    %finalize logger
    logger.finalize(sim);

3.5 Further information
The whole-cell model is described extensively in Data S1 of Karr et al., 2012. The following references provide additional information about the whole-cell model:
  - Read the model user guide
  - Browse the model documentation: doxygen, m2html
  - View a table listing the model's parameters
  - View a table listing the metabolic reactions
  - View the flux-balance analysis (FBA) metabolic model in several linear programming formats, which can be evaluated with programs such as the free lpsolve: Cplex .lpt, Gurobi .lp, Lindo .ltx, Lindo .lp, LPFML .xml, MathProg .mod, SBML .sbml, Xpress .lpx, ZIMPL .zpl
  - View a table listing the model's states, including their types and sizes
  - View a table listing the row and column labels of the model's states
  - Browse or download the M. genitalium whole-cell knowledge base, WholeCellKB
  - Browse the model frequently asked questions
  - Browse the challenge forum

Still have a question?
Please post questions to the challenge organizers and other participants via the forum.

4 Questions? Comments?
First, please browse the forum. Still have a question? Please post questions to the challenge organizers and other participants via the forum.

5 Credits
The challenge was conceived by Markus Covert, Jonathan Karr, Pablo Meyer, and Gustavo Stolovitzky. Christian Basile, Po-Ru Loh, and Alejandro Villaverde provided valuable feedback on the challenge design. Jonathan Karr modified the model and provided the simulated data for the challenge. Christian Basile, Jonathan Karr, Kahn Rhrissorrakrai, and Pablo Meyer tested the model. Brandon Allgood, Jonathan Karr, Mike Kellen, Pablo Meyer, Simon Wilkinson, and Jessen Yu implemented the computational infrastructure provided to the participants. Brian Bot, Jonathan Karr, and Pablo Meyer developed the credit system. Jonathan Karr and Pablo Meyer developed the scoring methodology and the leader board and curated the challenge. The computational infrastructure was provided free of charge to participants by Numerate and Sage Bionetworks.

The organizers thank the following individuals for their help organizing the competition:
  - Brandon Allgood, Numerate
  - Christian Basile, Urban Green Energy
  - Brian Bot, Sage Bionetworks
  - Deepak Chandran, Autodesk
  - Thomas Cokelaer, EMBL-EBI
  - Markus Covert, Stanford University
  - Bruce Hoff, Sage Bionetworks
  - Jay Hodgson, Sage Bionetworks
  - Jonathan Karr, Stanford University
  - Mike Kellen, Sage Bionetworks
  - Po-Ru Loh, MIT
  - Thea Norman, Sage Bionetworks
  - Kahn Rhrissorrakrai, IBM
  - Pablo Meyer, IBM
  - Julio Saez-Rodriguez, EMBL-EBI
  - Gustavo Stolovitzky, IBM
  - Alejandro Villaverde, Consejo Superior de Investigaciones Cientificas
  - Simon Wilkinson, Numerate
  - Jessen Yu, Numerate

6 References
Karr et al. (2012) A Whole-cell computational model predicts phenotype from genotype. Cell, 150, 389-401.
Karr et al. (2013) WholeCellKB: model organism databases for comprehensive whole-cell models. Nucleic Acids Res, 41, D787-92.

7 Links & downloads
Challenge
  - Webinar slides and video
  - Sign up page
  - .m file containing MATLAB code in the above challenge description
  - Initial "experimental" data
  - "Experimental" data "purchase" form
  - Solution submission instructions (See Section 2.6)
  - Write-up submission page
  - Leader board
  - Forum

Whole-cell model
  - Description
  - Metabolic reactions table
  - FBA metabolic model in linear programming formats: Cplex .lpt, Gurobi .lp, Lindo .ltx, Lindo .lp, LPFML .xml, MathProg .mod, SBML .sbml, Xpress .lpx, ZIMPL .zpl
  - Table of all model parameters
  - State properties table
  - State property row and column IDs table
  - Code
  - User guide
  - Documentation: doxygen, m2html
  - Whole-cell virtual machine and instructions
  - 2012b MCR Linux binaries
  - WholeCellKB: M. genitalium knowledge base
  - WholeCellViz: Visualization software
  - Frequently asked questions

Software
  - BitMill cloud computing service
  - Cygwin
  - Git: Mac, Linux, Windows
  - Linux Mint
  - lpsolve
  - MATLAB documentation
  - MATLAB Component Runtime (MCR)
  - s3cmd
  - VirtualBox

syn2177984
syn2177985
syn2177986
syn2177987
syn2177988
syn2177989
syn2177990
syn2177991
syn2177992
syn2177993
syn2177994
syn2177995
syn2177996
syn2177997
syn2177998
syn2177999
syn2178000
syn2178001
syn2178002
syn2178003
syn2178004
syn2178005
syn2178006
syn2178007
syn2178008
syn2178009
syn2178010
syn2178012
syn2178013
syn2178014
syn2178015
syn2178016
syn2178017
syn2178018
syn2178019
syn2178020
syn2178021
syn2178022
syn2178023
syn2178024
syn2178025
syn2178026
syn2178027
syn2178028
syn2178029
syn2178030
syn2178031
syn2178032
syn2178033
syn2178034
syn2178035
syn2178036
syn2178037
syn2178039
syn2178040
syn2178041
syn2178042
syn2178043
syn2178044
syn2178045
syn2178046
syn2178048
syn2178049