
Hi! My model works fine locally, but it fails after I submit it. At first I thought the problem was in my model, so I submitted the baseline model instead, and the same problem occurred.

Your workflow job, (submission ID 9694460), has failed to complete. The message is: -packages/toil/leader.py", line 246, in run STDERR: 2019-10-30T09:53:26.328473059Z raise FailedJobsException(self.config.jobStore, self.toilState.totalFailedJobs, self.jobStore) STDERR: 2019-10-30T09:53:26.328487000Z toil.leader.FailedJobsException Your logs are available here: https://www.synapse.org/#!Synapse:syn21074810.

log.txt:
STDERR: 2019-10-30T13:02:45.955485181Z INFO:cwltool:Resolved '/var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/EHR-challenge-develop/main_docker_agent_workflow.cwl' to 'file:///var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/EHR-challenge-develop/main_docker_agent_workflow.cwl' STDERR: 2019-10-30T13:02:48.101650675Z EHR-challenge-develop/main_docker_agent_workflow.cwl:143:9: object id `EHR-challenge-develop/main_docker_agent_workflow.cwl#run_docker_infer/status` previously defined STDERR: 2019-10-30T13:02:48.101689975Z WARNING:salad:EHR-challenge-develop/main_docker_agent_workflow.cwl:143:9: object id `EHR-challenge-develop/main_docker_agent_workflow.cwl#run_docker_infer/status` previously defined STDERR: 2019-10-30T13:02:48.103350921Z EHR-challenge-develop/main_docker_agent_workflow.cwl:117:9: object id `EHR-challenge-develop/main_docker_agent_workflow.cwl#run_docker_train/status` previously defined STDERR: 2019-10-30T13:02:48.103365491Z WARNING:salad:EHR-challenge-develop/main_docker_agent_workflow.cwl:117:9: object id `EHR-challenge-develop/main_docker_agent_workflow.cwl#run_docker_train/status` previously defined STDERR: 2019-10-30T13:02:52.235303139Z EHR-challenge-develop/run_synthetic_infer_docker.cwl:22:5: object id
`EHR-challenge-develop/run_synthetic_infer_docker.cwl#status` previously defined STDERR: 2019-10-30T13:02:52.235334879Z WARNING:salad:EHR-challenge-develop/run_synthetic_infer_docker.cwl:22:5: object id `EHR-challenge-develop/run_synthetic_infer_docker.cwl#status` previously defined STDERR: 2019-10-30T13:02:52.475376349Z EHR-challenge-develop/run_synthetic_training_docker.cwl:22:5: object id `EHR-challenge-develop/run_synthetic_training_docker.cwl#status` previously defined STDERR: 2019-10-30T13:02:52.475390679Z WARNING:salad:EHR-challenge-develop/run_synthetic_training_docker.cwl:22:5: object id `EHR-challenge-develop/run_synthetic_training_docker.cwl#status` previously defined STDERR: 2019-10-30T13:02:53.615748598Z WARNING:toil.batchSystems.singleMachine:Limiting maxCores to CPU count of system (32). STDERR: 2019-10-30T13:02:53.615778958Z WARNING:toil.batchSystems.singleMachine:Limiting maxMemory to physically available memory (268108005376). STDERR: 2019-10-30T13:02:53.615903629Z WARNING:toil.batchSystems.singleMachine:Limiting maxDisk to physically available disk (270534606848). STDERR: 2019-10-30T13:02:53.842902163Z INFO:toil:Running Toil version 3.20.0-cf34ca3416697f2abc816b2538f20ee29ba16932. STDERR: 2019-10-30T13:02:54.136910002Z DEBUG:toil.jobStores.fileJobStore:Path to job store directory is '/var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/tmpapjgSA'. 
STDERR: 2019-10-30T13:02:54.138110377Z INFO:toil.worker:Redirecting logging to /var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/toil-6eaa2009-daa9-43ba-9032-c1f2aabe0edc-3bdead962f93fd7fa17dcb3c0b3ee830/tmpkH1NAX/worker_log.txt STDERR: 2019-10-30T13:02:54.971819553Z INFO:toil.leader:Issued job 'https://raw.githubusercontent.com/Sage-Bionetworks/ChallengeWorkflowTemplates/v1.6/notification_email.cwl' python N/y/jobOf38bF with job batch system ID: 1 and cores: 1, disk: 11.0 G, and memory: 100.0 M STDERR: 2019-10-30T13:02:54.980382424Z INFO:toil.leader:Issued job 'https://raw.githubusercontent.com/Sage-Bionetworks/ChallengeWorkflowTemplates/v1.6/get_submission_docker.cwl' python q/N/jobyZKH3S with job batch system ID: 2 and cores: 1, disk: 11.0 G, and memory: 100.0 M STDERR: 2019-10-30T13:02:54.981206276Z INFO:toil.leader:Issued job 'https://raw.githubusercontent.com/Sage-Bionetworks/ChallengeWorkflowTemplates/v1.6/download_from_synapse.cwl' python w/U/jobhUqF1x with job batch system ID: 3 and cores: 1, disk: 11.0 G, and memory: 100.0 M STDERR: 2019-10-30T13:02:54.982248100Z INFO:toil.leader:Issued job 'https://raw.githubusercontent.com/Sage-Bionetworks/ChallengeWorkflowTemplates/v1.6/get_docker_config.cwl' python z/7/jobmkQZcE with job batch system ID: 4 and cores: 1, disk: 11.0 G, and memory: 100.0 M STDERR: 2019-10-30T13:02:55.288274812Z DEBUG:toil.jobStores.fileJobStore:Path to job store directory is '/var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/tmpapjgSA'. 
STDERR: 2019-10-30T13:02:55.289611196Z INFO:toil.worker:Redirecting logging to /var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/toil-6eaa2009-daa9-43ba-9032-c1f2aabe0edc-3bdead962f93fd7fa17dcb3c0b3ee830/tmp6ID2I8/worker_log.txt STDERR: 2019-10-30T13:02:55.296603011Z DEBUG:toil.jobStores.fileJobStore:Path to job store directory is '/var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/tmpapjgSA'. STDERR: 2019-10-30T13:02:55.297633234Z INFO:toil.worker:Redirecting logging to /var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/toil-6eaa2009-daa9-43ba-9032-c1f2aabe0edc-3bdead962f93fd7fa17dcb3c0b3ee830/tmpwNDp2x/worker_log.txt STDERR: 2019-10-30T13:02:55.302829332Z DEBUG:toil.jobStores.fileJobStore:Path to job store directory is '/var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/tmpapjgSA'. STDERR: 2019-10-30T13:02:55.304012417Z INFO:toil.worker:Redirecting logging to /var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/toil-6eaa2009-daa9-43ba-9032-c1f2aabe0edc-3bdead962f93fd7fa17dcb3c0b3ee830/tmp2wtYgQ/worker_log.txt STDERR: 2019-10-30T13:02:55.314174502Z DEBUG:toil.jobStores.fileJobStore:Path to job store directory is '/var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/tmpapjgSA'. 
STDERR: 2019-10-30T13:02:55.314898864Z INFO:toil.worker:Redirecting logging to /var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/toil-6eaa2009-daa9-43ba-9032-c1f2aabe0edc-3bdead962f93fd7fa17dcb3c0b3ee830/tmp0lex1K/worker_log.txt STDERR: 2019-10-30T13:02:56.962340682Z INFO:toil.leader:Job ended successfully: 'https://raw.githubusercontent.com/Sage-Bionetworks/ChallengeWorkflowTemplates/v1.6/download_from_synapse.cwl' python w/U/jobhUqF1x STDERR: 2019-10-30T13:02:57.678736610Z INFO:toil.leader:Job ended successfully: 'https://raw.githubusercontent.com/Sage-Bionetworks/ChallengeWorkflowTemplates/v1.6/get_docker_config.cwl' python z/7/jobmkQZcE STDERR: 2019-10-30T13:02:57.943672788Z INFO:toil.leader:Job ended successfully: 'https://raw.githubusercontent.com/Sage-Bionetworks/ChallengeWorkflowTemplates/v1.6/get_submission_docker.cwl' python q/N/jobyZKH3S STDERR: 2019-10-30T13:02:57.944258839Z INFO:toil.leader:Issued job 'https://raw.githubusercontent.com/Sage-Bionetworks/ChallengeWorkflowTemplates/v1.6/validate_docker.cwl' python j/t/jobGZDB12 with job batch system ID: 5 and cores: 1, disk: 11.0 G, and memory: 100.0 M STDERR: 2019-10-30T13:02:57.981256869Z INFO:toil.leader:Job ended successfully: 'https://raw.githubusercontent.com/Sage-Bionetworks/ChallengeWorkflowTemplates/v1.6/notification_email.cwl' python N/y/jobOf38bF STDERR: 2019-10-30T13:02:58.239313113Z DEBUG:toil.jobStores.fileJobStore:Path to job store directory is '/var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/tmpapjgSA'. 
STDERR: 2019-10-30T13:02:58.240439597Z INFO:toil.worker:Redirecting logging to /var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/toil-6eaa2009-daa9-43ba-9032-c1f2aabe0edc-3bdead962f93fd7fa17dcb3c0b3ee830/tmpXO79F6/worker_log.txt STDERR: 2019-10-30T13:03:00.797088883Z INFO:toil.leader:Job ended successfully: 'https://raw.githubusercontent.com/Sage-Bionetworks/ChallengeWorkflowTemplates/v1.6/validate_docker.cwl' python j/t/jobGZDB12 STDERR: 2019-10-30T13:03:00.797879686Z INFO:toil.leader:Issued job 'https://raw.githubusercontent.com/Sage-Bionetworks/ChallengeWorkflowTemplates/v1.6/annotate_submission.cwl' python Y/f/jobK6mUTt with job batch system ID: 6 and cores: 1, disk: 11.0 G, and memory: 100.0 M STDERR: 2019-10-30T13:03:00.805536432Z INFO:toil.leader:Issued job 'file:///var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/EHR-challenge-develop/run_synthetic_training_docker.cwl' python 7/F/jobPoJ7tW with job batch system ID: 7 and cores: 1, disk: 11.0 G, and memory: 100.0 M STDERR: 2019-10-30T13:03:01.096898124Z DEBUG:toil.jobStores.fileJobStore:Path to job store directory is '/var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/tmpapjgSA'. STDERR: 2019-10-30T13:03:01.097983147Z INFO:toil.worker:Redirecting logging to /var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/toil-6eaa2009-daa9-43ba-9032-c1f2aabe0edc-3bdead962f93fd7fa17dcb3c0b3ee830/tmpvo1qHO/worker_log.txt STDERR: 2019-10-30T13:03:01.105255103Z DEBUG:toil.jobStores.fileJobStore:Path to job store directory is '/var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/tmpapjgSA'. 
STDERR: 2019-10-30T13:03:01.106359176Z INFO:toil.worker:Redirecting logging to /var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/toil-6eaa2009-daa9-43ba-9032-c1f2aabe0edc-3bdead962f93fd7fa17dcb3c0b3ee830/tmpKIk6Lz/worker_log.txt STDERR: 2019-10-30T13:03:03.822525298Z INFO:toil.leader:Job ended successfully: 'https://raw.githubusercontent.com/Sage-Bionetworks/ChallengeWorkflowTemplates/v1.6/annotate_submission.cwl' python Y/f/jobK6mUTt STDERR: 2019-10-30T13:08:29.828185316Z INFO:toil.leader:Job ended successfully: 'file:///var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/EHR-challenge-develop/run_synthetic_training_docker.cwl' python 7/F/jobPoJ7tW STDERR: 2019-10-30T13:08:29.828651318Z WARNING:toil.leader:The job seems to have left a log file, indicating failure: 'file:///var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/EHR-challenge-develop/run_synthetic_training_docker.cwl' python 7/F/jobPoJ7tW STDERR: 2019-10-30T13:08:29.828680928Z WARNING:toil.leader:7/F/jobPoJ7tW INFO:toil.worker:---TOIL WORKER OUTPUT LOG--- STDERR: 2019-10-30T13:08:29.828819259Z WARNING:toil.leader:7/F/jobPoJ7tW INFO:toil:Running Toil version 3.20.0-cf34ca3416697f2abc816b2538f20ee29ba16932. 
STDERR: 2019-10-30T13:08:29.828831428Z WARNING:toil.leader:7/F/jobPoJ7tW [job run_synthetic_training_docker.cwl] /var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/tmpapjgSA/3/4/out_tmpdirkgLK0J$ python \ STDERR: 2019-10-30T13:08:29.828949198Z WARNING:toil.leader:7/F/jobPoJ7tW runDocker.py \ STDERR: 2019-10-30T13:08:29.828968439Z WARNING:toil.leader:7/F/jobPoJ7tW -s \ STDERR: 2019-10-30T13:08:29.829064199Z WARNING:toil.leader:7/F/jobPoJ7tW 9694471 \ STDERR: 2019-10-30T13:08:29.829075110Z WARNING:toil.leader:7/F/jobPoJ7tW -p \ STDERR: 2019-10-30T13:08:29.829186180Z WARNING:toil.leader:7/F/jobPoJ7tW docker.synapse.org/syn21068802/baseline_svm_model \ STDERR: 2019-10-30T13:08:29.829193650Z WARNING:toil.leader:7/F/jobPoJ7tW -d \ STDERR: 2019-10-30T13:08:29.829289951Z WARNING:toil.leader:7/F/jobPoJ7tW sha256:bd14cac22d53fc50c402a9b27d1867b9d029f6645395a7d8d87c6512f4e48a6d \ STDERR: 2019-10-30T13:08:29.829300470Z WARNING:toil.leader:7/F/jobPoJ7tW --status \ STDERR: 2019-10-30T13:08:29.829393980Z WARNING:toil.leader:7/F/jobPoJ7tW VALIDATED \ STDERR: 2019-10-30T13:08:29.829400281Z WARNING:toil.leader:7/F/jobPoJ7tW --parentid \ STDERR: 2019-10-30T13:08:29.829533851Z WARNING:toil.leader:7/F/jobPoJ7tW syn21075229 \ STDERR: 2019-10-30T13:08:29.829541651Z WARNING:toil.leader:7/F/jobPoJ7tW -c \ STDERR: 2019-10-30T13:08:29.829651121Z WARNING:toil.leader:7/F/jobPoJ7tW /var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/tmpbDzzkx/stgd0610422-4515-4e2a-85d9-51da2b59a93f/.synapseConfig \ STDERR: 2019-10-30T13:08:29.829662622Z WARNING:toil.leader:7/F/jobPoJ7tW -i \ STDERR: 2019-10-30T13:08:29.829701692Z WARNING:toil.leader:7/F/jobPoJ7tW /home/thomasyu/train STDERR: 2019-10-30T13:08:29.829764352Z WARNING:toil.leader:7/F/jobPoJ7tW INFO:cwltool:[job run_synthetic_training_docker.cwl] 
/var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/tmpapjgSA/3/4/out_tmpdirkgLK0J$ python \ STDERR: 2019-10-30T13:08:29.829848412Z WARNING:toil.leader:7/F/jobPoJ7tW runDocker.py \ STDERR: 2019-10-30T13:08:29.829858142Z WARNING:toil.leader:7/F/jobPoJ7tW -s \ STDERR: 2019-10-30T13:08:29.829927882Z WARNING:toil.leader:7/F/jobPoJ7tW 9694471 \ STDERR: 2019-10-30T13:08:29.829984843Z WARNING:toil.leader:7/F/jobPoJ7tW -p \ STDERR: 2019-10-30T13:08:29.830045752Z WARNING:toil.leader:7/F/jobPoJ7tW docker.synapse.org/syn21068802/baseline_svm_model \ STDERR: 2019-10-30T13:08:29.830115193Z WARNING:toil.leader:7/F/jobPoJ7tW -d \ STDERR: 2019-10-30T13:08:29.830160653Z WARNING:toil.leader:7/F/jobPoJ7tW sha256:bd14cac22d53fc50c402a9b27d1867b9d029f6645395a7d8d87c6512f4e48a6d \ STDERR: 2019-10-30T13:08:29.830230563Z WARNING:toil.leader:7/F/jobPoJ7tW --status \ STDERR: 2019-10-30T13:08:29.830239164Z WARNING:toil.leader:7/F/jobPoJ7tW VALIDATED \ STDERR: 2019-10-30T13:08:29.830324994Z WARNING:toil.leader:7/F/jobPoJ7tW --parentid \ STDERR: 2019-10-30T13:08:29.830354204Z WARNING:toil.leader:7/F/jobPoJ7tW syn21075229 \ STDERR: 2019-10-30T13:08:29.830428784Z WARNING:toil.leader:7/F/jobPoJ7tW -c \ STDERR: 2019-10-30T13:08:29.830476174Z WARNING:toil.leader:7/F/jobPoJ7tW /var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/tmpbDzzkx/stgd0610422-4515-4e2a-85d9-51da2b59a93f/.synapseConfig \ STDERR: 2019-10-30T13:08:29.830509415Z WARNING:toil.leader:7/F/jobPoJ7tW -i \ STDERR: 2019-10-30T13:08:29.830567795Z WARNING:toil.leader:7/F/jobPoJ7tW /home/thomasyu/train STDERR: 2019-10-30T13:08:29.830620385Z WARNING:toil.leader:7/F/jobPoJ7tW Welcome, ehrdreamservice! 
STDERR: 2019-10-30T13:08:29.830709345Z WARNING:toil.leader:7/F/jobPoJ7tW STDERR: 2019-10-30T13:08:29.830717706Z WARNING:toil.leader:7/F/jobPoJ7tW root STDERR: 2019-10-30T13:08:29.830814656Z WARNING:toil.leader:7/F/jobPoJ7tW mounting volumes STDERR: 2019-10-30T13:08:29.830824335Z WARNING:toil.leader:7/F/jobPoJ7tW checking for containers STDERR: 2019-10-30T13:08:29.830920656Z WARNING:toil.leader:7/F/jobPoJ7tW running container STDERR: 2019-10-30T13:08:29.830928716Z WARNING:toil.leader:7/F/jobPoJ7tW creating logfile STDERR: 2019-10-30T13:08:29.831025806Z WARNING:toil.leader:7/F/jobPoJ7tW STDERR: 2019-10-30T13:08:29.831035407Z WARNING:toil.leader:7/F/jobPoJ7tW ################################################## STDERR: 2019-10-30T13:08:29.831136317Z WARNING:toil.leader:7/F/jobPoJ7tW Uploading file to Synapse storage STDERR: 2019-10-30T13:08:29.831144597Z WARNING:toil.leader:7/F/jobPoJ7tW ################################################## STDERR: 2019-10-30T13:08:29.831314038Z WARNING:toil.leader:7/F/jobPoJ7tW STDERR: 2019-10-30T13:08:29.831325557Z WARNING:toil.leader:7/F/jobPoJ7tW STDERR: 2019-10-30T13:08:29.831421938Z WARNING:toil.leader:7/F/jobPoJ7tW ################################################## STDERR: 2019-10-30T13:08:29.831431248Z WARNING:toil.leader:7/F/jobPoJ7tW Uploading file to Synapse storage STDERR: 2019-10-30T13:08:29.831527839Z WARNING:toil.leader:7/F/jobPoJ7tW ################################################## STDERR: 2019-10-30T13:08:29.831537139Z WARNING:toil.leader:7/F/jobPoJ7tW STDERR: 2019-10-30T13:08:29.831637749Z WARNING:toil.leader:7/F/jobPoJ7tW tar: unrecognized option '--remove-files.' STDERR: 2019-10-30T13:08:29.831646159Z WARNING:toil.leader:7/F/jobPoJ7tW Try 'tar --help' or 'tar --usage' for more information. 
STDERR: 2019-10-30T13:08:29.831744210Z WARNING:toil.leader:7/F/jobPoJ7tW finished training STDERR: 2019-10-30T13:08:29.831753199Z WARNING:toil.leader:7/F/jobPoJ7tW Traceback (most recent call last): STDERR: 2019-10-30T13:08:29.831853760Z WARNING:toil.leader:7/F/jobPoJ7tW File "runDocker.py", line 179, in STDERR: 2019-10-30T13:08:29.831862620Z WARNING:toil.leader:7/F/jobPoJ7tW main(args) STDERR: 2019-10-30T13:08:29.831961180Z WARNING:toil.leader:7/F/jobPoJ7tW File "runDocker.py", line 140, in main STDERR: 2019-10-30T13:08:29.831970800Z WARNING:toil.leader:7/F/jobPoJ7tW subprocess.check_call(tar_command) STDERR: 2019-10-30T13:08:29.832068871Z WARNING:toil.leader:7/F/jobPoJ7tW File "/usr/local/lib/python2.7/subprocess.py", line 190, in check_call STDERR: 2019-10-30T13:08:29.832077121Z WARNING:toil.leader:7/F/jobPoJ7tW raise CalledProcessError(retcode, cmd) STDERR: 2019-10-30T13:08:29.832216551Z WARNING:toil.leader:7/F/jobPoJ7tW subprocess.CalledProcessError: Command '['tar', '-C', '/var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/tmpapjgSA/3/4/out_tmpdirkgLK0J/model', '--remove-files.', '-cvzf', 'model_files.tar.gz']' returned non-zero exit status 64 STDERR: 2019-10-30T13:08:29.832233671Z WARNING:toil.leader:7/F/jobPoJ7tW [job run_synthetic_training_docker.cwl] Max memory used: 48MiB STDERR: 2019-10-30T13:08:29.832287001Z WARNING:toil.leader:7/F/jobPoJ7tW INFO:cwltool:[job run_synthetic_training_docker.cwl] Max memory used: 48MiB STDERR: 2019-10-30T13:08:29.832341751Z WARNING:toil.leader:7/F/jobPoJ7tW [job run_synthetic_training_docker.cwl] Job error: STDERR: 2019-10-30T13:08:29.832407181Z WARNING:toil.leader:7/F/jobPoJ7tW Error collecting output for parameter 'model': STDERR: 2019-10-30T13:08:29.832457022Z WARNING:toil.leader:7/F/jobPoJ7tW :1:1: Did not find output file with glob pattern: '['model_files.tar.gz']' STDERR: 2019-10-30T13:08:29.832562573Z WARNING:toil.leader:7/F/jobPoJ7tW ERROR:cwltool:[job 
run_synthetic_training_docker.cwl] Job error: STDERR: 2019-10-30T13:08:29.832574942Z WARNING:toil.leader:7/F/jobPoJ7tW Error collecting output for parameter 'model': STDERR: 2019-10-30T13:08:29.832678783Z WARNING:toil.leader:7/F/jobPoJ7tW :1:1: Did not find output file with glob pattern: '['model_files.tar.gz']' STDERR: 2019-10-30T13:08:29.832686073Z WARNING:toil.leader:7/F/jobPoJ7tW [job run_synthetic_training_docker.cwl] completed permanentFail STDERR: 2019-10-30T13:08:29.832767433Z WARNING:toil.leader:7/F/jobPoJ7tW WARNING:cwltool:[job run_synthetic_training_docker.cwl] completed permanentFail STDERR: 2019-10-30T13:08:29.832813953Z WARNING:toil.leader:7/F/jobPoJ7tW Traceback (most recent call last): STDERR: 2019-10-30T13:08:29.832906383Z WARNING:toil.leader:7/F/jobPoJ7tW File "/usr/local/lib/python2.7/site-packages/toil/worker.py", line 331, in workerScript STDERR: 2019-10-30T13:08:29.832916733Z WARNING:toil.leader:7/F/jobPoJ7tW job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore) STDERR: 2019-10-30T13:08:29.833027054Z WARNING:toil.leader:7/F/jobPoJ7tW File "/usr/local/lib/python2.7/site-packages/toil/job.py", line 1378, in _runner STDERR: 2019-10-30T13:08:29.833033654Z WARNING:toil.leader:7/F/jobPoJ7tW returnValues = self._run(jobGraph, fileStore) STDERR: 2019-10-30T13:08:29.833147714Z WARNING:toil.leader:7/F/jobPoJ7tW File "/usr/local/lib/python2.7/site-packages/toil/job.py", line 1323, in _run STDERR: 2019-10-30T13:08:29.833157995Z WARNING:toil.leader:7/F/jobPoJ7tW return self.run(fileStore) STDERR: 2019-10-30T13:08:29.833255814Z WARNING:toil.leader:7/F/jobPoJ7tW File "/usr/local/lib/python2.7/site-packages/toil/cwl/cwltoil.py", line 606, in run STDERR: 2019-10-30T13:08:29.833263714Z WARNING:toil.leader:7/F/jobPoJ7tW raise cwltool.errors.WorkflowException(status) STDERR: 2019-10-30T13:08:29.833363405Z WARNING:toil.leader:7/F/jobPoJ7tW WorkflowException: permanentFail STDERR: 2019-10-30T13:08:29.833373236Z WARNING:toil.leader:7/F/jobPoJ7tW 
ERROR:toil.worker:Exiting the worker because of a failed job on host da9f97c2dc10 STDERR: 2019-10-30T13:08:29.833478395Z WARNING:toil.leader:7/F/jobPoJ7tW WARNING:toil.jobGraph:Due to failure we are reducing the remaining retry count of job 'file:///var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/EHR-challenge-develop/run_synthetic_training_docker.cwl' python 7/F/jobPoJ7tW with ID 7/F/jobPoJ7tW to 0 STDERR: 2019-10-30T13:08:29.835113421Z WARNING:toil.leader:Job 'file:///var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/EHR-challenge-develop/run_synthetic_training_docker.cwl' python 7/F/jobPoJ7tW with ID 7/F/jobPoJ7tW is completely failed STDERR: 2019-10-30T13:08:39.431184843Z INFO:toil.leader:Finished toil run with 6 failed jobs. STDERR: 2019-10-30T13:08:39.431708226Z INFO:toil.leader:Failed jobs at end of the run: 'https://raw.githubusercontent.com/Sage-Bionetworks/ChallengeWorkflowTemplates/v1.6/get_submission_docker.cwl' python q/N/jobyZKH3S 'https://raw.githubusercontent.com/Sage-Bionetworks/ChallengeWorkflowTemplates/v1.6/validate_docker.cwl' python j/t/jobGZDB12 'https://raw.githubusercontent.com/Sage-Bionetworks/ChallengeWorkflowTemplates/v1.6/get_docker_config.cwl' python z/7/jobmkQZcE 'CWLWorkflow' 8/l/jobPnP2G8 'https://raw.githubusercontent.com/Sage-Bionetworks/ChallengeWorkflowTemplates/v1.6/download_from_synapse.cwl' python w/U/jobhUqF1x 'file:///var/lib/docker/volumes/workflow_orchestrator_shared/_data/69111817-7b12-4aca-a611-b691e3b8b092/EHR-challenge-develop/run_synthetic_training_docker.cwl' python 7/F/jobPoJ7tW STDERR: 2019-10-30T13:08:39.501627802Z Traceback (most recent call last): STDERR: 2019-10-30T13:08:39.501656111Z File "/usr/local/bin/toil-cwl-runner", line 8, in STDERR: 2019-10-30T13:08:39.501663921Z sys.exit(main()) STDERR: 2019-10-30T13:08:39.501670421Z File "/usr/local/lib/python2.7/site-packages/toil/cwl/cwltoil.py", line 1276, in main STDERR: 
2019-10-30T13:08:39.501741481Z outobj = toil.start(wf1) STDERR: 2019-10-30T13:08:39.501754672Z File "/usr/local/lib/python2.7/site-packages/toil/common.py", line 781, in start STDERR: 2019-10-30T13:08:39.501942392Z return self._runMainLoop(rootJobGraph) STDERR: 2019-10-30T13:08:39.501952743Z File "/usr/local/lib/python2.7/site-packages/toil/common.py", line 1054, in _runMainLoop STDERR: 2019-10-30T13:08:39.502211493Z jobCache=self._jobCache).run() STDERR: 2019-10-30T13:08:39.502222314Z File "/usr/local/lib/python2.7/site-packages/toil/leader.py", line 246, in run STDERR: 2019-10-30T13:08:39.502227953Z raise FailedJobsException(self.config.jobStore, self.toilState.totalFailedJobs, self.jobStore) STDERR: 2019-10-30T13:08:39.502263474Z toil.leader.FailedJobsException

Thanks,
Hezhe Qiao

Created by Qiao Hezhe QiaoHezhe
I believe we've solved this issue. We re-ran the three models that @avati, @ivanbrugere, and @QiaoHezhe submitted, so if those models failed they should've failed due to errors in the scripts, not errors in our pipeline. You can check your logs for more info. Thanks, Tim
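For anyone reading the log in the original post: the first real error is `tar: unrecognized option '--remove-files.'`, raised from `subprocess.check_call(tar_command)` in `runDocker.py`. The source of `runDocker.py` isn't shown in this thread, so the following is only a guess at how such an argument could arise. A missing comma between two adjacent string literals in a Python list would produce exactly this merged option. The `model_dir` value and both lists below are hypothetical reconstructions, not the pipeline's actual code.

```python
# Hypothetical reconstruction: the real runDocker.py is not shown in this thread.
model_dir = "out_tmpdir/model"  # placeholder path, not the real one

# Python concatenates adjacent string literals, so a missing comma after
# '--remove-files' silently merges it with the '.' that follows.
buggy = ["tar", "-C", model_dir, "--remove-files" ".", "-cvzf", "model_files.tar.gz"]

# With the comma in place, '.' is a separate argument (the files to archive).
fixed = ["tar", "-C", model_dir, "--remove-files", ".", "-cvzf", "model_files.tar.gz"]

print(buggy[3])                 # --remove-files.
print(len(buggy), len(fixed))   # 6 7
```

The buggy list has six elements, which matches the six-element command printed in the `CalledProcessError` message. That is consistent with this guess but doesn't prove it.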
We see the problem. Hang tight, we're working it out right now. Tim
I'm having a similar issue as well. The overall job fails: according to its log, training completes successfully, but the inference log isn't created at all.
I had a very similar log last evening with a Docker image that runs fine locally. For reference, my logs: https://www.synapse.org/#!Synapse:syn21074728 The training script seems to finish, but the inference log isn't created. Could there be a pipeline issue between training and testing?
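One way to narrow this kind of failure down is to split the flattened log back into individual records and read the worker output without the leader's wrapper. This is a minimal sketch, assuming records are delimited by `STDERR: <timestamp>` and that worker output is wrapped in a `WARNING:toil.leader:<job id>` prefix, as in the log quoted in this thread. The helper names `split_records` and `strip_leader_prefix` are made up for illustration and are not part of Toil or cwltool.

```python
import re

def split_records(raw):
    """Split a flattened log into one message per 'STDERR: <timestamp>' record."""
    parts = re.split(r"STDERR:\s+\S+Z\s*", raw)
    return [p.strip() for p in parts if p.strip()]

def strip_leader_prefix(message):
    """Drop a 'WARNING:toil.leader:<x>/<y>/job<id> ' prefix if one is present."""
    return re.sub(r"^WARNING:toil\.leader:\w/\w/job\w+\s*", "", message)

# Two records copied from the log in the original post.
sample = (
    "STDERR: 2019-10-30T13:08:29.831637749Z WARNING:toil.leader:7/F/jobPoJ7tW "
    "tar: unrecognized option '--remove-files.' "
    "STDERR: 2019-10-30T13:08:29.831744210Z WARNING:toil.leader:7/F/jobPoJ7tW "
    "finished training"
)

messages = [strip_leader_prefix(m) for m in split_records(sample)]
for m in messages:
    print(m)
```

With the records separated like this, the `tar` error shows up just before `finished training`, ahead of the later `Did not find output file` and `FailedJobsException` messages that follow from it.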
