First, with latest synapser release.
Next, with latest python client, installed via pip. Trace shown below.
Before this complete failure, the "Downloading ..." message looped between 0 and 2% complete, restarting at 0 several times.
Python 3.7.1 on ubuntu, good fast connection to the internet (South Lake Union, Seattle).
Rumor has it that these sorts of synapse download failures are not uncommon.
- Paul Shannon
206.658.3789
>>> x = syn.get('syn11714133', downloadLocation=".")
Downloading [--------------------]1.72% 264.0MB/14.9GB (891.9kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis01_2017-12-08_19.recalibrated_variants.vcf.gz.synapse_download_22409246
Traceback (most recent call$
File "/users/pshannon/anaconda3/lib/python3.7/site-packages/urllib3/response.py", line 360, in _error_catcher
yield
File "/users/pshannon/anaconda3/lib/python3.7/site-packages/urllib3/response.py", line 442, in read
data = self._fp.read(amt)
File "/users/pshannon/anaconda3/lib/python3.7/http/client.py", line 447, in read
n = self.readinto(b)
File "/users/pshannon/anaconda3/lib/python3.7/http/client.py", line 491, in readinto
n = self.fp.readinto(b)
File "/users/pshannon/anaconda3/lib/python3.7/socket.py", line 589, in readinto
return self._sock.recv_into(b)
File "/users/pshannon/anaconda3/lib/python3.7/ssl.py", line 1052, in recv_into
return self.read(nbytes, buffer)
File "/users/pshannon/anaconda3/lib/python3.7/ssl.py", line 911, in read
return self._sslobj.read(len, buffer)
ConnectionResetError: [Errno 104] Connection reset by peer
Created by Paul Shannon (paul-shannon)
Using docker.synapse.org/syn25326461/synpy-1128, downloading large vcf files is, alas, **still** unreliable:
```
>>> x = syn.get("syn11714079", downloadLocation="/tmp")
Downloading [####################]99.98% 41.9GB/41.9GB (47.9MB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis01_2017-12-08_3.recalibrated_variants.vcf.gz.synapse_download_74048101
```
This has been frozen for 30 minutes at 99.98% complete. There is plenty of disk space on the receiving end.
@jordank
Great it works now! Thanks a lot Jordan!!
Best,
Jiali @jzhuang_denovo
Test version 2.3.1.368 does not actually include the change. It is a later test build of 2.3.1, and the final 2.3.1 release ultimately did not include it. (The change was included preliminarily in some earlier 2.3.1 builds, including the latest one available as of 3/23, but it is now scheduled for 2.4 instead.)
You can try the change by installing an earlier 2.3.1 build that includes it, e.g. 2.3.1.326 (using the == qualifier). There is not yet a 2.4 test build.
```
pip3 install --upgrade --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple "synapseclient==2.3.1.326"
```
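To illustrate why the `==` qualifier matters here, below is a minimal sketch of how a `>=` requirement ends up selecting the newest build. It is a simplified stand-in for pip's resolution, not pip itself (real pip follows PEP 440), and `parse_build` and `pick_build` are hypothetical helper names. The build list contains only the two test builds named in this thread.

```python
def parse_build(version):
    """Turn a dotted build string such as '2.3.1.326' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def pick_build(available, minimum=None, exact=None):
    """Simplified stand-in for pip: the exact build, or the highest one >= minimum."""
    if exact is not None:
        return exact if exact in available else None
    candidates = [
        version for version in available
        if minimum is None or parse_build(version) >= parse_build(minimum)
    ]
    return max(candidates, key=parse_build) if candidates else None


# Only the two test builds named in this thread; the real index lists more.
builds = ["2.3.1.326", "2.3.1.368"]

print(pick_build(builds, minimum="2.3.1.316"))  # 2.3.1.368 (the build without the change)
print(pick_build(builds, exact="2.3.1.326"))    # 2.3.1.326 (the build with the change)
```

After installing, printing `synapseclient.__version__` in a Python session should confirm which build is actually in use.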
Could you try this and see whether it resolves the issue for you?
Thanks. I encountered a situation very similar to Paul's, and using version 2.3.1.368 with Jordan's code doesn't resolve it. Below are the error messages:
[jzhuang@localhost AMP]$ synapse --debug get syn10507730
Welcome, Jiali Zhuang!
2021-04-19 16:42:18,797 [client:434 - INFO]: Welcome, Jiali Zhuang!
Downloading [###############-----]76.46% 311.9MB/407.9MB (1.0MB/s) MSSM_all_quant_counts-matrix.txt.gz.synapse_download_73673569
2021-04-19 16:47:27,810 [client:1868 - DEBUG]:
Retrying download on error: [] after progressing 0 bytes
Traceback (most recent call last):
File "/opt/biomarker/anaconda3/lib/python3.7/site-packages/urllib3/response.py", line 331, in _error_catcher
yield
File "/opt/biomarker/anaconda3/lib/python3.7/site-packages/urllib3/response.py", line 413, in read
data = self._fp.read(amt)
File "/opt/biomarker/anaconda3/lib/python3.7/http/client.py", line 447, in read
n = self.readinto(b)
File "/opt/biomarker/anaconda3/lib/python3.7/http/client.py", line 491, in readinto
n = self.fp.readinto(b)
File "/opt/biomarker/anaconda3/lib/python3.7/socket.py", line 589, in readinto
return self._sock.recv_into(b)
File "/opt/biomarker/anaconda3/lib/python3.7/ssl.py", line 1049, in recv_into
return self.read(nbytes, buffer)
File "/opt/biomarker/anaconda3/lib/python3.7/ssl.py", line 908, in read
return self._sslobj.read(len, buffer)
ConnectionResetError: [Errno 104] Connection reset by peer
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/biomarker/anaconda3/lib/python3.7/site-packages/requests/models.py", line 753, in generate
for chunk in self.raw.stream(chunk_size, decode_content=True):
File "/opt/biomarker/anaconda3/lib/python3.7/site-packages/urllib3/response.py", line 465, in stream
data = self.read(amt=amt, decode_content=decode_content)
File "/opt/biomarker/anaconda3/lib/python3.7/site-packages/urllib3/response.py", line 430, in read
raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
File "/opt/biomarker/anaconda3/lib/python3.7/contextlib.py", line 130, in __exit__
self.gen.throw(type, value, traceback)
File "/opt/biomarker/anaconda3/lib/python3.7/site-packages/urllib3/response.py", line 349, in _error_catcher
raise ProtocolError('Connection broken: %r' % e, e)
urllib3.exceptions.ProtocolError: ("Connection broken: ConnectionResetError(104, 'Connection reset by peer')", ConnectionResetError(104, 'Connection reset by peer'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/biomarker/anaconda3/lib/python3.7/site-packages/synapseclient/client.py", line 1851, in _downloadFileHandle
expected_md5=fileHandle.get('contentMd5'))
File "/opt/biomarker/anaconda3/lib/python3.7/site-packages/synapseclient/client.py", line 1892, in _download_from_url_multi_threaded
multithread_download.download_file(self, request)
File "/opt/biomarker/anaconda3/lib/python3.7/site-packages/synapseclient/core/multithread_download/download_threads.py", line 232, in download_file
downloader.download_file(download_request)
File "/opt/biomarker/anaconda3/lib/python3.7/site-packages/synapseclient/core/multithread_download/download_threads.py", line 297, in download_file
self._write_chunks(request, completed_futures, transfer_status)
File "/opt/biomarker/anaconda3/lib/python3.7/site-packages/synapseclient/core/multithread_download/download_threads.py", line 372, in _write_chunks
chunk_data = chunk_response.content
File "/opt/biomarker/anaconda3/lib/python3.7/site-packages/requests/models.py", line 831, in content
self._content = b''.join(self.iter_content(CONTENT_CHUNK_SIZE)) or b''
File "/opt/biomarker/anaconda3/lib/python3.7/site-packages/requests/models.py", line 756, in generate
raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ("Connection broken: ConnectionResetError(104, 'Connection reset by peer')", ConnectionResetError(104, 'Connection reset by peer'))
@paul-shannon
Not yet. A resolution for this will be in the 2.4 release. A patch release (2.3.1) will be out shortly, but since I do not yet have an explanation for your most recent stack (the md5 mismatch error), which was encountered while using this fix, I wasn't able to include the fix in that release.
Hi @jordank,
Has your fix, as seen in the docker image synpy-1128, made it into the standard synapse docker image, so that I can switch back to that?
Thank you.
- Paul
@jordank Another result, very close, but with a last-minute failure:
I tried the new docker image `docker.synapse.org/syn25326461/synpy-1128` from home, on my laptop. Though very slow (< 2M/sec), it seemed to run robustly. However, there seems to have been one restart and then a failure, apparently when comparing MD5s on the downloaded file.
Full trace below.
```
x = syn.get('syn11714133', downloadLocation="/tmp")
Downloading [##------------------]12.18% 1.8GB/14.9GB (1.5MB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis01_2017-12-08_19.recalibrated_variants.vcf.gz.synapse_download_74048232
Downloading [####################]99.97% 14.9GB/14.9GB (1.7MB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis01_2017-12-08_19.recalibrated_variants.vcf.gz.synapse_download_74048232
Traceback (most recent call last):
File "", line 1, in
File "/synapsePythonClient/synapseclient/client.py", line 713, in get
return self._getWithEntityBundle(entityBundle=bundle, entity=entity, **kwargs)
File "/synapsePythonClient/synapseclient/client.py", line 829, in _getWithEntityBundle
self._download_file_entity(downloadLocation, entity, ifcollision, submission)
File "/synapsePythonClient/synapseclient/client.py", line 891, in _download_file_entity
downloadPath = self._downloadFileHandle(entity.dataFileHandleId, objectId, objectType, downloadPath)
File "/synapsePythonClient/synapseclient/client.py", line 1840, in _downloadFileHandle
expected_md5=fileHandle.get('contentMd5'))
File "/synapsePythonClient/synapseclient/client.py", line 1895, in _download_from_url_multi_threaded
filename=temp_destination, md5=actual_md5, expected_md5=expected_md5
synapseclient.core.exceptions.SynapseMd5MismatchError: Downloaded file /tmp/NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis01_2017-12-08_19.recalibrated_variants.vcf.gz.synapse_download_74048232's md5 789f9902c131511ef417139c6de7d78f does not match expected MD5 of d2193b7fd96e3feb02566c578de3a26c
```
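For anyone who wants to check a downloaded file by hand, here is a minimal sketch that computes a file's MD5 in chunks (so a 14.9GB file does not need to fit in memory) and compares it with the expected value from an error message like the one above. `file_md5` and `matches_expected` are hypothetical helper names, not part of synapseclient.

```python
import hashlib


def file_md5(path, chunk_size=8 * 1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file)
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_md5):
    """True if the file's MD5 equals the expected hex digest (case-insensitive)."""
    return file_md5(path) == expected_md5.strip().lower()
```

For example, `matches_expected(path_to_download, "d2193b7fd96e3feb02566c578de3a26c")` would report whether a given local copy matches the MD5 that Synapse expected for this file.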
Good progress, @jordank.
The new client runs about 10-15x faster on our linux server: 45M/sec (new) vs 3-6M/sec (old), and it completes the download without error.
On my laptop, from home, it runs at 1.5M/s and, at 14% complete, has not yet failed and restarted.
@abby.vanderlinden - how long did it take for you to download 14.9G?
I ran this on my local machine and the download completed with no issues.
Thanks, Jordan!
@paul-shannon
I've uploaded the image to the Synapse docker registry. You can download and run the image as follows:
```
# log in to the Synapse docker registry (replace YOUR_SYNAPSE_USERNAME with your own)
docker login -u YOUR_SYNAPSE_USERNAME docker.synapse.org
Password:
# pull the image
docker pull docker.synapse.org/syn25326461/synpy-1128
# run the image in an interactive shell
docker run -ti docker.synapse.org/syn25326461/synpy-1128 /bin/bash
```
Thanks!
@jordank Would it be too much to ask you to create a new docker image with these changes? That would be easier for me, and would ensure that version conflicts and uncertainties do not interfere.
Thanks @abby.vanderlinden .
Interestingly, Abby's error indicates a different exact cause (that error seems to be typically associated with an SSL protocol mismatch), whereas the error from Paul's stack suggests one of the concurrent download connections timing out.
I can't directly reproduce or fully explain either of them individually, but seeing two different intermittent underlying causes resulting in an error at the same spot in the code has me thinking about addressing this in a different way. I think that a change in the following test version of the Synapse client will make the download more robust to an unexpected error in one of the individual file part downloads. This test version is available from our test.pypi release and can be installed with, e.g.:
```
pip3 install --upgrade --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple "synapseclient>=2.3.1.316"
```
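As an illustration of what "more robust to an unexpected error in one of the individual file part downloads" could mean, here is a minimal sketch in which each part is retried on its own, so one reset connection does not fail the whole transfer. This is not the synapseclient implementation. `download_parts` and the `fetch_part(start, end)` callable are hypothetical, and the retry count is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def download_parts(fetch_part, part_ranges, max_workers=4, attempts_per_part=3):
    """Fetch byte ranges concurrently, retrying each failed part independently.

    fetch_part(start, end) is a caller-supplied (hypothetical) function that
    returns the bytes for one range and raises OSError on a network failure.
    ConnectionResetError is a subclass of OSError.
    """
    def fetch_with_retry(byte_range):
        start, end = byte_range
        for attempt in range(1, attempts_per_part + 1):
            try:
                return fetch_part(start, end)
            except OSError:
                # Give up only after the last allowed attempt for this part.
                if attempt == attempts_per_part:
                    raise

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the order of part_ranges, not completion order.
        return list(pool.map(fetch_with_retry, part_ranges))
```

The parts come back in range order, so `b"".join(download_parts(...))` reassembles the file contents. A real client would write each part to disk rather than hold everything in memory.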
@abby.vanderlinden and/or @paul-shannon, could I impose on you to install the above version and reattempt the download? (Again, I apologize that I'm unable to reliably reproduce this myself to confirm ahead of time that it will fix the issue.)
Thanks,
Jordan
Ok, here's the debug output from my download attempt:
synapse --debug get syn11714177
Downloading [####----------------]20.99% 376.0MB/1.7GB (604.5kB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis01_2017-12-08_Y.recalibrated_variants.vcf(1).gz.synapse_download_74048637
2021-03-23 10:41:46,677 [client:1856 - DEBUG]:
Retrying download on error: [] after progressing 0 bytes
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/urllib3/response.py", line 438, in _error_catcher
yield
File "/usr/local/lib/python3.9/site-packages/urllib3/response.py", line 519, in read
data = self._fp.read(amt) if not fp_closed else b""
File "/usr/local/Cellar/python@3.9/3.9.2_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 458, in read
n = self.readinto(b)
File "/usr/local/Cellar/python@3.9/3.9.2_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 502, in readinto
n = self.fp.readinto(b)
File "/usr/local/Cellar/python@3.9/3.9.2_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/socket.py", line 704, in readinto
return self._sock.recv_into(b)
File "/usr/local/Cellar/python@3.9/3.9.2_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/ssl.py", line 1241, in recv_into
return self.read(nbytes, buffer)
File "/usr/local/Cellar/python@3.9/3.9.2_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/ssl.py", line 1099, in read
return self._sslobj.read(len, buffer)
ConnectionResetError: [Errno 54] Connection reset by peer
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/requests/models.py", line 753, in generate
for chunk in self.raw.stream(chunk_size, decode_content=True):
File "/usr/local/lib/python3.9/site-packages/urllib3/response.py", line 576, in stream
data = self.read(amt=amt, decode_content=decode_content)
File "/usr/local/lib/python3.9/site-packages/urllib3/response.py", line 541, in read
raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
File "/usr/local/Cellar/python@3.9/3.9.2_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/contextlib.py", line 135, in __exit__
self.gen.throw(type, value, traceback)
File "/usr/local/lib/python3.9/site-packages/urllib3/response.py", line 455, in _error_catcher
raise ProtocolError("Connection broken: %r" % e, e)
urllib3.exceptions.ProtocolError: ("Connection broken: ConnectionResetError(54, 'Connection reset by peer')", ConnectionResetError(54, 'Connection reset by peer'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/synapseclient/client.py", line 1836, in _downloadFileHandle
downloaded_path = self._download_from_url_multi_threaded(fileHandleId,
File "/usr/local/lib/python3.9/site-packages/synapseclient/client.py", line 1881, in _download_from_url_multi_threaded
multithread_download.download_file(self, request)
File "/usr/local/lib/python3.9/site-packages/synapseclient/core/multithread_download/download_threads.py", line 232, in download_file
downloader.download_file(download_request)
File "/usr/local/lib/python3.9/site-packages/synapseclient/core/multithread_download/download_threads.py", line 297, in download_file
self._write_chunks(request, completed_futures, transfer_status)
File "/usr/local/lib/python3.9/site-packages/synapseclient/core/multithread_download/download_threads.py", line 372, in _write_chunks
chunk_data = chunk_response.content
File "/usr/local/lib/python3.9/site-packages/requests/models.py", line 831, in content
self._content = b''.join(self.iter_content(CONTENT_CHUNK_SIZE)) or b''
File "/usr/local/lib/python3.9/site-packages/requests/models.py", line 756, in generate
raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ("Connection broken: ConnectionResetError(54, 'Connection reset by peer')", ConnectionResetError(54, 'Connection reset by peer'))
Hey Paul and Jordan, I did see the restart error downloading one of the vcfs from this study using the command line client (latest version, Python 3.9.2). I will say that my internet connection where I am right now is not great, which may be part of it on my end. I'm going to try downloading the same file in debug mode today and I'll let you know how it goes.
@jordank
Check with Abby. She saw exactly the same failure, with multiple restarts, using the command line client. I don't know about her client-side configuration. Might be a clue there.
Unless otherwise specified, a docker container can use all the CPUs of the host machine, so running the synapse python client inside a container would have the same download concurrency as running it on the host, and could have the same outcome if concurrency is the cause. This theory does not straightforwardly explain why you might be experiencing the same phenomenon in synapser, however.
@jordank
I saw the same phenomenon with single-threaded synapser.
I see the same phenomenon with your Docker image on my macos laptop, with 4 cores.
I am not sure your analysis covers all these circumstances. But I do appreciate your efforts!
@paul-shannon
Thanks, the full stack is a bit more illuminating and confirms that the request on which the connection reset occurred was directly between the Python client and AWS S3. Most Synapse stored files, including this one, are stored in S3. Requests to Synapse infrastructure handle the bookkeeping, while the actual bytes are transferred directly between the clients and S3. We don't have much control over how S3 handles its connections (and don't have logs that would specifically explain why S3 reset a connection).
I had initially thought that this was not concurrency related since in your initial post you indicated that this error was also occurring in synapser, and synapser is single threaded (as is R).
However, the above stack and the fact that your machine has 88 processors are making me think that it is related to the concurrency of the download after all. The Python client is multi-threaded and downloads multiple parts of the requested file concurrently, with the default concurrency scaling according to the number of processors on the machine. A connection reset at this point in the stack might indicate that the client was not able to read bytes from all of its open connections in time, before S3 began resetting some of them. When you indicated in an earlier post that your machine had 88 processors, I tried to reproduce this on an EC2 M5.24xlarge with 96 processors and had no issues. However, an EC2 instance might have different performance characteristics given its throughput to S3. It might be that some constraint is preventing this number of download parts from being served concurrently on your machine (relative bandwidth, a ulimit on the number of threads per process or requests, differences in how the threads are scheduled).
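To check the kind of local constraints mentioned above, here is a minimal diagnostic sketch that reports the CPU count and the relevant per-process limits using only the standard library. It assumes a POSIX system (the `resource` module is not available on Windows), and `describe_limits` is a hypothetical helper name.

```python
import os
import resource  # POSIX only (Linux, macOS)


def describe_limits():
    """Collect the CPU count and the soft/hard limits that could cap concurrency."""
    soft_files, hard_files = resource.getrlimit(resource.RLIMIT_NOFILE)
    report = {
        "cpu_count": os.cpu_count(),
        # Each open connection uses a file descriptor.
        "open_files_soft": soft_files,
        "open_files_hard": hard_files,
    }
    # RLIMIT_NPROC is not defined on every platform; on Linux it also counts threads.
    if hasattr(resource, "RLIMIT_NPROC"):
        soft_procs, hard_procs = resource.getrlimit(resource.RLIMIT_NPROC)
        report["processes_soft"] = soft_procs
        report["processes_hard"] = hard_procs
    return report


if __name__ == "__main__":
    for name, value in describe_limits().items():
        # A value equal to resource.RLIM_INFINITY means "no limit".
        print(f"{name}: {value}")
```

Comparing this output between a machine where the download fails and one where it succeeds might help confirm or rule out the concurrency theory.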
If this is the cause, the concurrency of a download can be explicitly lowered, e.g.:
```
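import synapseclient

syn = synapseclient.login()
# lower the number of concurrent download threads (the default scales with CPU count)
syn.max_threads = 10
x = syn.get('syn11714133', downloadLocation=".")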
```
If this is the explanation, then a lower concurrency could allow the client to serve all its threads successfully. max_threads can also be set through the [.synapseConfig configuration file](https://python-docs.synapse.org/build/html/news.html#id16).
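One way to test the concurrency theory is to step the thread count down after each failed attempt. The sketch below uses only `syn.max_threads` and `syn.get(..., downloadLocation=...)` as shown above. The function name `get_with_fewer_threads` and the particular thread counts are assumptions, and the broad `except` is deliberate because it is not established here which exception the client raises once its own retries are exhausted.

```python
def get_with_fewer_threads(syn, entity_id, download_location=".",
                           thread_counts=(16, 8, 4, 1)):
    """Try a download with progressively lower concurrency until one succeeds.

    syn is a logged-in client object exposing max_threads and get(), as in the
    example above. The thread_counts values are arbitrary assumptions.
    """
    if not thread_counts:
        raise ValueError("thread_counts must contain at least one value")
    last_error = None
    for threads in thread_counts:
        syn.max_threads = threads
        try:
            return syn.get(entity_id, downloadLocation=download_location)
        except Exception as error:  # broad on purpose, see the note above
            last_error = error
            print(f"download with max_threads={threads} failed: {error!r}")
    # Every concurrency level failed; surface the most recent error.
    raise last_error
```

A call such as `get_with_fewer_threads(syn, 'syn11714133')` would then show at which concurrency level, if any, the download starts to succeed.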
Again, I'm not able to repro this, but it is the best explanation I can think of right now given the evidence.
@jordank Here's another thought. What do your server logs say? Maybe some insight can be gleaned there.
@jordank
By "run a dozen clients" I mean only, "Can you stress test your server? Can you do so (maybe you already are) from a remote host?"
My big beefy linux box has 64 cores, copious memory, lots of disk space, and was under very light loads during my test.
## "Connection reset by peer"
Which in this case seems to be that your server closed the connection. Perhaps the client has a retry loop? That's what I'd guess - and that the server keeps failing.
My diagnosis may be all wrong.
- Paul
synapsePythonClient> synapse --debug get syn11714133
Synapse username (leave blank if using an auth token): paul-shannon
paul-shannon
Password, api key, or auth token for user paul-shannon
Welcome, Paul Shannon!
2021-03-23 00:47:03,371 [client:428 - INFO]: Welcome, Paul Shannon!
Downloading [####----------------]22.06% 3.3GB/14.9GB (5.7MB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis01_2017-12-08_19.recalibrated_variants.vcf.gz.synapse_download_74048232
2021-03-23 00:56:57,986 [client:1857 - DEBUG]:
Retrying download on error: [] after progressing 0 bytes
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/urllib3-1.26.3-py3.6.egg/urllib3/response.py", line 438, in _error_catcher
yield
File "/usr/local/lib/python3.6/dist-packages/urllib3-1.26.3-py3.6.egg/urllib3/response.py", line 519, in read
data = self._fp.read(amt) if not fp_closed else b""
File "/usr/lib/python3.6/http/client.py", line 463, in read
n = self.readinto(b)
File "/usr/lib/python3.6/http/client.py", line 507, in readinto
n = self.fp.readinto(b)
File "/usr/lib/python3.6/socket.py", line 586, in readinto
return self._sock.recv_into(b)
File "/usr/lib/python3.6/ssl.py", line 1012, in recv_into
return self.read(nbytes, buffer)
File "/usr/lib/python3.6/ssl.py", line 874, in read
return self._sslobj.read(len, buffer)
File "/usr/lib/python3.6/ssl.py", line 631, in read
v = self._sslobj.read(len, buffer)
ConnectionResetError: [Errno 104] Connection reset by peer
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/requests-2.25.1-py3.6.egg/requests/models.py", line 753, in generate
for chunk in self.raw.stream(chunk_size, decode_content=True):
File "/usr/local/lib/python3.6/dist-packages/urllib3-1.26.3-py3.6.egg/urllib3/response.py", line 576, in stream
data = self.read(amt=amt, decode_content=decode_content)
File "/usr/local/lib/python3.6/dist-packages/urllib3-1.26.3-py3.6.egg/urllib3/response.py", line 541, in read
raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
File "/usr/lib/python3.6/contextlib.py", line 99, in __exit__
self.gen.throw(type, value, traceback)
File "/usr/local/lib/python3.6/dist-packages/urllib3-1.26.3-py3.6.egg/urllib3/response.py", line 455, in _error_catcher
raise ProtocolError("Connection broken: %r" % e, e)
urllib3.exceptions.ProtocolError: ("Connection broken: ConnectionResetError(104, 'Connection reset by peer')", ConnectionResetError(104, 'Connection reset by peer'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/synapseclient-2.3.0-py3.6.egg/synapseclient/client.py", line 1840, in _downloadFileHandle
expected_md5=fileHandle.get('contentMd5'))
File "/usr/local/lib/python3.6/dist-packages/synapseclient-2.3.0-py3.6.egg/synapseclient/client.py", line 1881, in _download_from_url_multi_threaded
multithread_download.download_file(self, request)
File "/usr/local/lib/python3.6/dist-packages/synapseclient-2.3.0-py3.6.egg/synapseclient/core/multithread_download/download_threads.py", line 232, in download_file
downloader.download_file(download_request)
File "/usr/local/lib/python3.6/dist-packages/synapseclient-2.3.0-py3.6.egg/synapseclient/core/multithread_download/download_threads.py", line 297, in download_file
self._write_chunks(request, completed_futures, transfer_status)
File "/usr/local/lib/python3.6/dist-packages/synapseclient-2.3.0-py3.6.egg/synapseclient/core/multithread_download/download_threads.py", line 372, in _write_chunks
chunk_data = chunk_response.content
File "/usr/local/lib/python3.6/dist-packages/requests-2.25.1-py3.6.egg/requests/models.py", line 831, in content
self._content = b''.join(self.iter_content(CONTENT_CHUNK_SIZE)) or b''
File "/usr/local/lib/python3.6/dist-packages/requests-2.25.1-py3.6.egg/requests/models.py", line 756, in generate
raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ("Connection broken: ConnectionResetError(104, 'Connection reset by peer')", ConnectionResetError(104, 'Connection reset by peer'))
Downloading [#-------------------]3.24% 496.0MB/14.9GB (6.1MB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis01_2017-12-08_19.recalibrated_variants.vcf.gz.synapse_download_74048232
data = self._fp.read(amt) if not fp_closed else b""
File "/usr/lib/python3.6/http/client.py", line 463, in read
n = self.readinto(b)
File "/usr/lib/python3.6/http/client.py", line 507, in readinto
n = self.fp.readinto(b)
File "/usr/lib/python3.6/socket.py", line 586, in readinto
return self._sock.recv_into(b)
File "/usr/lib/python3.6/ssl.py", line 1012, in recv_into
return self.read(nbytes, buffer)
File "/usr/lib/python3.6/ssl.py", line 874, in read
return self._sslobj.read(len, buffer)
File "/usr/lib/python3.6/ssl.py", line 631, in read
v = self._sslobj.read(len, buffer)
ConnectionResetError: [Errno 104] Connection reset by peer
During handling of the above exception, another exception occurred:
@paul-shannon Can you clarify what you mean by running a dozen clients? Are you also running some other activity that isn't represented by the syn.get download?
@jordank Is there any chance you could fire up a dozen clients yourself and try to reproduce the error on your end?
Maybe it is not out of place for me to recall that Abby saw this same problem, thus suggesting it is not specific to my use...
I will go ahead with the approach you suggest. Let me know if you will be trying some research also.
- Paul
Sorry for all the back and forth, but can you try the download from the command line with the debug option (either in the container or outside, since you seem to be getting a similar reset either way)?
The download starting over indicates that some unexpected exception was encountered, forcing the restart. Running the command-line download with the debug option should cause it to log that exception to the console.
```
synapse --debug get syn11714133
```
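For anyone following along, the restart-on-exception pattern described above can be sketched in plain Python. This is only an illustration and not how synapseclient is implemented: `call_with_retries` and its parameters are hypothetical names, and the set of exceptions worth retrying is an assumption. The point is that each intermediate error gets logged at debug level before the next attempt, which is the kind of information the debug option is meant to surface.

```python
import logging
import time

logging.basicConfig(level=logging.DEBUG,
                    format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("download_retry")  # hypothetical logger name


def call_with_retries(func, attempts=5, base_delay=1.0,
                      retry_on=(ConnectionError,), sleep=time.sleep):
    """Call func(); on a retryable error, log it and retry with backoff.

    retry_on defaults to the built-in ConnectionError family, which
    includes ConnectionResetError (Errno 104). The last failure is
    re-raised once all attempts are used up.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on as exc:
            # Log the intermediate error instead of swallowing it silently.
            log.debug("attempt %d/%d failed: %r", attempt, attempts, exc)
            if attempt == attempts:
                raise
            # Exponential backoff: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

A download call could be wrapped as `call_with_retries(lambda: syn.get('syn11714133', downloadLocation="/tmp"))`. Note that `requests.exceptions.ChunkedEncodingError`, seen in the traces in this thread, is not a subclass of the built-in `ConnectionError`, so it would have to be added to `retry_on` explicitly. Whether a repeated call resumes a partial file or starts from zero depends on the client, not on this wrapper.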
The docker 14.9G vcf.gz download just spontaneously restarted after being about 40% complete.
>>> x = syn.get('syn11714133', downloadLocation="/tmp")
x = syn.get('syn11714133', downloadLocation="/tmp")
Downloading [#-------------------]6.32% 968.0MB/14.9GB (2.3MB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis01_2017-12-08_19.recalibrated_variants.vcf.gz.synapse_download_74048232
This is with your latest docker image:
sagebionetworks/synapsepythonclient latest 5272bfe96931 11 days ago 211MB
I cannot be certain, but I have no reason to believe that the ISB's fast internet pipe is at fault.
@jordank
I am now 33% complete on a 15G vcf.gz file, so I am optimistic.
In case a future reader is curious about the specifics of the docker "syn.get", here are my notes
```
docker run -it \
--entrypoint /bin/bash \
-v /tmp/docker-paul:/tmp \
sagebionetworks/synapsepythonclient
--- sample bash and python run within synapsepythonclient container
export PS1='DOCKER.\W> '
python3
import synapseclient
syn = synapseclient.login("paul-shannon", password="passwordGoesHere")
x = syn.get('syn1899498', downloadLocation="/tmp") # quick and easy
```
I'll give it a try.
@paul-shannon
I did not push the image, but the Dockerfile and its context files are linked in the [gist](https://gist.github.com/jkiang13/967d8140b32ef25c5718c16c07577b10#file-dockerfile). That image was my attempt to replicate your environment, however. You can instead pull the synapseclient docker image on [Docker Hub](https://hub.docker.com/r/sagebionetworks/synapsepythonclient), e.g.
```
docker pull sagebionetworks/synapsepythonclient
```
Running the client from within a container run from that image would be a clean install without any other dependencies, and would be a good data point whether it reproduces the issue or completes successfully.
@jordank Perhaps a better approach, less fuss, faster resolution, would be for me to use your docker image. Is it publicly available?
@paul-shannon
I created the following [Dockerfile](https://gist.github.com/jkiang13/967d8140b32ef25c5718c16c07577b10#file-dockerfile) as an attempt to recreate the environment above as closely as I could, but could not replicate the error, which suggests the issue is not related to a particular interfering dependency or to anything else environmental that was replicated in the Docker image.
Could you try running the download from the command line using the debug option? If the download progress bar is being reset it suggests an intermediate exception that synapse is trying to recover from. Running the command line download with the debug option should log any intermediate errors that caused the exception to the console before restarting the download.
e.g.
```
synapse --debug get syn11714133
```
Here you go. (I looked for a file upload capability here, without success. It'd be a handy feature. What follows is from cut and paste.)
- Paul
khaleesi.vcf> synapse --version
Synapse Client 2.3.0
khaleesi.vcf> conda list
# packages in environment at /users/pshannon/anaconda3:
#
# Name Version Build Channel
_ipyw_jlab_nb_ext_conf 0.1.0 py37_0
alabaster 0.7.12 py37_0
anaconda 2018.12 py37_0
anaconda-client 1.7.2 py37_0
anaconda-navigator 1.9.6 py37_0
anaconda-project 0.8.2 py37_0
asn1crypto 0.24.0 py37_0
astroid 2.1.0 py37_0
astropy 3.1 py37h7b6447c_0
atomicwrites 1.2.1 py37_0
attrs 18.2.0 py37h28b3542_0
babel 2.6.0 py37_0
backcall 0.1.0 py37_0
backports 1.0 py37_1
backports.os 0.1.1 py37_0
backports.shutil_get_terminal_size 1.0.0 py37_2
beautifulsoup4 4.6.3 py37_0
bitarray 0.8.3 py37h14c3975_0
bkcharts 0.2 py37_0
blas 1.0 mkl
blaze 0.11.3 py37_0
bleach 3.0.2 py37_0
blosc 1.14.4 hdbcaa40_0
bokeh 1.0.2 py37_0
boto 2.49.0 py37_0
bottleneck 1.2.1 py37h035aef0_1
bzip2 1.0.6 h14c3975_5
ca-certificates 2018.03.07 0
cairo 1.14.12 h8948797_3
certifi 2018.11.29 py37_0
cffi 1.11.5 py37he75722e_1
chardet 3.0.4 py37_1
click 7.0 py37_0
cloudpickle 0.6.1 py37_0
clyent 1.2.2 py37_1
colorama 0.4.1 py37_0
conda 4.5.12 py37_0
conda-build 3.17.6 py37_0
conda-env 2.6.0 1
conda-verify 3.1.1 py37_0
contextlib2 0.5.5 py37_0
cryptography 2.4.2 py37h1ba5d50_0
curl 7.63.0 hbc83047_1000
cycler 0.10.0 py37_0
cython 0.29.2 py37he6710b0_0
cytoolz 0.9.0.1 py37h14c3975_1
dask 1.0.0 py37_0
dask-core 1.0.0 py37_0
datashape 0.5.4 py37_1
dbus 1.13.2 h714fa37_1
decorator 4.3.0 py37_0
defusedxml 0.5.0 py37_1
Deprecated 1.2.12
distributed 1.25.1 py37_0
docutils 0.14 py37_0
entrypoints 0.2.3 py37_2
et_xmlfile 1.0.1 py37_0
expat 2.2.6 he6710b0_0
fastcache 1.0.2 py37h14c3975_2
filelock 3.0.10 py37_0
flask 1.0.2 py37_1
flask-cors 3.0.7 py37_0
fontconfig 2.13.0 h9420a91_0
freetype 2.9.1 h8a8886c_1
fribidi 1.0.5 h7b6447c_0
future 0.17.1 py37_0
get_terminal_size 1.0.0 haa9412d_0
gevent 1.3.7 py37h7b6447c_1
glib 2.56.2 hd408876_0
glob2 0.6 py37_1
gmp 6.1.2 h6c8ec71_1
gmpy2 2.0.8 py37h10f8cd9_2
graphite2 1.3.12 h23475e2_2
greenlet 0.4.15 py37h7b6447c_0
gst-plugins-base 1.14.0 hbbd80ab_1
gstreamer 1.14.0 hb453b48_1
h5py 2.8.0 py37h989c5e5_3
harfbuzz 1.8.8 hffaf4a1_0
hdf5 1.10.2 hba1933b_1
heapdict 1.0.0 py37_2
html5lib 1.0.1 py37_0
icu 58.2 h9c2bf20_1
idna 2.8 py37_0
imageio 2.4.1 py37_0
imagesize 1.1.0 py37_0
importlib_metadata 0.6 py37_0
intel-openmp 2019.1 144
ipykernel 5.1.0 py37h39e3cac_0
ipython 7.2.0 py37h39e3cac_0
ipython_genutils 0.2.0 py37_0
ipywidgets 7.4.2 py37_0
isort 4.3.4 py37_0
itsdangerous 1.1.0 py37_0
jbig 2.1 hdba287a_0
jdcal 1.4 py37_0
jedi 0.13.2 py37_0
jeepney 0.4 py37_0
jinja2 2.10 py37_0
jpeg 9b h024ee3a_2
jsonschema 2.6.0 py37_0
jupyter 1.0.0 py37_7
jupyter_client 5.2.4 py37_0
jupyter_console 6.0.0 py37_0
jupyter_core 4.4.0 py37_0
jupyterlab 0.35.3 py37_0
jupyterlab_server 0.2.0 py37_0
keyring 12.0.2
keyrings.alt 3.1
kiwisolver 1.0.1 py37hf484d3e_0
krb5 1.16.1 h173b8e3_7
lazy-object-proxy 1.3.1 py37h14c3975_2
libarchive 3.3.3 h5d8350f_5
libcurl 7.63.0 h20c2e04_1000
libedit 3.1.20170329 h6b74fdf_2
libffi 3.2.1 hd88cf55_4
libgcc-ng 8.2.0 hdf63c60_1
libgfortran-ng 7.3.0 hdf63c60_0
liblief 0.9.0 h7725739_1
libpng 1.6.35 hbc83047_0
libsodium 1.0.16 h1bed415_0
libssh2 1.8.0 h1ba5d50_4
libstdcxx-ng 8.2.0 hdf63c60_1
libtiff 4.0.9 he85c1e1_2
libtool 2.4.6 h7b6447c_5
libuuid 1.0.3 h1bed415_2
libxcb 1.13 h1bed415_1
libxml2 2.9.8 h26e45fe_1
libxslt 1.1.32 h1312cb7_0
llvmlite 0.26.0 py37hd408876_0
locket 0.2.0 py37_1
lxml 4.2.5 py37hefd8a0e_0
lz4-c 1.8.1.2 h14c3975_0
lzo 2.10 h49e0be7_2
markupsafe 1.1.0 py37h7b6447c_0
matplotlib 3.0.2 py37h5429711_0
mccabe 0.6.1 py37_1
mistune 0.8.4 py37h7b6447c_0
mkl 2019.1 144
mkl-service 1.1.2 py37he904b0f_5
mkl_fft 1.0.6 py37hd81dba3_0
mkl_random 1.0.2 py37hd81dba3_0
more-itertools 4.3.0 py37_0
mpc 1.1.0 h10f8cd9_1
mpfr 4.0.1 hdf1c602_3
mpmath 1.1.0 py37_0
msgpack-python 0.5.6 py37h6bb024c_1
multipledispatch 0.6.0 py37_0
navigator-updater 0.2.1 py37_0
nbconvert 5.4.0 py37_1
nbformat 4.4.0 py37_0
ncurses 6.1 he6710b0_1
networkx 2.2 py37_1
nltk 3.4 py37_1
nose 1.3.7 py37_2
notebook 5.7.4 py37_0
numba 0.41.0 py37h962f231_0
numexpr 2.6.8 py37h9e4a6bb_0
numpy 1.15.4 py37h7e9f1db_0
numpy-base 1.15.4 py37hde5b4d6_0
numpydoc 0.8.0 py37_0
odo 0.5.1 py37_0
olefile 0.46 py37_0
openpyxl 2.5.12 py37_0
openssl 1.1.1a h7b6447c_0
packaging 18.0 py37_0
pandas 0.23.4 py37h04863e7_0
pandoc 1.19.2.1 hea2e7c5_1
pandocfilters 1.4.2 py37_1
pango 1.42.4 h049681c_0
parso 0.3.1 py37_0
partd 0.3.9 py37_0
patchelf 0.9 he6710b0_3
path.py 11.5.0 py37_0
pathlib2 2.3.3 py37_0
patsy 0.5.1 py37_0
pcre 8.42 h439df22_0
pep8 1.7.1 py37_0
pexpect 4.6.0 py37_0
pickleshare 0.7.5 py37_0
pillow 5.3.0 py37h34e0f95_0
pip 21.0.1
pip 18.1 py37_0
pixman 0.34.0 hceecf20_3
pkginfo 1.4.2 py37_1
pluggy 0.8.0 py37_0
ply 3.11 py37_0
prometheus_client 0.5.0 py37_0
prompt_toolkit 2.0.7 py37_0
psutil 5.4.8 py37h7b6447c_0
ptyprocess 0.6.0 py37_0
py 1.7.0 py37_0
py-lief 0.9.0 py37h7725739_1
pycodestyle 2.4.0 py37_0
pycosat 0.6.3 py37h14c3975_0
pycparser 2.19 py37_0
pycrypto 2.6.1 py37h14c3975_9
pycurl 7.43.0.2 py37h1ba5d50_0
pyflakes 2.0.0 py37_0
pygments 2.3.1 py37_0
pylint 2.2.2 py37_0
pyodbc 4.0.25 py37he6710b0_0
pyopenssl 18.0.0 py37_0
pyparsing 2.3.0 py37_0
pyqt 5.9.2 py37h05f1152_2
pysocks 1.6.8 py37_0
pytables 3.4.4 py37ha205bf6_0
pytest 4.0.2 py37_0
pytest-arraydiff 0.3 py37h39e3cac_0
pytest-astropy 0.5.0 py37_0
pytest-doctestplus 0.2.0 py37_0
pytest-openfiles 0.3.1 py37_0
pytest-remotedata 0.3.1 py37_0
python 3.7.1 h0371630_7
python-dateutil 2.7.5 py37_0
python-libarchive-c 2.8 py37_6
pytz 2018.7 py37_0
pywavelets 1.0.1 py37hdd07704_0
pyyaml 3.13 py37h14c3975_0
pyzmq 17.1.2 py37h14c3975_0
qt 5.9.7 h5867ecd_1
qtawesome 0.5.3 py37_0
qtconsole 4.4.3 py37_0
qtpy 1.5.2 py37_0
readline 7.0 h7b6447c_5
requests 2.25.1
requests 2.21.0 py37_0
rope 0.11.0 py37_0
ruamel_yaml 0.15.46 py37h14c3975_0
scikit-image 0.14.1 py37he6710b0_0
scikit-learn 0.20.1 py37hd81dba3_0
scipy 1.1.0 py37h7c811a0_2
seaborn 0.9.0 py37_0
SecretStorage 2.3.1
secretstorage 3.1.0 py37_0
send2trash 1.5.0 py37_0
setuptools 40.6.3 py37_0
simplegeneric 0.8.1 py37_2
singledispatch 3.4.0.3 py37_0
sip 4.19.8 py37hf484d3e_0
six 1.12.0 py37_0
snappy 1.1.7 hbae5bb6_3
snowballstemmer 1.2.1 py37_0
sortedcollections 1.0.1 py37_0
sortedcontainers 2.1.0 py37_0
sphinx 1.8.2 py37_0
sphinxcontrib 1.0 py37_1
sphinxcontrib-websupport 1.1.0 py37_1
spyder 3.3.2 py37_0
spyder-kernels 0.3.0 py37_0
sqlalchemy 1.2.15 py37h7b6447c_0
sqlite 3.26.0 h7b6447c_0
statsmodels 0.9.0 py37h035aef0_0
sympy 1.3 py37_0
synapseclient 2.3.0
tblib 1.3.2 py37_0
terminado 0.8.1 py37_1
testpath 0.4.2 py37_0
tk 8.6.8 hbc83047_0
toolz 0.9.0 py37_0
tornado 5.1.1 py37h7b6447c_0
tqdm 4.28.1 py37h28b3542_0
traitlets 4.3.2 py37_0
unicodecsv 0.14.1 py37_0
unixodbc 2.3.7 h14c3975_0
urllib3 1.24.1 py37_0
wcwidth 0.1.7 py37_0
webencodings 0.5.1 py37_1
werkzeug 0.14.1 py37_0
wheel 0.32.3 py37_0
widgetsnbextension 3.4.2 py37_0
wrapt 1.10.11 py37h14c3975_2
wurlitzer 1.0.2 py37_0
xlrd 1.2.0 py37_0
xlsxwriter 1.1.2 py37_0
xlwt 1.3.0 py37_0
xz 5.2.4 h14c3975_4
yaml 0.1.7 had09818_2
zeromq 4.2.5 hf484d3e_1
zict 0.1.3 py37_0
zlib 1.2.11 h7b6447c_3
zstd 1.3.7 h0b5b093_0
khaleesi.vcf> pip list
Package Version
---------------------------------- ----------
alabaster 0.7.12
anaconda-client 1.7.2
anaconda-navigator 1.9.6
anaconda-project 0.8.2
asn1crypto 0.24.0
astroid 2.1.0
astropy 3.1
atomicwrites 1.2.1
attrs 18.2.0
Babel 2.6.0
backcall 0.1.0
backports.os 0.1.1
backports.shutil-get-terminal-size 1.0.0
beautifulsoup4 4.6.3
bitarray 0.8.3
bkcharts 0.2
blaze 0.11.3
bleach 3.0.2
bokeh 1.0.2
boto 2.49.0
Bottleneck 1.2.1
certifi 2018.11.29
cffi 1.11.5
chardet 3.0.4
Click 7.0
cloudpickle 0.6.1
clyent 1.2.2
colorama 0.4.1
conda 4.5.12
conda-build 3.17.6
conda-verify 3.1.1
contextlib2 0.5.5
cryptography 2.4.2
cycler 0.10.0
Cython 0.29.2
cytoolz 0.9.0.1
dask 1.0.0
datashape 0.5.4
decorator 4.3.0
defusedxml 0.5.0
Deprecated 1.2.12
distributed 1.25.1
docutils 0.14
entrypoints 0.2.3
et-xmlfile 1.0.1
fastcache 1.0.2
filelock 3.0.10
Flask 1.0.2
Flask-AutoIndex 0.6.2
Flask-Cors 3.0.7
Flask-Silk 0.2
future 0.17.1
gevent 1.3.7
glob2 0.6
gmpy2 2.0.8
greenlet 0.4.15
h5py 2.8.0
heapdict 1.0.0
html5lib 1.0.1
idna 2.8
imageio 2.4.1
imagesize 1.1.0
importlib-metadata 0.6
ipykernel 5.1.0
ipython 7.2.0
ipython-genutils 0.2.0
ipywidgets 7.4.2
isort 4.3.4
itsdangerous 1.1.0
jdcal 1.4
jedi 0.13.2
jeepney 0.4
Jinja2 2.10
jsonschema 2.6.0
jupyter 1.0.0
jupyter-client 5.2.4
jupyter-console 6.0.0
jupyter-core 4.4.0
jupyterlab 0.35.3
jupyterlab-server 0.2.0
keyring 12.0.2
keyrings.alt 3.1
kiwisolver 1.0.1
lazy-object-proxy 1.3.1
libarchive-c 2.8
lief 0.9.0
llvmlite 0.26.0
locket 0.2.0
lxml 4.2.5
MarkupSafe 1.1.0
matplotlib 3.0.2
mccabe 0.6.1
mistune 0.8.4
mkl-fft 1.0.6
mkl-random 1.0.2
more-itertools 4.3.0
mpmath 1.1.0
msgpack 0.5.6
multipledispatch 0.6.0
navigator-updater 0.2.1
nbconvert 5.4.0
nbformat 4.4.0
networkx 2.2
nltk 3.4
nose 1.3.7
notebook 5.7.4
numba 0.41.0
numexpr 2.6.8
numpy 1.15.4
numpydoc 0.8.0
odo 0.5.1
olefile 0.46
openpyxl 2.5.12
packaging 18.0
pandas 0.23.4
pandocfilters 1.4.2
parso 0.3.1
partd 0.3.9
path.py 11.5.0
pathlib2 2.3.3
patsy 0.5.1
pep8 1.7.1
pexpect 4.6.0
pickleshare 0.7.5
Pillow 5.3.0
pip 21.0.1
pkginfo 1.4.2
pluggy 0.8.0
ply 3.11
prometheus-client 0.5.0
prompt-toolkit 2.0.7
psutil 5.4.8
ptyprocess 0.6.0
py 1.7.0
pycodestyle 2.4.0
pycosat 0.6.3
pycparser 2.19
pycrypto 2.6.1
pycurl 7.43.0.2
pyflakes 2.0.0
Pygments 2.3.1
pylint 2.2.2
pyodbc 4.0.25
pyOpenSSL 18.0.0
pyparsing 2.3.0
PySocks 1.6.8
pytest 4.0.2
pytest-arraydiff 0.3
pytest-astropy 0.5.0
pytest-doctestplus 0.2.0
pytest-openfiles 0.3.1
pytest-remotedata 0.3.1
python-dateutil 2.7.5
pytz 2018.7
PyWavelets 1.0.1
PyYAML 3.13
pyzmq 17.1.2
QtAwesome 0.5.3
qtconsole 4.4.3
QtPy 1.5.2
requests 2.25.1
rope 0.11.0
ruamel-yaml 0.15.46
scikit-image 0.14.1
scikit-learn 0.20.1
scipy 1.1.0
seaborn 0.9.0
SecretStorage 2.3.1
Send2Trash 1.5.0
setuptools 40.6.3
simplegeneric 0.8.1
singledispatch 3.4.0.3
six 1.12.0
snowballstemmer 1.2.1
sortedcollections 1.0.1
sortedcontainers 2.1.0
Sphinx 1.8.2
sphinxcontrib-websupport 1.1.0
spyder 3.3.2
spyder-kernels 0.3.0
SQLAlchemy 1.2.15
statsmodels 0.9.0
sympy 1.3
synapseclient 2.3.0
tables 3.4.4
tblib 1.3.2
terminado 0.8.1
testpath 0.4.2
toolz 0.9.0
tornado 5.1.1
tqdm 4.28.1
traitlets 4.3.2
unicodecsv 0.14.1
urllib3 1.24.1
wcwidth 0.1.7
webencodings 0.5.1
Werkzeug 0.14.1
wheel 0.32.3
widgetsnbextension 3.4.2
wrapt 1.10.11
wurlitzer 1.0.2
xlrd 1.2.0
XlsxWriter 1.1.2
xlwt 1.3.0
zict 0.1.3
khaleesi.vcf> lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.4 LTS
Release: 18.04
Codename: bionic
khaleesi.vcf> openssl version
OpenSSL 1.1.1a 20 Nov 2018
khaleesi.vcf> df -k .
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/khaleesi--vg-local 8587847648 5342818616 3245029032 63% /local
khaleesi.vcf> cat /proc/cpuinfo | grep processor | wc -l
88
Hi @paul-shannon,
I tried downloading this particular file in a few Ubuntu environments (e.g. a 20.04 container from my home connection and an 18.04 EC2 instance) and was not immediately able to reproduce this; the download completed successfully, e.g.
```
(py371_synapse230) ubuntu@ip-10-11-58-208:~/venvs$ python
Python 3.7.1 (default, Mar 20 2021, 18:54:52)
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import synapseclient
>>> syn = synapseclient.login()
Welcome, Jordan K!!
>>> x = syn.get('syn11714133', downloadLocation=".")
Downloading [####################]100.00% 14.9GB/14.9GB (141.7MB/s) NIA_JG_1898_samples_GRM_WGS_b37_JointAnalysis01_2017-12-08_19.recalibrated_variants.vcf.gz.synapse_download_22409246 Done...
```
Could you show the output of the following commands so that I can try reproducing this in a more exact environment?
```
synapse --version
# if you are running in a conda environment
conda list
pip list
lsb_release -a
openssl version
df -k .
cat /proc/cpuinfo | grep processor | wc -l
```
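If it is easier to collect some of this from inside the same Python interpreter that runs the download, here is a rough standard-library sketch. `environment_report` is a hypothetical helper, not part of synapseclient, and it covers only part of the list above (no package listing). One caveat: `ssl.OPENSSL_VERSION` reports the OpenSSL library that Python's `ssl` module was built against, which can differ from what the `openssl version` command prints.

```python
import os
import platform
import shutil
import ssl
import sys


def environment_report(path="."):
    """Collect basic facts about the interpreter and the download volume."""
    usage = shutil.disk_usage(path)  # total/used/free bytes for the volume holding path
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "openssl": ssl.OPENSSL_VERSION,  # OpenSSL as seen by Python's ssl module
        "cpu_count": os.cpu_count(),
        "free_bytes": usage.free,
        "total_bytes": usage.total,
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Passing the intended download directory as `path` shows the free space on the volume the file will actually land on.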
Older versions of the synapseclient (< 2.2) could exhibit download restarts when the available disk space on the download volume was exhausted, but if you are running the latest version (2.3.0) this should not be the issue.
Thanks!
@jordank could you please have a look?