Hi all,
I am attempting to download ~500 FASTQ files (a mixture of I1, R1, and R2 files from single-cell data, ranging in size from 100 MB to ~3 GB) by running 'synapse get -r syn18638475' from the command line. Each time I run this command, some, but not all, of the files are downloaded. I am assuming that when I run the command again, it picks up where it left off.
I have run this command many times over the past few days. At this point I believe I have most of the files smaller than 1 GB, but the client appears to struggle with files larger than 2 GB. I am also seeing this message: 'Breaking lock whose age is: #######'. Does this mean that the synapse command is giving up on files that take a long time to download? I am also confused because each time I run the command it reports that multiple GB of data have been downloaded, yet the list of downloaded files barely changes. Are there other options that I can add to the synapse command to fix this problem? I am downloading these files directly to my university's server and running the command on the Linux HPC cluster.
Best,
Brian
Created by Brian Herb brianrherb

Hi @jordank,
Thank you for your help. It appears that all files from syn18638475 have downloaded successfully using the new version of the Synapse client. I did not see any errors, just one possible warning at the start:
2021-04-25 09:15:45,224 [retry:100 - DEBUG]: Too many concurrent requests. Allowed 3 concurrent connections at any time.
Downloading ~504 GB took about 3 hours, and the node I was on had 48 processors.
Best,
Brian

Hi @brianrherb,
Could you try installing the following pre-release version of the Synapse client and then repeating the command with the --debug option:
```
# install pre-release version
pip3 install --upgrade --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple "synapseclient>=2.4.0.426"
# repeat command with --debug
synapse --debug get -r syn18638475
```
If any errors or stack traces appear while the above command runs, could you reply with them here? Also, could you let me know the number of processors on this cluster:
```
# get the number of processors
grep -c '^processor' /proc/cpuinfo
```
I know of one other case where the large number of CPUs on a cluster caused a download bottleneck (the synapseclient picks its level of concurrency based on the number of processors it has access to). Since you mention that you are running on a university cluster, I suspect this may be a related case, and if so, the pre-release of the next version will hopefully resolve the issue.
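Because the concurrency level depends on the processor count, it can help to see how many CPUs the node lists versus how many the current shell may actually use, since on a shared HPC node these can differ. The following is only a rough sketch, assuming a Linux node with GNU coreutils; the variable names are illustrative and it does not reproduce how the client itself chooses its thread count.

```shell
# Count logical CPUs in two ways (assumes Linux and GNU coreutils).
# Every logical CPU the kernel lists for this node:
total_cpus=$(grep -c '^processor' /proc/cpuinfo)
# CPUs this process is allowed to run on (respects scheduler affinity,
# which a batch scheduler may restrict):
usable_cpus=$(nproc)

echo "listed in /proc/cpuinfo: ${total_cpus}"
echo "usable by this shell:    ${usable_cpus}"
```

If the first number is very large, that would be consistent with the bottleneck described above. Some client versions document a `max_threads` setting under `[transfer]` in `~/.synapseConfig` for limiting concurrency; treat that as an assumption and check the documentation for the installed version before relying on it.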
Thanks.

@jordank Could you please have a look?
Failure to download ~500 fastq files using 'synapse get'