Hello all,
I am having trouble downloading 15GB+ files using the synapse client. The connection just closes before the file is finished downloading and then the synapse client complains about the md5 not matching. If I try the download about 20 times, it might finish downloading once but each time I only get about 8GB and have to start over each time.
This is not an issue with my internet connection. The connection is a wired 100Mb connection to a cluster, not wifi and I can download other large files from EC3 without issue.
Is there any way to resume downloading partial files using the synapse client?
Is there a way to use the Aspera client? (recommended by my IT when talking with them about this)
or rsync client?
Has anyone else run into this issue before and solved it some other way?
Thanks,
Nathan
Example files I am having trouble with:
syn4212762
syn4212767
syn4212781
syn4212787
syn4212797
syn4212861
syn4212873
syn4212888
syn4212921
syn4212944
syn4212968
Created by Nathan Bihlmeyer nbihlmeyer
I've opened an issue for our engineers to look into this further!
https://sagebionetworks.jira.com/browse/SYNPY-431
I'm doubting any OS-specific issue. As my post above showed, I was able to get the same result (albeit much less frequently than you are observing) of a download progressing to nearly 70% completion, but the client catching some error, expecting the end of the file, checking the md5, and restarting from the beginning. I think what @nbihlmeyer is (rightly) wondering is why didn't this restart from where it had progressed to instead of going from zero?
Edit: Of note is that my test that observed the same phenomenon was on an EC2 instance in us-east-1, the same region as Synapse storage.
```
$ synapse --debug get syn4212944
Downloading [############--------]60.50% 9.6GB/15.9GB (3.3MB/s) 583_120522.bam
Retrying download on Downloaded file /Users/me/Downloads/CATS/ROSMAP/583_120522.bam's md5 c2835474ac07f6c19b00cab33b38fa8c does not match expected MD5 of e9021748c1b1d31bbe2c7706b888a569 after progressing 0 bytes
Downloading [#############-------]63.08% 10.0GB/15.9GB (3.2MB/s) 583_120522.bam
Retrying download on Downloaded file /Users/me/Downloads/CATS/ROSMAP/583_120522.bam's md5 a3458626724bc1948e7874356fbc9101 does not match expected MD5 of e9021748c1b1d31bbe2c7706b888a569 after progressing 0 bytes
Downloading [#################---]84.60% 13.5GB/15.9GB (2.4MB/s) 583_120522.bam
Retrying download on Downloaded file /Users/me/Downloads/CATS/ROSMAP/583_120522.bam's md5 03b488213bc6a1d19320d0ce1ff746c6 does not match expected MD5 of e9021748c1b1d31bbe2c7706b888a569 after progressing 0 bytes
Downloading [#############-------]65.40% 10.4GB/15.9GB (1.7MB/s) 583_120522.bam
```
So the synapse client didn't work well on my cluster (see first two posts from me above). However, the synapse client does seem to work as expected on my Macbook (see my third post above). Either way the downloading of large files issue remains in both scenarios.
The code block above is a direct copy paste from my terminal where I was trying to download a file while at work to my Macbook. As you can see, the synapse client does retry on its own, over and over again forever (or at least until I want to go home).
I have been speaking with my cluster admins this whole time. They have not made any recommendations about trying the download from a different part of the cluster. Their recommendation (posted above) was to use the Aspera client (I know, they're being soooo helpful).
(Unrelated: On a previous cluster I worked on, they did have a download queue for long downloads. It was called the rnet queue, not to be confused with the download queue which was not for downloading from the internet. HA! What I am saying is I know what you're talking about.)
Thanks,
Nathan
@nbihlmeyer, I didn't mean that you would cancel the transfers. It sounded like the client was never restarting downloads for you, and I just wanted to make sure that was isolated to the case where the client thought the file was finished.
Best,
Larsson
P.S. Have you checked with your cluster admins whether there is a preferred subnet or node that should be used for high-bandwidth transfers? Usually it is something connected to Internet2 or some other backbone. From experience with other institutions, there usually is one, and the transfer rates can sometimes be multiple times higher. (The default TCP settings for Linux, for example, are usually horrible for large transfers over long distances.)
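To put rough numbers on the TCP remark above: a single connection can only keep one window of data in flight per round trip, so a small window caps throughput on a long path no matter how fast the link is. A minimal sketch of that arithmetic follows. The 80 ms round-trip time and 64 KiB window are hypothetical illustration values, not measurements from this thread.

```python
def bandwidth_delay_product_bytes(bandwidth_mbit_per_s, round_trip_ms):
    """Bytes that must be in flight to keep a link of this speed full."""
    bits_in_flight = bandwidth_mbit_per_s * 1e6 * (round_trip_ms / 1000.0)
    return bits_in_flight / 8.0


def window_limited_mbyte_per_s(window_bytes, round_trip_ms):
    """Upper bound on throughput when at most one window is sent per round trip."""
    return window_bytes / (round_trip_ms / 1000.0) / 1e6


# Hypothetical example: 100 Mb/s link, 80 ms round trip, 64 KiB window.
needed = bandwidth_delay_product_bytes(100, 80)    # about 1,000,000 bytes
ceiling = window_limited_mbyte_per_s(65536, 80)    # about 0.82 MB/s
```

If the window is much smaller than the bandwidth-delay product, the link cannot be filled, which is one possible reason a tuned transfer node would be faster.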
Hi Larsson,
Ctrl-C during a download and then rerunning the command to resume the download does work on my Macbook, however that does not solve the problem. As shown in my previous post, the synapse client thinks it is done downloading after getting 9.6GB of the 15.9GB for some reason. I do not know why. It just happens. Maybe the corporate firewall is adding some kind of TCP equivalent of an end of file into the connection for some reason and that is why the synapse client thinks it is done. The point is the file transfer never completes.
Unless you are telling me to manually Ctrl-C halfway through all these downloads and resume to see if manually breaking up the download into parts will make it work... That seems like a lot of manual work for hundreds of files. My current workaround of automatically downloading files at home at night, around 100GB at a time, and then automatically transferring those files to the cluster in the background at work during the day at least has fewer manual steps. Thanks for continuing to look into this for me. For the time being I am almost done downloading the ROSMAP RNAseq data I need, so this issue is not a top priority for me until I want to download the next RNAseq dataset from synapse in the future. I thank all of you who have helped me find a workaround for this issue.
Thanks,
Nathan
Afterthoughts: I guess it wouldn't be hard to write a bash script to kill and restart a synapse client instance every hour. I'll try that when I go to download the next RNAseq dataset in a few months.
Cheers
Nathan:
Actually, your feature request describes the way it is supposed to work. You can try hitting Ctrl-C while downloading a file and then restart the download; it should pick up the partial download and continue from there. If this is not the case, please respond and I will explore and update.
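For anyone who wants to automate the interrupt-and-rerun approach discussed here, a rough sketch of a wrapper is below. It is written in Python rather than bash, and `run_with_restarts` is a hypothetical helper name, not part of the synapse client. It assumes that rerunning the same command picks up the partial file as described above, and that the command exits with status 0 on success; neither assumption has been verified for every client version.

```python
import subprocess


def run_with_restarts(command, timeout_seconds, max_attempts):
    """Run a command, killing and rerunning it if it runs too long or fails.

    Returns the attempt number that succeeded, or None if every attempt
    timed out or exited with a non-zero status.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            # subprocess.run kills the child process when the timeout expires.
            completed = subprocess.run(command, timeout=timeout_seconds)
        except subprocess.TimeoutExpired:
            continue  # took too long: start the command again
        if completed.returncode == 0:
            return attempt
    return None
```

For the hourly restart idea, a call might look like `run_with_restarts(["synapse", "get", "syn4212944"], timeout_seconds=3600, max_attempts=24)`.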
Sorry for the delay. I got odd results from my tests and want to confirm before posting.
I tried downloading files to my Macbook instead of directly to the cluster (details about my Macbook are in my previous post).
On my home connection everything worked as expected. Even the download retries worked, which haven't ever worked on the cluster. My home connection is from Comcast and is 100Mb/s down, 5Mb/s up (Speedtest shows ~105/6). My Macbook was on my home wifi, which I believe is why the downloads were slow at ~5.5MB/s (~44Mb/s).
At work, my Macbook is connected via ethernet on a 100Mb/s down, 100Mb/s up connection (Speedtest shows ~93/93). Again, download retries worked on my Macbook. However the issue is the download would never complete and endless retries would ensue. It is also odd that the download was slower at less than 3.5MB/s (~28Mb/s).
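Since the figures above mix megabits (Mb) and megabytes (MB), here is a small sketch of the conversion plus a rough transfer-time estimate. It assumes decimal units (1 GB = 1000 MB), which may not be exactly how the client's progress bar counts.

```python
def megabits_to_megabytes(mbit_per_s):
    """Convert a rate in Mb/s to MB/s (8 bits per byte)."""
    return mbit_per_s / 8.0


def megabytes_to_megabits(mbyte_per_s):
    """Convert a rate in MB/s to Mb/s."""
    return mbyte_per_s * 8.0


def hours_to_download(size_gb, rate_mbyte_per_s):
    """Estimate transfer time in hours, assuming 1 GB = 1000 MB."""
    return size_gb * 1000.0 / rate_mbyte_per_s / 3600.0


line_rate = megabits_to_megabytes(100)    # 12.5 MB/s ceiling on a 100 Mb/s link
home_rate = megabytes_to_megabits(5.5)    # 44.0 Mb/s, as observed at home
work_time = hours_to_download(15.9, 3.3)  # roughly 1.3 hours for one 15.9GB file
```

So a single file at the work speeds has to keep one connection alive for more than an hour without interruption.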
At this point the issue seems to be with the Partners Corporate Network or Firewall, and I'll just download the few large files at home and ferry them to work (transfer from my Macbook to the cluster via samba works fine).
If I can make a feature request, it would be nice if the synapse client could recognize it did not download the whole file and just try downloading the last little bit before checking the md5 and without restarting from the beginning until after the md5 fails. <= this is how the rsync protocol works.
Right now the synapse client requires a file to be downloaded in one go, and that appears to be impossible for me at work.
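To illustrate the kind of resume behaviour requested above, here is a generic sketch using an HTTP `Range` header with Python's standard library. This is not how the synapse client is implemented and it is not the rsync protocol; `build_resume_request` and `resume_download` are hypothetical names, and the sketch assumes a plain download URL whose server supports range requests.

```python
import os
import urllib.request


def build_resume_request(url, partial_path):
    """Build a request for only the bytes missing from partial_path."""
    offset = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    request = urllib.request.Request(url)
    if offset > 0:
        # Ask the server to start at the first byte we do not have yet.
        request.add_header("Range", "bytes=%d-" % offset)
    return request, offset


def resume_download(url, partial_path, chunk_size=1024 * 1024):
    """Append the missing tail to partial_path and return the new file size.

    If the partial file is already complete, a server may answer 416 and
    urlopen will raise HTTPError; that case is not handled in this sketch.
    """
    request, offset = build_resume_request(url, partial_path)
    with urllib.request.urlopen(request) as response:
        # 206 means the range was honoured; 200 means the whole file is being
        # sent again, so the partial file has to be overwritten instead.
        mode = "ab" if (offset > 0 and response.status == 206) else "wb"
        with open(partial_path, mode) as out:
            while True:
                chunk = response.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)
    return os.path.getsize(partial_path)
```

The md5 check would then only need to run once the local size matches the expected size, rather than after every dropped connection.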
Thanks,
Nathan
```
########### From Home ###########
$ synapse get syn4212781
Downloading [####################]100.00% 17.2GB/17.2GB (5.6MB/s) 398_120503.bam Done...
Downloaded file: 398_120503.bam
Creating /Users/me/Downloads/CATS/ROSMAP/398_120503.bam
$ synapse --debug get syn4212787
Downloading [####################]100.00% 13.9GB/13.9GB (5.5MB/s) 406_120503.bam Done...
Downloaded file: 406_120503.bam
Creating /Users/me/Downloads/CATS/ROSMAP/406_120503.bam
$ synapse --debug get syn4212797
Downloading [#########-----------]47.43% 8.6GB/18.1GB (5.0MB/s) 416_120503.bam
Retrying download on Downloaded file /Users/me/Downloads/CATS/ROSMAP/416_120503.bam's md5 74520bd274a958c5a5f727501ab2b2d9 does not match expected MD5 of c18f45a9a633e48a0e763560737a1124 after progressing 0 bytes
Downloading [####################]100.00% 18.1GB/18.1GB (5.6MB/s) 416_120503.bam Done...
Downloaded file: 416_120503.bam
Creating /Users/me/Downloads/CATS/ROSMAP/416_120503.bam
$ synapse --debug get syn4212861
Downloading [####################]100.00% 15.7GB/15.7GB (5.6MB/s) 492_120515.bam Done...
Downloaded file: 492_120515.bam
Creating /Users/me/Downloads/CATS/ROSMAP/492_120515.bam
$ synapse --debug get syn4212873
Downloading [####################]100.00% 10.7GB/10.7GB (5.5MB/s) 507_120515.bam Done...
Downloaded file: 507_120515.bam
Creating /Users/me/Downloads/CATS/ROSMAP/507_120515.bam
$ synapse --debug get syn4212888
Downloading [####################]100.00% 16.5GB/16.5GB (5.4MB/s) 525_120515.bam Done...
Downloaded file: 525_120515.bam
Creating /Users/me/Downloads/CATS/ROSMAP/525_120515.bam
$ synapse --debug get syn4212921
Downloading [####################]100.00% 16.0GB/16.0GB (5.5MB/s) 560_120517.bam Done...
Downloaded file: 560_120517.bam
Creating /Users/me/Downloads/CATS/ROSMAP/560_120517.bam
########### From Work ###########
$ synapse --debug get syn4212944
Downloading [############--------]60.50% 9.6GB/15.9GB (3.3MB/s) 583_120522.bam
Retrying download on Downloaded file /Users/me/Downloads/CATS/ROSMAP/583_120522.bam's md5 c2835474ac07f6c19b00cab33b38fa8c does not match expected MD5 of e9021748c1b1d31bbe2c7706b888a569 after progressing 0 bytes
Downloading [#############-------]63.08% 10.0GB/15.9GB (3.2MB/s) 583_120522.bam
Retrying download on Downloaded file /Users/me/Downloads/CATS/ROSMAP/583_120522.bam's md5 a3458626724bc1948e7874356fbc9101 does not match expected MD5 of e9021748c1b1d31bbe2c7706b888a569 after progressing 0 bytes
Downloading [#################---]84.60% 13.5GB/15.9GB (2.4MB/s) 583_120522.bam
Retrying download on Downloaded file /Users/me/Downloads/CATS/ROSMAP/583_120522.bam's md5 03b488213bc6a1d19320d0ce1ff746c6 does not match expected MD5 of e9021748c1b1d31bbe2c7706b888a569 after progressing 0 bytes
Downloading [#############-------]65.40% 10.4GB/15.9GB (1.7MB/s) 583_120522.bam
```
Got all files OK with no retries - getting about 12Mbps on my home wired connection. If you get any more logs from debug with errors, please post and I'll take a look.
I tested on my home connection over wireless. I have some files that I have direct access to through Amazon S3 as well as Synapse. I downloaded one with each - I got an 800MB file at 4.7 MBps directly from S3 using the AWS CLI. I got the same file through the Synapse client at about 3.5 MBps. I also get about 3.5 MBps if I get it from AWS over HTTPS directly.
I started downloading the files listed above on a wired connection through the same outbound connection. Getting about 8-9 MBps there - will let them go overnight and check the logs in the morning to see if anything happened similar to your experience.
Stay safe in the snow as well!
Sure! I'll test on a more modest connection as well. I'm still unsure how or why your client isn't retrying automatically. Even in the worst case, it should start over on its own from the beginning. When you run with `--debug`, can you post the full traceback of any error that is reported? Thanks!
Thanks for looking into this for me.
It is a rare day that I see an average download of 4.0MB/s. 3.1MB/s is more common even though I would expect around 11MB/s on my 100Mbit connection (as seen on other non-synapse EC3 downloads). So maybe a test not on Amazon servers is needed. I wonder if the increased time needed at the slower speed causes an issue.
I have never seen the download retry without me explicitly rerunning the command, both with "synapse get -r syn3388564" or "synapse get syn4212762" (recursive or individual file). After the md5 mismatch, the synapse client always exits.
I will rerun with the --debug flag and see if I can get Python 3.4 on the server and use that. I will reply again tomorrow, assuming the snow in Boston doesn't knock out the power here.
I have seen similar results on my Macbook Pro when using the ethernet at work, though the cluster I use and my office are on different campuses.
macOS Sierra 10.12.3
$ python -V
Python 2.7.13
$ synapse --version
Synapse Client 1.6.1
Thanks,
Nathan
I tested on an Amazon EC2 (RHEL 7) with Python 3.4 (that was fastest to install without building from source). I downloaded 3 of those files (including one at 18GB) at an average of 60MB/s with no errors. However, one file (syn4212873) gave me trouble:
```bash
[ec2-user@ip-172-31-43-194 ~]$ synapse --debug get syn4212873
Downloading [##############------]69.92% 7.5GB/10.7GB (33.1MB/s) 507_120515.bam
Could not find a config file (/home/ec2-user/.synapseConfig). Using defaults.
Retrying download on Downloaded file /home/ec2-user/507_120515.bam's md5 aa4cb90239d86ef1815061937105c878 does not match expected MD5 of aaed98c35474c2c2356e15a9bb53da55 after progressing 0 bytes
Downloading [####################]100.00% 10.7GB/10.7GB (55.7MB/s) 507_120515.bam Done...
Downloaded file: 507_120515.bam
Creating /home/ec2-user/507_120515.bam
```
Though I'm a bit surprised about the error saying it had progressed 0 bytes when it hit an md5 mismatch (md5's are computed on chunks of downloads before being assembled into the final file). Also, the MD5 shown (aaed98c35474c2c2356e15a9bb53da55) is the final MD5. So, looks like what happened is the client determined that it was done downloading, exited, but checked and the final MD5 didn't match. However, the client recovered by starting the download over again (you can see error message was prefixed with `Retrying download`), and did not fail (I did not intervene at all). This is what should happen for you (though there may still be a bug somewhere, or at a minimum it's not reporting the right status and error).
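If you want to check a downloaded file against the expected MD5 yourself, independently of the client, a minimal sketch with Python's standard `hashlib` is below. `file_md5` and `matches_expected` are hypothetical helper names; the file is read in chunks so a 15GB+ BAM does not need to fit in memory.

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it chunk by chunk."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_md5):
    """Compare a file's MD5 with an expected hex digest, ignoring case."""
    return file_md5(path) == expected_md5.strip().lower()
```

A truncated file will produce a different digest on every attempt that stops at a different point, which is consistent with the changing md5 values in the retry messages.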
Could it be because of Python 3.5.1? I'm surprised that you aren't getting retries. This version of the client does support partial downloads and should retry. I'll check back after I test it out!
I'll test out some of those files and let you know.
Yeah, that's the most recent version.
@larssono or @kdaily any thoughts on this?
I am using the python client directly from bash
RHEL 6.5
Python 3.5.1
$ synapse --version
Synapse Client 1.6.1
I don't know where to look to find the current version...
Hi Nathan,
Are you using the R or the python client? Which version are you using and do you have the most recent version?
Ben