Dear Synapse team,
I would like to download / export genomics data from the [AMP-AD](https://www.synapse.org/#!Synapse:syn2580853/wiki/409853) project for analysis on AWS, i.e. outside of Synapse. (I already have the required permission to do so.)
I found [this page](http://docs.synapse.org/articles/custom_storage_location.html) in the documentation that describes how to set up an AWS S3 bucket.
It refers to the _s3FileCopy API_ for retrieving larger amounts of data, but I haven't been able to find more information about this tool.
It would be great if somebody could help me answer the following questions:
* What is the recommended way of transferring data out of a Synapse project into a private AWS S3 bucket?
* How can I ensure that the data remains encrypted during the transfer / at rest in the AWS S3 destination bucket?
Thanks a lot for any pointers,
Thomas
Created by Thomas Sandmann sandmannt

Hi @pranavm716,
No, I have been using a two-step strategy: 1. copy the files from Synapse to an EC2 instance (with the Synapse CLI) and then 2. copy them from there into an S3 bucket.

Just checking in to see if you were able to find a way to sync files directly to S3. I'd appreciate it if you let me know!

Hi Larsson,
thanks a lot for your reply. I wasn't hoping to increase the speed of data access.
Instead, I would like to analyse the data using [DNANexus](https://platform.dnanexus.com/login), a commercial cloud provider. The pipelines defined within DNANexus can retrieve files from different sources, including URLs and S3 buckets, as long as the right credentials are provided. For my own S3 buckets, that's not a problem. But I haven't figured out how to teach DNANexus to retrieve files using the synapse CLI. I can request support from DNANexus for that, but in the meantime I thought I might as well copy the data into my own S3 bucket - and shuttle it into DNANexus from there. Obviously, a direct route would be preferred.
So far, I have tried to create a new project in my synapse account and assign a private S3 bucket as the storage location (as outlined in the [Synapse documentation](http://docs.synapse.org/articles/custom_storage_location.html)). When I add files to this project, they appear in my S3 bucket, as expected. Yet, when I try to link the AMP-AD dataset, e.g. by pointing to the Synapse id [syn4164376](https://www.synapse.org/#!Synapse:syn4164376) via the `Tools -> Save Link To This Page` menu item, I get the following error: `You do not have CREATE permission for the requested entity.`
This might not be the right way of going about it... I am new to both Synapse and DNANexus and would be happy to learn about ways to use the latter for processing data within Synapse.
Many thanks for any feedback,
Thomas
P.S.: Thank you for pointing out that I can use an EC2 instance as an intermediate, I will try that route next.

Hi Thomas:
The S3FileCopy command, I believe, refers to the AWS CLI command for copying files between S3 buckets. It wouldn't be relevant in your case, as it is suggested for setting up a requester-pays bucket where you want to move your files between existing S3 buckets. In general, I am curious: why do you want to replicate the data into another S3 bucket? Synapse already stores the AMP-AD data in S3 in the US East region. I can't see any speed advantage in replicating the data unless you are wedded to working in a region other than US East.
Also, are you interested in still using the Synapse clients for interacting with the data after you have moved it into a new S3 bucket?
As for your specific questions:
* I would use the [Python Client](http://docs.synapse.org/python/) or the [Command line](http://docs.synapse.org/python/CommandLineClient.html) client to download the data onto an EC2 machine and then upload it to your S3 bucket from there. These clients use settings very similar to the AWS CLI for interacting with S3 and should achieve throughput similar to what you would get with your own bucket.
* The clients already encrypt data in transit. You should also enable encryption at rest on your S3 bucket, following the instructions from AWS.
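
A minimal sketch of the two-step route described above, assuming the `synapseclient`, `synapseutils` and `boto3` Python packages are installed on the EC2 machine and that Synapse and AWS credentials are already configured there. The function names, the bucket name and the local path are hypothetical placeholders, not part of any Synapse tooling. The upload asks S3 for server-side encryption on each object, and `boto3` talks to S3 over HTTPS by default.

```python
from pathlib import Path


def encryption_args(kms_key_id=None):
    """Build the ExtraArgs that ask S3 to encrypt an uploaded object at rest."""
    if kms_key_id:
        # Use a customer-managed KMS key (the key id is supplied by the caller).
        return {"ServerSideEncryption": "aws:kms", "SSEKMSKeyId": kms_key_id}
    # Otherwise fall back to S3-managed keys (SSE-S3).
    return {"ServerSideEncryption": "AES256"}


def upload_tree(s3_client, local_dir, bucket, prefix="", kms_key_id=None):
    """Upload every file under local_dir to the bucket and return the keys used."""
    root = Path(local_dir)
    uploaded = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(root).as_posix()
        key = "/".join(part for part in (prefix.strip("/"), relative) if part)
        s3_client.upload_file(
            str(path), bucket, key, ExtraArgs=encryption_args(kms_key_id)
        )
        uploaded.append(key)
    return uploaded


def enable_default_encryption(s3_client, bucket):
    """Make SSE-S3 the bucket default so unencrypted uploads are encrypted too."""
    s3_client.put_bucket_encryption(
        Bucket=bucket,
        ServerSideEncryptionConfiguration={
            "Rules": [
                {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
            ]
        },
    )


def export_to_s3(synapse_id, local_dir, bucket, prefix=""):
    """Step 1: download from Synapse. Step 2: upload the files to S3."""
    # Imported here so the helpers above work without these packages installed.
    import boto3
    import synapseclient
    import synapseutils

    syn = synapseclient.Synapse()
    syn.login()  # assumption: credentials come from the local Synapse config
    # Recursively download the folder or project; this also writes manifest
    # files into local_dir, which will be uploaded along with the data.
    synapseutils.syncFromSynapse(syn, synapse_id, path=local_dir)

    s3 = boto3.client("s3")
    enable_default_encryption(s3, bucket)
    return upload_tree(s3, local_dir, bucket, prefix=prefix)
```

For example, `export_to_s3("syn4164376", "/data/ampad", "my-private-bucket", prefix="ampad")` would download that entity to `/data/ampad` and copy the files under `s3://my-private-bucket/ampad/`, where the path and bucket name are made up for illustration. The local disk needs enough room for the whole download, since nothing is streamed directly between Synapse and the destination bucket.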