Dear AD Knowledge Portal users,
I would really appreciate your guidance on the questions below.
My application for access to the RNA-Seq data from ROSMAP has been approved. However, I have not yet downloaded the data, as it is unclear to me whether the RNA-Seq and gene expression data available through Synapse should be treated as patient data and what security practices should be put in place on the server that stores the data. Similarly, it is unclear to me what restrictions apply to downstream processed data, e.g. results from differential expression or other analyses. I am aware that all data should be deleted after the project finishes.
Could you please provide guidance on how the data should be processed and whether these restrictions apply to all downstream processed data? Could you please provide examples of compliant ways of processing the data, e.g. whether a separate EC2 instance with restricted access would be considered compliant?
Related to this, I have seen that it is possible to apply for access to the AD Knowledge Portal Analytical Workspace. Could you please outline the fees for using this space and provide a rough billing guide? I am aware that a $100 credit is available through the trial; however, extensive computational analyses may become costly.
Thank you very much in advance.
Kind regards,
Silvia
Created by Silvia Hnatova silvia.hnatova

Great to hear it! Best of luck with your analyses -- reach out if you have any other questions.

Hi @abby.vanderlinden, thank you very much for the clarifications and for your very helpful response and apologies for my delayed response!
My application for access to the Analytical Workspace has been approved, and I have also confirmed with my institution whether our infrastructure would be considered compliant.
Thank you very much for your help.
Kind regards,
Silvia

Hi @silvia.hnatova, thank you for these very thoughtful questions!
While the ROSMAP RNA-Seq data does not meet the definition of PHI under HIPAA, any individual-level data in the AD Portal, such as clinical covariates, sequencing data, or gene counts, is considered sensitive and cannot be reshared. Genomic summary results that do _not_ contain individual-level data are not considered sensitive -- more information is available [here](https://www.genome.gov/about-nhgri/Policies-Guidance/Genomic-Data-Sharing/frequently-asked-questions/GSR-update). Genomic summary results like differential expression analyses can therefore be shared, as long as they contain no identifying information or individual-level data (no sample IDs, etc.).
Sage can't provide explicit guidance on secure data download and compute procedures, but you should talk to your institution about what compute infrastructure will be considered compliant. For example, institutions may have Business Agreements with cloud computing platforms like AWS or a dedicated secure computing cluster.
The Analytical Workspace is in beta and right now does not have a fee structure or a way to bill users. The $100 credit is a rough guideline, not an automatic cut-off -- although if a user incurs substantial charges, we'll contact them and likely limit access. For more specific info on the Analytical Workspace, you can contact anna.greenwood@sagebionetworks.org.
Thanks for your patience,
Abby

Related to my questions above, is all RNA-Seq data anonymised?