Hello,
I am interested in using the snATAC-Seq data (syn52146321). I could not find a "specimen_suffix" column like there had been in the multiome files to match the files to the sample metadata. How would I identify the fragment files' samples of origin and merge the information with patient/sample metadata for the snATAC-Seq data?
Thank you in advance for your help!
Best,
Zane H
Created by Zane Hamdan zhamdan1252
Hello! Each fragments file is annotated with fields such as `individualID` and `specimenID`, which should let you match it to the metadata. There are a few ways to bulk download this information, but I find this the easiest:
I went to the AD Knowledge Portal's data browser and filtered to all snATACseq data from SEA-AD, gzip files only. [Link to pre-filled query](https://adknowledgeportal.synapse.org/Explore/Data?QueryWrapper0=%7B%22sql%22%3A%22SELECT+*+FROM+syn11346063.69%22%2C%22limit%22%3A25%2C%22selectedFacets%22%3A%5B%7B%22concreteType%22%3A%22org.sagebionetworks.repo.model.table.FacetColumnValuesRequest%22%2C%22columnName%22%3A%22study%22%2C%22facetValues%22%3A%5B%22SEA-AD%22%5D%7D%2C%7B%22concreteType%22%3A%22org.sagebionetworks.repo.model.table.FacetColumnValuesRequest%22%2C%22columnName%22%3A%22assay%22%2C%22facetValues%22%3A%5B%22snATACSeq%22%5D%7D%2C%7B%22concreteType%22%3A%22org.sagebionetworks.repo.model.table.FacetColumnValuesRequest%22%2C%22columnName%22%3A%22fileFormat%22%2C%22facetValues%22%3A%5B%22gzip%22%5D%7D%5D%7D)
Then you can click the download button at the top right and export the table as a .csv, or follow the instructions it gives for doing the same thing programmatically. The downloaded file tells you which individual IDs match which file names / Synapse IDs.
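For the merge itself, here is a minimal sketch using only the Python standard library. It assumes the exported table and the metadata CSV share an `individualID` column; the file names in the usage comment are placeholders, and the exact column names in your export should be checked before relying on it.

```python
import csv

def read_rows(path):
    """Read a CSV file into a list of dicts, one per row."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))

def merge_on_key(file_rows, metadata_rows, key="individualID"):
    """Left-join file rows to metadata rows on a shared ID column.

    Files with no matching metadata row are kept with only their own
    columns, so missing matches are easy to spot afterwards.
    """
    # Assumes one metadata row per key; a later duplicate overwrites an earlier one.
    lookup = {row[key]: row for row in metadata_rows}
    merged = []
    for row in file_rows:
        extra = lookup.get(row.get(key), {})
        # File columns win on name clashes so the Synapse ID is never overwritten.
        merged.append({**extra, **row})
    return merged

# Hypothetical usage; both file names are placeholders:
# files = read_rows("snATACseq_export.csv")
# donors = read_rows("individual_metadata.csv")
# merged = merge_on_key(files, donors, key="individualID")
```

The same function can be called a second time with `key="specimenID"` if you also want specimen-level metadata attached.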
You could also use `syn.get()` (Python) or `synGet()` (R) on individual files, and the returned object will also have this same information for that file only.
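A rough sketch of the per-file route in Python follows. It assumes `syn` is a logged-in `synapseclient` client and that the annotations are exposed on the returned entity's `annotations` attribute; `collect_sample_ids` is a name made up for this example, not part of the client.

```python
def collect_sample_ids(syn, synapse_ids, keys=("individualID", "specimenID")):
    """Fetch the chosen annotations for each file without downloading the file."""
    rows = []
    for syn_id in synapse_ids:
        entity = syn.get(syn_id, downloadFile=False)  # metadata only
        row = {"id": syn_id}
        for key in keys:
            value = entity.annotations.get(key)
            # Annotation values often come back as lists; unwrap or join them.
            if isinstance(value, (list, tuple)):
                value = value[0] if len(value) == 1 else ";".join(map(str, value))
            row[key] = value
        rows.append(row)
    return rows

# Usage, assuming synapseclient is installed and credentials are configured:
# import synapseclient
# syn = synapseclient.login()
# rows = collect_sample_ids(syn, ["synNNNNNNNN"])
```

This makes one request per file, so for the full set of fragment files the exported table is likely the quicker option.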
I hope that helps! Let me know if you have any more questions,
Jaclyn Beck
**note**: I could be missing a much more efficient way of doing this, so if someone could correct me, please do!
In the sample manifest data for the fragment files, there are fields for `specimenID` and `individualID`. These can be matched against the metadata csv files available in directory [`syn28256462`](https://www.synapse.org/Synapse:syn28256462).
A combined sample manifest is generated if you download an entire synapse directory with `synapse get`. Unfortunately all of the bam files are included as well, so that would make for a very hefty download.
If you only want to download the fragments files and still get the sample manifest, I've had success with `synapse show` using the synapse command-line tool. The output is a bit less machine-readable than the csv sample manifest, but it can be managed with some bash scripting.
You can confirm this by running `synapse show synNNNNNNNN` for one of the fragments files, then comparing the output to the metadata csv files in [`syn28256462`](https://www.synapse.org/Synapse:syn28256462).
To get the data for all the files, you could create a single-column file with a list of synapse IDs of the files you want, and execute a loop:
```
# Save the `synapse show` output for each ID in syn_id_list.txt (one ID per line)
while read -r syn; do
  synapse show "${syn}" > "${syn}_metadata.txt"
done < syn_id_list.txt
```
and then follow up by parsing each of the outputs into a single csv.
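The parsing step could look something like the sketch below, written in Python rather than bash. It is only a starting point: it assumes each annotation appears on its own line as `key: value` or `key = value` (possibly wrapped in list brackets or quotes), which you should check against a real `synapse show` output first. The function names and the output file name are made up for this example.

```python
import csv
import re
from pathlib import Path

KEYS = ("individualID", "specimenID")
# Assumed line shape: "<key>: <value>" or "<key> = <value>".
LINE = re.compile(r"^\s*(\w+)\s*[:=]\s*(.*?)\s*$")

def parse_show_output(text, keys=KEYS):
    """Pull the requested annotation keys out of one `synapse show` dump."""
    found = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match and match.group(1) in keys:
            # Drop list brackets and quotes around the value, if present.
            found[match.group(1)] = match.group(2).strip("[]'\" ")
    return found

def combine(directory, out_csv, keys=KEYS):
    """Write one CSV row per <synID>_metadata.txt file found in `directory`."""
    with open(out_csv, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=("id",) + tuple(keys))
        writer.writeheader()
        for path in sorted(Path(directory).glob("*_metadata.txt")):
            row = {"id": path.name[: -len("_metadata.txt")]}
            row.update(parse_show_output(path.read_text(), keys))
            writer.writerow(row)

# Hypothetical usage, run in the directory holding the loop's output files:
# combine(".", "fragment_file_ids.csv")
```

Keys that are missing from a file are written as empty cells, which makes files with unexpected output easy to find.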