Hi,
I have a few questions about the available WGS files:
1) Is there a source I could use to find out which WGS_ids (i.e., samples) were from blood and which were from brain?
2) Is there a file reporting all variants that could **_not_** be classified as germline?
3) Is there a way to link individual variant calls to individual samples in the available files?
Many thanks!
Created by mpetljak
Hi @mpetljak - you can find more information about the ROSMAP WGS in this Scientific Data paper: https://www.nature.com/articles/sdata2018142
Hi Meagan,
1. Thanks. It is crucial information.
2. a) Were all the variants in the VCF files called against a reference human genome?
b) The .recalibrated_variants.annotated.clinical.txt files are described as 'All low frequency HIGH/MODERATE annotated variants with possible clinical impact (from ClinVar) in text file format'. What is meant by 'low frequency', i.e., what is the cut-off, and relative to what?
3. I am still not able to link each row (i.e., each 'mutation') in the VCF files to specific samples. The file you refer to has two columns, WGS_id and projid, neither of which is contained in the VCF files.
Thanks,
Mia
Hi Mia,
I have a partial update for you:
1. I have an email out to our collaborators about this but haven't heard back yet.
2. As far as we know, there is no file reporting all variants that could not be classified as germline.
3. The sample identifiers in the VCF files link back to the WGS_id column in AMP-AD_rosmap_WGS_id_key.csv (syn11384589).
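For anyone who wants to try this link in practice, here is a minimal Python sketch. It assumes the VCFs are standard multi-sample files, where the sample identifiers are the columns after FORMAT on the `#CHROM` header line and each row carries a `GT` genotype per sample. The sample IDs, the variant row, and the projid values below are made up for illustration, and the function names are hypothetical; only the `WGS_id` and `projid` column names come from the key file described in this thread.

```python
import csv
import io


def vcf_sample_ids(lines):
    """Return the sample IDs listed after FORMAT on the #CHROM header line."""
    for line in lines:
        if line.startswith("#CHROM"):
            return line.rstrip("\n").split("\t")[9:]
    raise ValueError("no #CHROM header line found")


def carriers(record_line, sample_ids):
    """Return the samples carrying at least one non-reference allele in this row."""
    fields = record_line.rstrip("\n").split("\t")
    gt_index = fields[8].split(":").index("GT")  # position of GT within FORMAT
    found = []
    for sample, call in zip(sample_ids, fields[9:]):
        genotype = call.split(":")[gt_index]
        alleles = genotype.replace("|", "/").split("/")
        if any(allele not in ("0", ".") for allele in alleles):
            found.append(sample)
    return found


def load_key(csv_text):
    """Map WGS_id to projid from the key file contents."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["WGS_id"]: row["projid"] for row in reader}


# Made-up example data (not real ROSMAP identifiers or variants).
vcf_text = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSM-AAA\tSM-BBB\n"
    "1\t12345\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/1:30\t0/0:28\n"
)
key_text = "WGS_id,projid\nSM-AAA,0001\nSM-BBB,0002\n"

lines = vcf_text.splitlines()
samples = vcf_sample_ids(lines)
key = load_key(key_text)
records = [line for line in lines if not line.startswith("#")]

# For each variant row, list (WGS_id, projid) for the samples that carry it.
linked = [[(s, key.get(s)) for s in carriers(rec, samples)] for rec in records]
print(linked)
```

In a real file the same steps would read the gzipped VCF line by line rather than from a string, and a dedicated VCF library would be more robust for multi-allelic sites and missing calls.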
Sage has a site closure next week (8/5-8/9), but I'll keep working on an answer to your first question and be back in touch with you the week of August 12 with an update. Please let me know if you have any additional questions between now and then.
Thanks!
Meagan
Hi Mia,
I'm looking into your questions and will get back to you with an update next week.
Thanks,
Meagan
Hi,
A kind reminder: could you please let me know about the above, or connect me with somebody who could help?
Thanks
Mia
Sorry, all three questions above are for ROSMAP, please! Thanks!
There is WGS data from 3 different studies: MayoRNAseq, MSBB, and ROSMAP. Are you looking for information from all 3, or a specific one?