Hi. I am using the ROS.MAP methylation array data. It is described as having 415,848 cg probes and 708 samples; however, the data provided on Synapse contain 420,132 cg probes and 740 samples in total. Do you have any idea why they differ? Any help would be much appreciated.
Synapse ID: syn3168763
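For anyone who wants to reproduce the probe and sample counts quoted above, here is a minimal sketch that counts the rows and columns of a gzipped TSV matrix. It assumes one header row and one leading probe-ID column, which may not match the actual file layout; `matrix_shape` and `compare_ids` are hypothetical helper names, not part of any Synapse tooling.

```python
import csv
import gzip


def matrix_shape(path):
    """Return (n_probes, n_samples) for a gzipped, tab-separated matrix.

    Assumed layout: one header row, and a first column holding probe IDs.
    """
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)
        n_samples = len(header) - 1  # drop the assumed ID column
        n_probes = sum(1 for row in reader if row)  # skip blank lines
    return n_probes, n_samples


def compare_ids(observed, expected):
    """Return (only_in_observed, only_in_expected) as sorted lists."""
    observed, expected = set(observed), set(expected)
    return sorted(observed - expected), sorted(expected - observed)
```

Calling `matrix_shape("ROSMAP_arrayMethylation_imputed.tsv.gz")` on a local copy would give the counts to compare with the published ones. If a list of the probes or samples used in the paper is available, `compare_ids` shows which IDs account for the difference.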
Created by Peipei Li lipeipei
Hi,
I see a similar trend with PCA on "ROSMAP_arrayMethylation_imputed.tsv.gz". Can you please provide some details on how the samples were corrected for age, gender and experimental batch, and point me to the technical documentation?
Thanks
Sharvari
Hello,
Was there a resolution to the above question?
I would also like to confirm which dataset contains the normalized methylation values. The above-mentioned paper says that the batches were controlled for, but a quick PCA of the file "ROSMAP_arrayMethylation_imputed.tsv.gz" shows that the batch effect is still present. Maybe I am using the wrong file? Which file is the normalized methylation matrix, please?
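For reference, a quick PCA check of this kind could look roughly like the sketch below. It is only an illustration under stated assumptions: the matrix is already loaded as a probes x samples NumPy array with no missing values, a batch label is available for every sample, and `pc_scores` and `batch_r2` are hypothetical helper names. It is not the pipeline used in the paper.

```python
import numpy as np


def pc_scores(beta, n_components=5):
    """Return samples x n_components principal component scores.

    beta: probes x samples array without missing values (assumed).
    """
    x = np.asarray(beta, dtype=float).T  # samples x probes
    x = x - x.mean(axis=0)               # center each probe
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def batch_r2(scores, batch):
    """Fraction of each PC's variance explained by batch (one-way ANOVA R^2)."""
    batch = np.asarray(batch)
    grand_mean = scores.mean(axis=0)
    total = ((scores - grand_mean) ** 2).sum(axis=0)
    between = np.zeros(scores.shape[1])
    for label in np.unique(batch):
        group = scores[batch == label]
        between += len(group) * (group.mean(axis=0) - grand_mean) ** 2
    return between / total
```

An R^2 close to 1 for one of the leading components suggests that batch still drives that component, whereas values near 0 suggest it does not.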
Thank you.
Ashley
Actually, I followed the paper by De Jager et al. (2014), but even after applying every quality control step I cannot get matching cg numbers. Did the data uploaded here go through different quality control steps from those in that paper?
Thank you!
Peipei
Hi Peipei,
My guess is that there was some quality control filtering on sites and samples. @leiyu is this correct?
Thanks,
Ben