Synapse ID: syn3157325.
Why are there almost no chrX SNPs in the 582 samples whose names start with "KronosII"?
According to the paper [1], "subjects are genotyped on the Affymetrix Genechip 6.0 platform at either the Broad Institute's Center for Genotyping (n=1,204) or the Translational Genomics Research Institute (n=674). These two sets of data underwent the same quality control (QC) analysis in parallel, and genotypes were pooled."
Although the paper does not describe how to tell the two data sets apart (for example, by sample name), we can still infer them from the SNP missing rate and the sample names.
${imageLink?synapseId=syn7537499&align=None&responsive=true}
So my guess is that the samples whose names start with "KronosII" are from the Translational Genomics Research Institute, and the other samples are from the Broad. Please correct me if I am wrong.
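
For anyone who wants to check this grouping, here is a minimal Python sketch. It makes several assumptions: per-sample missingness is available in PLINK-style `.imiss` format (as written by `plink --missing`, with `IID` and `F_MISS` columns), the sample name sits in the `IID` column, and the function name `missing_rate_by_prefix` and the toy rows are made up for illustration only.

```python
from statistics import mean


def missing_rate_by_prefix(imiss_lines, prefix="KronosII"):
    """Return {group: (n_samples, mean F_MISS)} for samples whose ID starts
    with `prefix` versus all other samples.

    `imiss_lines` is any iterable of text lines in PLINK .imiss layout
    (whitespace-delimited, first line is the header).
    """
    lines = iter(imiss_lines)
    header = next(lines).split()
    iid_col = header.index("IID")
    fmiss_col = header.index("F_MISS")

    rates = {prefix: [], "other": []}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        group = prefix if fields[iid_col].startswith(prefix) else "other"
        rates[group].append(float(fields[fmiss_col]))

    # Only report groups that actually contain samples.
    return {g: (len(v), mean(v)) for g, v in rates.items() if v}


if __name__ == "__main__":
    # Toy rows, NOT real ROSMAP values.
    toy = [
        "FID IID MISS_PHENO N_MISS N_GENO F_MISS",
        "1 KronosII_001 N 90 1000 0.09",
        "1 KronosII_002 N 110 1000 0.11",
        "2 SM-AAAA N 10 1000 0.01",
    ]
    for group, (n, rate) in missing_rate_by_prefix(toy).items():
        print(group, n, round(rate, 4))
```

With a real file, the lines can be passed straight from `open("your_file.imiss")`. A clear gap in mean missing rate between the two groups would be consistent with two genotyping batches, though it would not prove which center each batch came from.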
I separated the genotype data (actually the bim file) into two bim files based on their source, removed the SNPs with a 0% call rate, and then counted the SNPs on each chromosome. I found that only 4 chrX SNPs remain in one group, while ~27K chrX SNPs remain in the other group. So my question is: why does the number of X chromosome SNPs differ so much between the two centers, given that they used the same GeneChip?
${imageLink?synapseId=syn7537514&align=None&responsive=true}
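
The per-chromosome count described above can be sketched roughly as follows. This is not the exact procedure I ran; it assumes each group has a PLINK-style `.lmiss` file (for example from running `plink --missing` on each group separately) with `CHR` and `F_MISS` columns, and it treats `F_MISS == 1` as a 0% call rate. The function name `snps_per_chromosome` and the toy rows are hypothetical. In PLINK's numeric coding the X chromosome typically appears as `23`.

```python
from collections import Counter


def snps_per_chromosome(lmiss_lines):
    """Count SNPs per chromosome, dropping SNPs with a 0% call rate.

    `lmiss_lines` is any iterable of text lines in PLINK .lmiss layout
    (whitespace-delimited, first line is the header). A SNP is kept only
    if its missing fraction F_MISS is below 1, i.e. at least one sample
    in the group has a genotype call for it.
    """
    lines = iter(lmiss_lines)
    header = next(lines).split()
    chr_col = header.index("CHR")
    fmiss_col = header.index("F_MISS")

    counts = Counter()
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # float("nan") < 1.0 is False, so undefined rates are dropped too.
        if float(fields[fmiss_col]) < 1.0:
            counts[fields[chr_col]] += 1
    return counts


if __name__ == "__main__":
    # Toy rows for two groups, NOT real ROSMAP values.
    group_a = [
        "CHR SNP N_MISS N_GENO F_MISS",
        "1 rs1 0 100 0",
        "23 rs2 100 100 1",
        "23 rs3 5 100 0.05",
    ]
    group_b = [
        "CHR SNP N_MISS N_GENO F_MISS",
        "1 rs1 2 100 0.02",
        "23 rs2 1 100 0.01",
        "23 rs3 0 100 0",
    ]
    a, b = snps_per_chromosome(group_a), snps_per_chromosome(group_b)
    for chrom in sorted(set(a) | set(b), key=int):
        print(chrom, a[chrom], b[chrom])
```

Comparing the two counters chromosome by chromosome shows where the groups diverge, which is how the chrX gap stands out.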
Any answer would be appreciated! ~Tao
[1] De Jager et al. Neurobiol Aging. 2012 May;33(5):1017.e1-15.
Created by Tao Wang twang

Thanks Solly! That's what I'm looking for.

Hi Tao-
This is exactly the same issue you asked about last week. I looked through my emails and found this official response to the question when I asked it 6 months ago:
> From: Chibnik, Lori B.,Ph.D.
>
> Hi Solly,
> The ROSMAP genotyping on the Affy chip (n\~1709) was done at 2 sites, 66% at the Broad and 34% at TGen. The Broad genotyping had a significantly better call rate than TGen did, with ~100,000 more SNPs in the TGen batch failing QC than the Broad batch. We QCed the 2 batches separately, with the Broad ending with 749K SNPs and TGen with 653K SNPs (644K overlapping between the 2).
>
> That is the pattern of missingness we expect in the genotyping data. There will be differences across SNPs and individuals.
>
> Best,
> Lori
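
As a quick sanity check on the numbers in the email, the arithmetic can be worked through in a few lines of Python. The inputs are the rounded figures quoted above, so the results are approximate, and the variable names are just for illustration.

```python
# Rounded figures taken from the email above.
n_samples = 1709          # "n~1709"
tgen_share = 0.34         # "34% at TGen"
broad_snps = 749_000      # SNPs passing QC in the Broad batch
tgen_snps = 653_000       # SNPs passing QC in the TGen batch
shared_snps = 644_000     # SNPs passing QC in both batches

# Approximate number of TGen samples.
tgen_samples = round(n_samples * tgen_share)

# SNPs that passed QC in only one of the two batches.
broad_only = broad_snps - shared_snps
tgen_only = tgen_snps - shared_snps

# SNPs that passed QC in at least one batch.
union = broad_snps + tgen_snps - shared_snps

print(tgen_samples, broad_only, tgen_only, union)
```

This gives about 581 TGen samples, which is close to the 582 "KronosII" samples and so is consistent with the guess that those samples came from TGen. It also gives roughly 105K SNPs that passed QC only in the Broad batch versus roughly 9K only in the TGen batch. The email does not say which chromosomes those batch-specific SNPs fall on.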
Thread: ROSMAP Genotype: SNPs number variance on X Chromosome