broad.mit.edu_STAD_Genome_Wide_SNP_6.hg18

Created By Kyle Ellrott kellrott
BROAD TCGA ALGORITHM DESCRIPTION

For the latest description of the algorithms, please see the supplementary information of the paper at: http://www.nature.com/nature/journal/vaop/ncurrent/suppinfo/nature07385.html

Invariant Set Median-Polish Values
Protocol Name: broad.mit.edu:invariantset_medianpolish:Genome_Wide_SNP_6:01
Link: http://www.broad.mit.edu/cancer/software/genepattern/
Data Level: 2
Data File: *.ismpolish.data.txt
Invariant Set Median-Polish results are the probe sets' normalized intensity values. First, the probes' raw intensity values were brightness corrected using Invariant Set Normalization, as described in Li and Wong's dChip paper. Then the probe sets were summarized using median polish, a robust summarization method described in Bolstad et al.'s RMA paper. Both of these steps were executed by a GenePattern module called SNPFileCreator.

Allele-Specific Copy-Numbers
Protocol Name: broad.mit.edu:byallele_copynumber:Genome_Wide_SNP_6:01
Link: http://www.broad.mit.edu/cancer/software/genepattern/
Data Level: 2
Data File: *.byallele.copynumber.data.txt
Allele-specific copy numbers were estimated at each of the SNP markers by subtracting a background term and dividing by a scaling factor. The calculation is done in an allele-specific manner. The background term for each allele is estimated using the center of the Birdseed cluster associated with the homozygous call of the other allele (for example, for allele A we use the A coordinate of the center of the BB cluster). The scaling factor is set to half of the distance between the AA cluster and the BB cluster along the relevant coordinate.

Copy-Numbers
Protocol Name: broad.mit.edu:raw_copynumber:Genome_Wide_SNP_6:01
Link: http://www.broad.mit.edu/cancer/software/genepattern/
Data Level: 2
Data File: *.raw.copynumber.data.txt
Raw copy numbers were estimated at each of the SNP and copy-number (CN) markers by subtracting a background term and dividing by a scaling factor.
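The background-and-scale calculation described above can be sketched as follows. This is a minimal illustration, not the Broad pipeline's code: the function name, the (A, B) tuple layout, and the example cluster centers are all hypothetical, and real Birdseed cluster centers would come from the clusters file.

```python
def allele_specific_copy_numbers(intensity, aa_center, bb_center):
    """Estimate (cn_A, cn_B) for one SNP marker in one sample.

    intensity, aa_center and bb_center are (A, B) coordinate pairs.
    The names and data layout are assumptions made for this sketch.
    """
    a, b = intensity
    # Background for allele A: the A coordinate of the BB cluster center.
    background_a = bb_center[0]
    # Background for allele B: the B coordinate of the AA cluster center.
    background_b = aa_center[1]
    # Scale: half the AA-to-BB distance along the relevant coordinate.
    scale_a = (aa_center[0] - bb_center[0]) / 2.0
    scale_b = (bb_center[1] - aa_center[1]) / 2.0
    cn_a = (a - background_a) / scale_a
    cn_b = (b - background_b) / scale_b
    return cn_a, cn_b


# Hypothetical cluster centers for a single SNP.
aa = (1000.0, 200.0)
bb = (250.0, 900.0)

# A sample sitting exactly on the AA cluster center has two copies of A.
print(allele_specific_copy_numbers(aa, aa, bb))
# Summing the two allele-specific values gives a total copy number.
print(sum(allele_specific_copy_numbers((625.0, 550.0), aa, bb)))
```

With this choice of background and scale, a sample on the AA cluster center comes out with two copies of allele A and none of allele B, which is why the scaling factor is half the cluster distance rather than the full distance.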
The total copy number at SNP markers was calculated by summing the allele-specific values. For CN probes we built a model, based on an X-dosage experiment, which estimates the background and scaling factor as a function of the median intensity of the probe across normal samples. Finally, we divide the total copy number by the average of all normals and multiply by 2. The value is in linear space.

Tangent Copy-Numbers
Protocol Name: broad.mit.edu:tangent_copynumber:Genome_Wide_SNP_6:01
Link: http://www.ncbi.nlm.nih.gov/geosuppl/?acc=GSE19399&file=GSE19399%5Ftangent%5Fnorm%5Ffiles%2Etar%2Egz
Data Level: 2
Data File: *.tangent.copynumber.data.txt
The total copy number is smoothed by first removing outliers and then applying the tangent algorithm. The value is in linear space, and each sample has been centered around 2, as if it were diploid. Tangent normalization determines normalized copy number values by calculating the orthogonal distance between each data (in this case, cancer) sample and the high-dimensional hyperplane defined by a set of reference samples. The projection of each sample onto this hyperplane is equivalent to constructing a 'hypothetical' reference sample that most closely approximates that data sample. By normalizing against this 'hypothetical' sample, we normalize away as much of the variance in signal intensity observed in that sample as can be explained by a linear combination of signal intensity in the reference set, yielding cleaner copy number values.
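The projection idea behind tangent normalization can be illustrated with a small sketch. It is not the Broad implementation: it assumes each sample is a plain list of per-marker signal values, treats the reference 'hyperplane' as the linear span of the reference vectors, and uses hypothetical function names. Outlier removal, any log transform, and re-centering around 2 are omitted.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormal_basis(vectors, tol=1e-12):
    """Gram-Schmidt orthonormalization, skipping dependent vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis


def tangent_residual(sample, references):
    """Project a sample onto the span of the reference samples.

    Returns (projection, residual). The projection plays the role of
    the 'hypothetical' reference sample; the residual is what the
    reference set cannot explain as a linear combination.
    """
    basis = orthonormal_basis(references)
    projection = [0.0] * len(sample)
    for b in basis:
        c = dot(sample, b)
        projection = [p + c * bi for p, bi in zip(projection, b)]
    residual = [s - p for s, p in zip(sample, projection)]
    return projection, residual


# Toy data: two reference (normal) samples and one tumor sample.
normals = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
tumor = [3.0, 4.0, 5.0]
projection, residual = tangent_residual(tumor, normals)
print(projection, residual)
```

The residual is orthogonal to every reference sample, so any signal pattern shared with the normals is removed and only the sample-specific component remains.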

Segmentation
Protocol Name: broad.mit.edu:segmented_scna_hg18:Genome_Wide_SNP_6:01, broad.mit.edu:segmented_scna_hg19:Genome_Wide_SNP_6:01, broad.mit.edu:segmented_scna_minus_germline_cnv_hg18:Genome_Wide_SNP_6:01, broad.mit.edu:segmented_scna_minus_germline_cnv_hg19:Genome_Wide_SNP_6:01
Link: http://www.broad.mit.edu/cancer/software/genepattern/
Data Level: 3
Data File: *.hg18.seg.txt, *.hg19.seg.txt, *.nocnv_hg18.seg.txt, *.nocnv_hg19.seg.txt
The probes are sorted according to the genome build and then segmented using the CBS (Circular Binary Segmentation) algorithm. The files without germline CNVs (*.nocnv_*) have also had a fixed set of probes removed prior to segmentation, and are more suitable for GISTIC analysis. The value is log2(copynumber/2), centered on 0.

Birdseed Genotypes
Protocol Name: broad.mit.edu:birdseed_genotype:Genome_Wide_SNP_6:01
Link: https://www.affymetrix.com/support/developer/powertools/index.affx
Data Level: 2
Data File: *.birdseed.data.txt
Birdseed results are genotype calls produced by the Birdseed algorithm from the probe sets' intensity values, normalized by the Invariant Set Median-Polish algorithm. Initially, the normalized values of SNP probe sets from the normal samples were passed as input to Birdseed, along with the 6.0 priors file and the special SNPs file. This run generated the clusters, confidences, and calls files. Birdseed was then run again with the '--clusters' option, using the SNP probe sets from all samples together with the clusters file from the previous normals-only run.
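As a small illustration of the segment values described under Segmentation, the following sketch converts between linear copy number (as in the Level 2 files) and the log2(copynumber/2) scale used in the *.seg.txt files. The function names are hypothetical and not part of any Broad tool.

```python
import math


def to_segment_value(copy_number):
    """Linear copy number -> log2(copynumber/2); diploid maps to 0."""
    return math.log2(copy_number / 2.0)


def to_copy_number(segment_value):
    """log2(copynumber/2) segment value -> linear copy number."""
    return 2.0 * 2.0 ** segment_value


print(to_segment_value(2.0))  # diploid
print(to_segment_value(1.0))  # single-copy loss
print(to_copy_number(1.0))    # gain to four copies
```

On this scale negative values indicate loss and positive values indicate gain relative to a diploid baseline.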

acronym: STAD
disease: cancer
species: Homo sapiens
platform: Genome_Wide_SNP_6
lastUpdate: 2012-05-01
tissueType: stomach