Hello, I have aligned your multiome snRNAseq MTG data. I am getting ~35% mapping, which is lower than I expected. What mapping percentage did you get? I am wondering if I did something wrong. Thanks!

Created by Ashay Patel aopatel
We have all the cellranger/cellranger-arc/cellranger-atac outputs here on Synapse. Depending on what output you want, you can filter by file type on the AD Knowledge Portal.
@ktravaglini do you already have previously processed and aligned multiome data?
I would recommend using cellranger-arc (we used v2.0.0, per the excerpt from the manuscript I noted above).
@ktravaglini If the data that I want to look at in the multiome folder is strictly snRNA-seq, is it appropriate to coerce the data using regular cellranger count? This is the message I get:
```
Log message: Cell Ranger detected the chemistry ARC-v1, which may indicate a workflow error during sample preparation. Please check the reagents used to prepare this sample and contact 10x Genomics support for further assistance. If this workflow is intentional, you can force Cell Ranger to process this data by manually specifying the ARC-v1 chemistry in your analysis configuration.
```
That is not snRNAseq data, that's snMultiome data and should be aligned with cellranger-arc. -Kyle
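For anyone who hits the same ARC-v1 message, a minimal sketch of what a cellranger-arc run could look like is below. The FASTQ paths, sample names, run ID, and reference path are placeholders, not the SEA-AD values. cellranger-arc takes its inputs through a libraries CSV that lists both the Gene Expression and the Chromatin Accessibility FASTQs. The script only prints the CSV and the command, so both can be inspected before anything is run.

```shell
#!/usr/bin/env bash
# Sketch only: every path and sample name below is a hypothetical placeholder.

# Print a cellranger-arc libraries CSV (columns: fastqs,sample,library_type).
make_libraries_csv() {
  local gex_dir="$1" gex_sample="$2" atac_dir="$3" atac_sample="$4"
  printf 'fastqs,sample,library_type\n'
  printf '%s,%s,Gene Expression\n' "$gex_dir" "$gex_sample"
  printf '%s,%s,Chromatin Accessibility\n' "$atac_dir" "$atac_sample"
}

# Print (not run) the cellranger-arc count command for a given run ID,
# reference directory, and libraries CSV.
arc_count_cmd() {
  local run_id="$1" reference="$2" libraries="$3"
  printf 'cellranger-arc count --id=%s --reference=%s --libraries=%s\n' \
    "$run_id" "$reference" "$libraries"
}

# Show the CSV (redirect it to libraries.csv for a real run) and the command.
make_libraries_csv /path/to/gex_fastqs donor1_gex /path/to/atac_fastqs donor1_atac
arc_count_cmd donor1_arc /path/to/refdata-cellranger-arc-GRCh38-2020-A-2.0.0 libraries.csv
```

The library_type values have to be spelled exactly as shown for cellranger-arc to accept the CSV.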
@ktravaglini I am trying to align the MTG snRNAseq data from the multiome folder
No, cellranger v6.1.1 was used for the snRNAseq alignments (chemistry is 10x 3' v3.1). What file are you trying to align?
@ktravaglini Was cellranger-arc used for the snRNA-seq as well? When I try to use cellranger count, it gives me an error saying ARC chemistry is detected.
@aopatel From the manuscript: Reads from snRNA-seq libraries were mapped to 10x Genomics’ official human reference (“Human reference (GRCh38) – 2020-A”) and unique molecular identifiers (UMIs) counted per gene using the cellranger (version 6.1.1) pipeline with the “--include-introns” parameter included. Reads from snATAC-seq and snMultiome libraries were mapped to the same reference using cellranger-atac (version 2.0.0) and cellranger-arc (version 2.0.0) pipelines with default parameters, respectively. 10x has release notes on 2020-A (https://www.10xgenomics.com/support/software/cell-ranger/latest/release-notes/cr-reference-release-notes#2020-a)
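As a rough illustration of the snRNA-seq settings quoted above (not the exact SEA-AD invocation, which the excerpt does not give), a cellranger 6.1.1 command that also counts intronic reads might look like the following. The run ID, sample name, and paths are placeholders. In cellranger 6.x `--include-introns` is a bare flag; later major versions changed how this option is specified, so check the documentation for the version you run.

```shell
#!/usr/bin/env bash
# Sketch only: run ID, sample name, and paths are hypothetical placeholders.

# Print (not run) a cellranger 6.x count command that also counts intronic reads.
snrna_count_cmd() {
  local run_id="$1" transcriptome="$2" fastq_dir="$3" sample="$4"
  printf 'cellranger count --id=%s --transcriptome=%s --fastqs=%s --sample=%s --include-introns\n' \
    "$run_id" "$transcriptome" "$fastq_dir" "$sample"
}

snrna_count_cmd donor1_rna /path/to/refdata-gex-GRCh38-2020-A /path/to/rna_fastqs donor1_rna
```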
@ktravaglini Which genome did you use?
Hi @aopatel, The vast majority of the libraries had >80% of their reads mapped uniquely to the genome with cellranger/STAR. A small subset of libraries from severely affected donors (described in Figure 1 of our manuscript) had alignment rates between 40-80%, and 2 libraries from 2 donors that we failed (due to low RIN) were below 40%. I suspect the issue is that single **nucleus** RNAseq will have fewer spliced transcriptomic reads (and I know kallisto builds a **transcriptomic** versus **genomic** reference). You will probably need to include introns in your transcriptomic reference. If you already have (I believe that's what nascent.txt is?), then it may be an issue with the source of the "nascent" transcripts itself (e.g. if it was built from non-brain tissue, it may be missing nascent transcripts that are found exclusively in the brain). Best, Kyle
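One quick way to compare numbers is to pull the pseudoalignment rate straight out of the kb output. The sketch below assumes the kb count output directory contains kallisto's `run_info.json` with a `p_pseudoaligned` field; the function names and bands are illustrative. The bands simply restate the cellranger/STAR figures from the post above, which measure unique genome mapping rather than pseudoalignment, so treat the comparison as approximate.

```shell
#!/usr/bin/env bash
# Sketch only: assumes <kb output dir>/run_info.json has a "p_pseudoaligned" field.

# Print the percentage of reads pseudoaligned, as recorded in run_info.json.
pseudoaligned_pct() {
  sed -n 's/.*"p_pseudoaligned"[[:space:]]*:[[:space:]]*\([0-9.]*\).*/\1/p' "$1"
}

# Place a percentage into the bands quoted in the post above
# (>80% typical, 40-80% severely affected donors, <40% failed libraries).
mapping_band() {
  awk -v p="$1" 'BEGIN {
    if (p > 80)       print "typical (>80%)"
    else if (p >= 40) print "reduced (40-80%)"
    else              print "low (<40%)"
  }'
}

# Usage: pass the kb count output directory (e.g. output.dir from the kb command).
if [ -f "${1:-}/run_info.json" ]; then
  pct="$(pseudoaligned_pct "$1/run_info.json")"
  echo "pseudoaligned: ${pct}% -> $(mapping_band "$pct")"
fi
```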
Yeah, I wonder what I am doing wrong. I assume your tech for multiome is most similar to 10XV3 out of these:
```
name          description                          on-list  barcode                  umi      cDNA
------------  -----------------------------------  -------  -----------------------  -------  -----------------------
10XV1         10x version 1                        yes      0,0,14                   1,0,10   2,None,None
10XV2         10x version 2                        yes      0,0,16                   0,16,26  1,None,None
10XV3         10x version 3                        yes      0,0,16                   0,16,28  1,None,None
10XV3_ULTIMA  10x version 3 sequenced with Ultima  yes      0,22,38                  0,38,50  0,62,None
BDWTA         BD Rhapsody                          yes      0,0,9 0,21,30 0,43,52    0,52,60  1,None,None
BULK          Bulk (single or paired)                                                         0,None,None 1,None,None
CELSEQ        CEL-Seq                                       0,0,8                    0,8,12   1,None,None
CELSEQ2       CEL-SEQ version 2                             0,6,12                   0,0,6    1,None,None
DROPSEQ       DropSeq                                       0,0,12                   0,12,20  1,None,None
INDROPSV1     inDrops version 1                             0,0,11 0,30,38           0,42,48  1,None,None
INDROPSV2     inDrops version 2                             1,0,11 1,30,38           1,42,48  0,None,None
INDROPSV3     inDrops version 3                    yes      0,0,8 1,0,8              1,8,14   2,None,None
SCRUBSEQ      SCRB-Seq                                      0,0,6                    0,6,16   1,None,None
SMARTSEQ2     Smart-seq2 (single or paired)                                                   0,None,None 1,None,None
SMARTSEQ3     Smart-seq3                                                             0,11,19  0,11,None 1,None,None
SPLIT-SEQ     SPLiT-seq                                     1,10,18 1,48,56 1,78,86  1,0,10   0,None,None
STORMSEQ      STORM-seq                                                              1,0,8    0,None,None 1,14,None
SURECELL      SureCell for ddSEQ                            0,0,6 0,21,27 0,42,48    0,51,59  1,None,None
Visium        10x Visium                           yes      0,0,16                   0,16,28  1,None,None
```
And I just trimmed using trim_galore:
```
trim_galore --quality 20 --fastqc --illumina --cores 9 --paired "$r1_file" "$r2_file"
```
Ah. I misunderstood what you meant by mapping. I'm pretty sure that our data had substantially higher mapping when we aligned to the genome, but I'll check and get back to you as soon as I hear back.
I used the kallisto aligner with this command on your multiome-MTG snRNA-seq data that was previously trimmed with trim_galore:
```
kb count -x 10XV3 --workflow=nac -o output.dir -i index.idx -g t2g.txt -c1 cdna.txt -c2 nascent.txt --sum=total --batch-barcodes batch.txt --verbose --strand=unstranded
```
Hi @aopatel, Based on what you wrote here, I'd guess that either your data is low quality, or there is something wrong with one of the data/file formatting steps between cell x gene matrix and mapping results, but would need more information to know for sure. Did you use MapMyCells for this or map the data in a different way? If the former, can you please provide the ID associated with your run? It will look something like this: 1720204204941-b151e3ce-4dfe-446e-b965-5633009d277f. This would allow folks on our end to debug, if you are okay with that. Best, Jeremy
@aopatel I will contact some of the computational SEA-AD scientists to provide some guidance.
@eitan.kaplan Yes, please, this would be helpful. Thank you.
Hi @aopatel, I'm not sure but we can ask the team that contributed the data. @eitan.kaplan, would you be able to help answer these questions about mapping for the SEA-AD data? Thanks!
Additionally, the duplicate percentage is >50% for most samples. Is that expected?

Questions regarding alignment for SEA-AD multiome GEX data