Hello,

I am analyzing the ROSMAP blood RNA-seq data (syn22024498). The batch 3 fastq files are paired-end reads sequenced across three different lanes. I have generated counts for each lane with featureCounts, but I don't know how to handle counts that come from the same sample but different lanes. Should I combine them or treat them as separate samples?

Sample_001_TAGCATAACC-GCTATGCGCA_HK735DMXX_L001_001.R1.fastq.gz
Sample_001_TAGCATAACC-GCTATGCGCA_HK735DMXX_L001_001.R2.fastq.gz
Sample_001_TAGCATAACC-GCTATGCGCA_HK735DMXX_L002_001.R1.fastq.gz
Sample_001_TAGCATAACC-GCTATGCGCA_HK735DMXX_L002_001.R2.fastq.gz
Sample_001_TAGCATAACC-GCTATGCGCA_HN3M2DMXX_L001_001.R1.fastq.gz
Sample_001_TAGCATAACC-GCTATGCGCA_HN3M2DMXX_L001_001.R2.fastq.gz

Thanks,
DuanTingting

Created by Tingting Duan Duantingting
Hi Dana, I'm not sure but hopefully Dr. Yiyi Ma at Columbia can help answer! @yiyima, can you help advise on the correct order for the fastq merge process for the ROSMAP monocyte RNAseq data? Thanks, Abby
Hi @abby.vanderlinden,

Is it possible to get a short explanation of the correct order of the reads for the monocyte samples in batch 3? I understand that all the Sample_XXX_...._xxx.R1 files belong to the same read end. But how do I determine the internal order of the 3 files that share the same R1 suffix? What would be the correct file sequence for the merge process in the following example?

Sample_001_TAGCATAACC-GCTATGCGCA_HK735DMXX_L001_001.R1.fastq.gz
Sample_001_TAGCATAACC-GCTATGCGCA_HK735DMXX_L001_001.R2.fastq.gz
Sample_001_TAGCATAACC-GCTATGCGCA_HK735DMXX_L002_001.R1.fastq.gz
Sample_001_TAGCATAACC-GCTATGCGCA_HK735DMXX_L002_001.R2.fastq.gz
Sample_001_TAGCATAACC-GCTATGCGCA_HN3M2DMXX_L001_001.R1.fastq.gz
Sample_001_TAGCATAACC-GCTATGCGCA_HN3M2DMXX_L001_001.R2.fastq.gz

Thanks,
Dana
The original merged fastq files that were replaced were merged incorrectly -- the order of the reads in the provided files was incorrect. Best practice is still to merge the files before alignment, but we wanted to provide the raw files so that downstream users can start with the raw data and do the merging with their preferred method.
@abby.vanderlinden Thanks! However, the batch 3 data was merged previously, and that merge was later found to be wrong, so the re-uploaded data is not merged. [Corrections to batch 3 fastq files in February 2022: Samples from batch 3 were sequenced across 1-3 lanes depending on the sample, and fastqs from those sequenced across multiple lanes were initially merged and provided as one file per read end. However, it was discovered that files were incorrectly merged, resulting in incorrect read sequences and incompatible transcript counts. The original merged fastqs provided from batch 3 samples have been deprecated and replaced with the raw unmerged fastq files generated from these samples.](https://www.synapse.org/#!Synapse:syn22024496)
Hi there, I would recommend merging the fastq files that belong to the same sample before doing the alignment and generating the counts.
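For anyone doing the merge themselves, here is a minimal Python sketch of one way to plan it. It is not the ROSMAP team's procedure. It assumes the file naming pattern seen in the examples in this thread (Sample_<id>_<index>_<flowcell>_L<lane>_001.R<1|2>.fastq.gz), and the function names `build_merge_plan` and `concatenate_gzip` are made up for illustration. The important point for paired-end data is that the R1 and R2 files are concatenated in the same lane order, so that the n-th read in the merged R1 file is still the mate of the n-th read in the merged R2 file. Sorting by flowcell and then lane is just one convenient convention for getting a consistent order.

```python
import re
import shutil
from collections import defaultdict
from pathlib import Path

# Assumed naming pattern, inferred only from the file names quoted in this thread.
FASTQ_NAME = re.compile(
    r"^(?P<sample>Sample_\d+)_(?P<index>[ACGTN]+-[ACGTN]+)_(?P<flowcell>[A-Z0-9]+)"
    r"_L(?P<lane>\d{3})_\d{3}\.R(?P<read>[12])\.fastq\.gz$"
)


def build_merge_plan(filenames):
    """Group files by (sample, read end), ordered by flowcell and lane.

    Raises ValueError if a name does not match the assumed pattern or if
    R1 and R2 of a sample do not cover the same flowcell/lane combinations.
    """
    groups = defaultdict(list)
    for name in filenames:
        match = FASTQ_NAME.match(Path(name).name)
        if match is None:
            raise ValueError(f"Unexpected file name: {name}")
        key = (match["sample"], "R" + match["read"])
        groups[key].append((match["flowcell"], match["lane"], name))

    for parts in groups.values():
        parts.sort()  # same sort key for R1 and R2 keeps mates aligned

    for (sample, read), parts in groups.items():
        mate_read = "R2" if read == "R1" else "R1"
        mate_parts = groups.get((sample, mate_read), [])
        if [p[:2] for p in parts] != [m[:2] for m in mate_parts]:
            raise ValueError(f"{sample}: R1 and R2 lanes do not match")

    return {key: [name for _, _, name in parts] for key, parts in groups.items()}


def concatenate_gzip(inputs, output):
    """Byte-concatenate gzip files; the result is a valid multi-member gzip file."""
    with open(output, "wb") as out:
        for path in inputs:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)
```

Calling `build_merge_plan` on the six Sample_001 files above gives one ordered list for `("Sample_001", "R1")` and one for `("Sample_001", "R2")`, each listing HK735DMXX L001, then HK735DMXX L002, then HN3M2DMXX L001. Each list can then be passed to `concatenate_gzip` to produce one merged file per read end before alignment.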
