Dear team,
I am reposting the request below as a new thread.
I am having trouble mapping the data between the different files. For example, according to the manifest, the biospecimen file M1TX_191204_103_F01.csv maps to the data matrix file 1001246846-raw_feature_bc_matrix.h5. However, the number of cells in the CSV file is different from the number of cells in the h5 file: the Seurat object created from the h5 file has significantly more cells (~2x). What am I doing wrong? How do I match the cell names in the h5 file to the cell metadata information? The Synapse IDs for the two files are syn30961086 (1001246846-raw_feature_bc_matrix.h5) and syn30961158 (M1TX_191204_103_F01.csv).
Thanks!
Created by rrajesh

Dear Rich,
Apologies for the delayed response. Thanks a lot for replicating my steps and for your suggestions. I understand your explanation. Thanks!
Best regards,
Rajesh

Thank you for posting your code. I was able to replicate your steps (see below) and verify the number of cells in the Seurat object created from the matrix file.
```
library(synapser)  # provides synLogin() and synGet()
library(Seurat)

synLogin()         # assumes Synapse credentials are already configured
tmp <- tempdir()   # download location for both files

# 1001246846-raw_feature_bc_matrix.h5
syn30961086 <- synGet(entity = 'syn30961086', downloadLocation = tmp)

# Create a Seurat object from the 10X Genomics h5 file
h5_file <- Seurat::Read10X_h5(filename = syn30961086$path,
                              use.names = TRUE,
                              unique.features = TRUE)
seurat_object <- Seurat::CreateSeuratObject(counts = h5_file, min.cells = 3, min.features = 200)
seurat_object # 31,253 genes x 14,923 cells
# An object of class Seurat
# 31253 features across 14923 samples within 1 assay
# Active assay: RNA (31253 features, 0 variable features)

# M1TX_191204_103_F01.csv
syn30961158 <- synGet(entity = 'syn30961158', downloadLocation = tmp)
cellmeta <- read.csv(file = syn30961158$path, header = TRUE)
nrow(cellmeta) # 7416
```
The cell counts would be influenced by the filters used in the overall Seurat pipeline. It would be useful to review any published details on this analysis. Let me know if you have any additional questions. If needed, we could request additional detail from the data contributors.
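As a rough sketch of how the barcodes in the two files could be lined up, the base R snippet below matches metadata rows to matrix barcodes on toy data. The `barcode` column name, the `label` column, and the barcode strings are all hypothetical; the actual CSV may name its cell identifier column differently or format barcodes another way, so check that first. The suffix stripping only assumes that one side may carry a trailing `-1` style suffix and the other may not.

```r
# Toy stand-ins. In practice matrix_barcodes would be colnames(seurat_object)
# and cellmeta would be the data frame read from the specimen CSV.
matrix_barcodes <- c("AAACCCAAGAAACACT-1", "AAACCCAAGAAACCAT-1",
                     "AAACCCAAGAAACCCA-1", "AAACCCAAGAAACCCG-1")
cellmeta <- data.frame(
  barcode = c("AAACCCAAGAAACCAT", "AAACCCAAGAAACCCG"),  # hypothetical column name
  label   = c("A", "B"),                                # hypothetical annotation
  stringsAsFactors = FALSE
)

# Strip a trailing "-<digits>" suffix so both sides use the same key format
strip_suffix <- function(x) sub("-[0-9]+$", "", x)

matrix_key <- strip_suffix(matrix_barcodes)
meta_key   <- strip_suffix(cellmeta$barcode)

# Position of each metadata row among the matrix barcodes (NA if absent)
idx <- match(meta_key, matrix_key)
stopifnot(!anyNA(idx))  # fails if a metadata cell is missing from the matrix

# Matrix-side names of the cells that have metadata, in metadata row order
shared_barcodes <- matrix_barcodes[idx]

# Number of matrix barcodes that have no metadata row
n_unmatched <- sum(!(matrix_key %in% meta_key))
```

With a real Seurat object, `subset(seurat_object, cells = shared_barcodes)` should keep only the annotated cells, and a metadata data frame whose row names are set to `shared_barcodes` can then be attached with `AddMetaData()`.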
-Rich
Hi,
Thanks for your response. Please find the responses below.
1. The code you used to create the Seurat object
hdf5_obj <- Read10X_h5(filename = filelist[i, ]$name,
                       use.names = TRUE,
                       unique.features = TRUE)
seurat_hdf5 <- CreateSeuratObject(counts = hdf5_obj, min.cells = 3, min.features = 200)
cellmeta <- read.csv(file = paste0(filelist$specimenID[i], ".csv"), header = TRUE)
For example, the h5 file is 1001246846-raw_feature_bc_matrix.h5 and the corresponding specimen ID file is M1TX_191204_103_F01.csv.
2. The expected number of cells in the Seurat object
The number of cells in the specimen ID file is 7416 and the number of cells in the Seurat object is 14923.
3. The actual number of cells in the Seurat object
I am not sure, but I expected the Seurat object and the specimen ID file to contain the same number of cells.
The cells in the specimen ID file are a subset of the cells in the Seurat object. Also, the discrepancy is not explained by standard cell quality filters such as feature count.
Thanks!

Hi Rajesh,
Thanks for your query. Would you please share the following so I can help you troubleshoot your issue:
- The code you used to create the Seurat object
- The expected number of cells in the Seurat object
- The actual number of cells in the Seurat object
Thanks!