Hello,
I am interested in the unmapped fq files, but the format does not seem to be consistent across files, nor does it match the annotation.
For example, syn4055334 is not a fastq file, while syn5519739 is.
When I attempt to convert syn4055334 from sam to bam via:
samtools view -bS BM_10_652.unmapped.fq > unmapped_BM_10_652.bam
I get both a parse error and a truncated file error.
What format is syn4055334 = BM_10_652.unmapped.fq?
Why are some files gzipped and others not?
Can the files be organized under directory folders according to what format they actually are?
When I explore the STAR code provided at syn3219346, I see this parameter being passed:
--outReadsUnmapped fastx
I would appreciate help and clarification on the above.
Thank you.
Kory Johnson
Hi @johnsonko,
All the files have been renamed and re-annotated to reflect the correct file format.
Hope this helps!
Thanks for investigating this! I downloaded that file, and it is a gzipped fastq file. I will file an issue to rename these files so that the format is clear and the downloaded files can be used in a standard way.
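For anyone who wants to check a downloaded file themselves, here is a minimal Python sketch of that kind of check. It relies only on two well-known facts: every gzip stream begins with the bytes `1f 8b`, and a FASTQ record is four lines (an `@` header, the sequence, a `+` line, and a quality string of the same length as the sequence). The function names and the demo file name are made up for illustration, and the check only looks at the first record, so treat it as a quick sanity test rather than a validator.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def is_gzipped(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as fh:
        return fh.read(2) == GZIP_MAGIC


def looks_like_fastq(path):
    """Check whether the first four lines form a plausible FASTQ record."""
    opener = gzip.open if is_gzipped(path) else open
    with opener(path, "rt") as fh:
        header, seq, plus, qual = (fh.readline().rstrip("\n") for _ in range(4))
    return (
        header.startswith("@")
        and plus.startswith("+")
        and len(seq) > 0
        and len(seq) == len(qual)
    )


# Demo on a throwaway file: gzipped FASTQ content saved under a bare .fq name
# (the file name is hypothetical, chosen to mimic the situation in this thread).
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "example.unmapped.fq")
with gzip.open(demo_path, "wt") as fh:
    fh.write("@read1\nACGT\n+\nIIII\n")

print(is_gzipped(demo_path), looks_like_fastq(demo_path))
```

If `is_gzipped` returns true for a file named `.fq`, decompressing it first (or renaming it to `.fq.gz`) should let standard FASTQ tools read it. Note that a FASTQ file is not SAM, so `samtools view -bS` would be expected to fail on it either way.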
In terms of organization, all of these files are annotated (see https://docs.synapse.org/articles/annotation_and_query.html). This annotation includes `fileType`. We maintain a searchable table that includes these annotations for all files [here](https://www.synapse.org/#!Synapse:syn11346063/tables/). In your case, you might want the fastq files from the MSBB study. The query below returns that result in a form that lets you download all of these files:
${synapsetable?query=SELECT %2A FROM syn11346063 WHERE %28 %28 "study" %3D %27MSBB%27 %29 AND %28 "assay" %3D %27rnaSeq%27 %29 AND %28 "fileFormat" %3D %27fastq%27 %29 %29&showquery=true}
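The query in the widget above is URL-encoded, which makes it hard to read. As a small sketch, Python's standard `urllib.parse.unquote` recovers the plain SQL-like string; the variable names here are just for illustration.

```python
from urllib.parse import unquote

# Query string copied from the table widget above (still percent-encoded).
encoded = (
    'SELECT %2A FROM syn11346063 WHERE %28 %28 "study" %3D %27MSBB%27 %29 '
    'AND %28 "assay" %3D %27rnaSeq%27 %29 '
    'AND %28 "fileFormat" %3D %27fastq%27 %29 %29'
)

query = unquote(encoded)
print(query)
# SELECT * FROM syn11346063 WHERE ( ( "study" = 'MSBB' ) AND ( "assay" = 'rnaSeq' ) AND ( "fileFormat" = 'fastq' ) )
```

A decoded string like this can presumably be passed to the Synapse Python client's `tableQuery` method after logging in. That step is not shown here because it needs credentials and network access.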