Hi BPC Team,
We have a question regarding the relationship between the cancer_level_dataset_index and data_timeline_cancer_diagnosis tables in the BPC cohorts. We were unable to find any references to the data_timeline_cancer_diagnosis table in the provided data guide, and were wondering whether this table is derived from cancer_level_dataset_index or sourced independently. Additionally, there are fields in the data_timeline_cancer_diagnosis table, such as index_cancer, that are not present in the cancer_level_dataset_index table, and we were unable to find any data definitions for this column. Do you have any additional information regarding the origin of the data_timeline table, and do you have a recommendation on which of these two tables would be best to use for deriving patient diagnosis information?
Thank you so much!
Created by maayanbaron
Dear @maayanbaron,
This is a great question. Some of the release files are explicitly for import into the cBioPortal (see v16.0-public on cBioPortal [here](https://genie.cbioportal.org/)). cBioPortal is an open-access, open-source resource for interactive exploration of multidimensional cancer genomics data sets. These cBioPortal files will also contain the genomic information. The other release files are specifically created for clinical analyses and will contain more clinical variables. The cBioPortal files and the "clinical_data" files are separated on Synapse [here](https://www.synapse.org/Synapse:syn27056179).
So, using your example, cancer_level_dataset_index.csv should be used for clinical analyses, while data_timeline_cancer_diagnosis.txt is intended for import into cBioPortal.
Some points:
- All files within each release are sourced from the same origin. However, some variables used to generate the clinical files and the cBioPortal files may be released in one set but not the other.
- There is a variable crosswalk available for the NSCLC 2.0-public release [here](https://www.synapse.org/Synapse:syn29288708) between the cBioPortal files and the "clinical_data".
- More information on the columns required for cBioPortal import, along with their descriptions, can be found [here](https://docs.cbioportal.org/file-formats/#timeline-data).
- The distinction of which release file is in a cBioPortal file format is not as clear in the Portal as it is on [Synapse](https://www.synapse.org/Synapse:syn27056179). I will add more documentation to help with this distinction.
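If it helps while comparing the two files locally, here is a minimal sketch (not an official BPC tool) that lists which column names are shared and which appear in only one file. It assumes the clinical file is comma-separated and the cBioPortal timeline file is tab-separated; the function names `read_header` and `compare_columns` and the example paths are hypothetical.

```python
import csv


def read_header(path, delimiter):
    """Return the column names from the first row of a delimited text file."""
    with open(path, newline="", encoding="utf-8") as handle:
        return next(csv.reader(handle, delimiter=delimiter))


def compare_columns(clinical_path, timeline_path):
    """Report which columns are shared and which appear in only one file."""
    # Assumption: the clinical file is comma-separated and the cBioPortal
    # timeline file is tab-separated; adjust the delimiters if yours differ.
    clinical = set(read_header(clinical_path, ","))
    timeline = set(read_header(timeline_path, "\t"))
    return {
        "shared": sorted(clinical & timeline),
        "clinical_only": sorted(clinical - timeline),
        "timeline_only": sorted(timeline - clinical),
    }


# Example call (hypothetical local paths to the downloaded release files):
# report = compare_columns(
#     "cancer_level_dataset_index.csv",
#     "data_timeline_cancer_diagnosis.txt",
# )
# print(report["timeline_only"])
```

Note that this only compares column names exactly as written, so a variable that is renamed between the two formats will show up as unmatched; the variable crosswalk linked above is the better guide for those cases.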
Let me know if this helps.