Is there a replacement for the GNOS files listed in <some file Discourse does not like in the title> with OxoG applied?

Hi guys,

For some of my tests I was using some intermediate results that were hosted in GNOS. I’m not sure I’m on top of these changes, but it seems like I can’t use GNOS anymore for this. Are they available elsewhere? Before I had a file called

variant_call_entries_with_broad_oxog_filter_applied.txt

That had fields:

donor_unique_id
submitter_donor_id
dcc_project_code
tumor_aliquot_ids
gnos_repo
gnos_id
gnos_last_modified
vcf_workflow_result_version

For instance:

BLCA-US::096b4f32-10c1-4737-a0dd-cae04c54ee33
096b4f32-10c1-4737-a0dd-cae04c54ee33
BLCA-US
301d6ce3-4099-4c1d-8e50-c04b7ce91450
https://gtrepo-osdc-tcga.annailabs.com/
7e19048c-f6ea-11e5-8b75-ba53aaf623c4
2016-03-31T19:48:02+00:00
v1
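
For reference, here is a minimal Python sketch of how one row of this listing could be mapped onto the field names above. It assumes the file is tab-separated with the columns in the order listed, which is not stated in this thread; `parse_entry` is a hypothetical helper name, not part of any existing tool.

```python
# Column names as listed above; the tab separator is an assumption.
FIELDS = [
    "donor_unique_id",
    "submitter_donor_id",
    "dcc_project_code",
    "tumor_aliquot_ids",
    "gnos_repo",
    "gnos_id",
    "gnos_last_modified",
    "vcf_workflow_result_version",
]


def parse_entry(line, sep="\t"):
    """Split one line of the listing and pair each value with its field name."""
    values = line.rstrip("\n").split(sep)
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(values)}")
    return dict(zip(FIELDS, values))


# The example row from this post, joined with tabs for illustration.
example_line = "\t".join([
    "BLCA-US::096b4f32-10c1-4737-a0dd-cae04c54ee33",
    "096b4f32-10c1-4737-a0dd-cae04c54ee33",
    "BLCA-US",
    "301d6ce3-4099-4c1d-8e50-c04b7ce91450",
    "https://gtrepo-osdc-tcga.annailabs.com/",
    "7e19048c-f6ea-11e5-8b75-ba53aaf623c4",
    "2016-03-31T19:48:02+00:00",
    "v1",
])

entry = parse_entry(example_line)
print(entry["dcc_project_code"], entry["gnos_id"])
```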

They are currently used in the Consensus and Merge-Annotate workflows. Merge-Annotate is being replaced by the OxoG filter workflow. Still, I think these intermediate results lie between OxoG and Consensus, and they are useful for testing each of these in isolation.

Could someone please track down these files and check whether they are still available elsewhere?

I found this File: https://dcc.icgc.org/repositories/files/FI339385 which contains a reference to the tumour aliquot ID you referenced (301d6ce3-4099-4c1d-8e50-c04b7ce91450). If you search dcc.icgc.org, maybe you can find the files you need?

Hi Solomon. The file you sent is for CNA, not SNV. Also, I believe the files from ICGC contain the original workflow results and the consensus calls, which I’m already using, but not the intermediate workflow results, such as those after OxoG, which are what I need to test some of these workflows.

Hi Miguel,
The file I listed in my previous post was just an example. Your original post did not mention any specific files by name, so I just searched for the term “301d6ce3-4099-4c1d-8e50-c04b7ce91450” (the tumour aliquot ID you mentioned) and then linked to the first resulting file to demonstrate how it could be done. A list of all files containing that string is here: https://dcc.icgc.org/q?q=301d6ce3-4099-4c1d-8e50-c04b7ce91450

I am not sure that the Consensus-Calling tool used intermediate files. I thought that the input for Consensus-Calling was the output of OxoG. What is the exact name of the file you are looking for? Can it be found in ICGC?

Hi Solomon,

The file should contain the term ‘oxoG’. These are the results of running OxoG at some intermediate step between the workflow results and the final consensus; I’m not sure whether that is before or after bias filtering. ICGC seems to contain only the workflow results and the final consensus, but not these steps. I checked this by downloading the full PCAWG export from the REST API. For a particular donor I get these files:

Aligned Reads.Normal - blood derived.8d35619347952f08d81bcd78beb4e64e.bam
Aligned Reads.Primary tumour - solid tissue.f665fc5c67b645d3038770c49c02fb02.bam
CNSM.Primary tumour - solid tissue.fc876f66-ff0d-f338-e040-11ac0d485162.dkfz-copyNumberEstimation_1-0-189-hpc-fix.1508262003.somatic.cnv.vcf.gz
CNSM.Primary tumour - solid tissue.fc876f66-ff0d-f338-e040-11ac0d485162.svcp_1-0-4.20150212.somatic.cnv.vcf.gz
SGV.Primary tumour - solid tissue.fc876f66-ff0d-f338-e040-11ac0d485162.broad-snowman.20150918.germline.indel.vcf.gz
SGV.Primary tumour - solid tissue.fc876f66-ff0d-f338-e040-11ac0d485162.dkfz-indelCalling_1-0-132-1.20150611.germline.indel.vcf.gz
SGV.Primary tumour - solid tissue.fc876f66-ff0d-f338-e040-11ac0d485162.dkfz-snvCalling_1-0-132-1.20150611.germline.snv_mnv.vcf.gz
SSM.Primary tumour - solid tissue.fc876f66-ff0d-f338-e040-11ac0d485162.broad-mutect-v3.20160222.somatic.snv_mnv.vcf.gz
SSM.Primary tumour - solid tissue.fc876f66-ff0d-f338-e040-11ac0d485162.broad-snowman.20150918.somatic.indel.vcf.gz
SSM.Primary tumour - solid tissue.fc876f66-ff0d-f338-e040-11ac0d485162.consensus.20160830.somatic.snv_mnv.vcf.gz
SSM.Primary tumour - solid tissue.fc876f66-ff0d-f338-e040-11ac0d485162.consensus.20161006.somatic.indel.vcf.gz
SSM.Primary tumour - solid tissue.fc876f66-ff0d-f338-e040-11ac0d485162.dkfz-indelCalling_1-0-132-1.20150611.somatic.indel.vcf.gz
SSM.Primary tumour - solid tissue.fc876f66-ff0d-f338-e040-11ac0d485162.dkfz-snvCalling_1-0-132-1.20150611.somatic.snv_mnv.vcf.gz
SSM.Primary tumour - solid tissue.fc876f66-ff0d-f338-e040-11ac0d485162.MUSE_1-0rc-vcf.20150918.somatic.snv_mnv.vcf.gz
SSM.Primary tumour - solid tissue.fc876f66-ff0d-f338-e040-11ac0d485162.svcp_1-0-4.20150212.somatic.indel.vcf.gz
SSM.Primary tumour - solid tissue.fc876f66-ff0d-f338-e040-11ac0d485162.svcp_1-0-4.20150212.somatic.snv_mnv.vcf.gz
StGV.Primary tumour - solid tissue.fc876f66-ff0d-f338-e040-11ac0d485162.broad-snowman.20150918.germline.sv.vcf.gz
StGV.Primary tumour - solid tissue.fc876f66-ff0d-f338-e040-11ac0d485162.embl-delly_1-0-0-preFilter.20150611.germline.sv.vcf.gz
StSM.Primary tumour - solid tissue.fc876f66-ff0d-f338-e040-11ac0d485162.broad-dRanger.20150918.somatic.sv.vcf.gz
StSM.Primary tumour - solid tissue.fc876f66-ff0d-f338-e040-11ac0d485162.broad-dRanger_snowman.20150918.somatic.sv.vcf.gz
StSM.Primary tumour - solid tissue.fc876f66-ff0d-f338-e040-11ac0d485162.broad-snowman.20150918.somatic.sv.vcf.gz
StSM.Primary tumour - solid tissue.fc876f66-ff0d-f338-e040-11ac0d485162.embl-delly_1-0-0-preFilter.20150611.somatic.sv.vcf.gz
StSM.Primary tumour - solid tissue.fc876f66-ff0d-f338-e040-11ac0d485162.svfix2_4-0-12.20160213.somatic.sv.vcf.gz

As you can see, these are the BAM files, the workflow results and the consensus. The files I need were available in GNOS and listed in the file:

variant_call_entries_with_broad_oxog_filter_applied.txt

Without them I cannot test the results of the OxoG filter, since I need them for comparison, or the Consensus, since I need them as input.
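
The check I describe above, that nothing in the export mentions OxoG, can be sketched in Python roughly as follows. It assumes the term appears in the file name (matched case-insensitively), and it uses only a few of the names from the list above; `mentions_oxog` is a hypothetical helper.

```python
def mentions_oxog(file_name):
    """Return True if the file name contains 'oxog', ignoring case."""
    return "oxog" in file_name.lower()


# A few of the file names from the export listed above.
donor = "fc876f66-ff0d-f338-e040-11ac0d485162"
export_files = [
    f"SSM.Primary tumour - solid tissue.{donor}.broad-mutect-v3.20160222.somatic.snv_mnv.vcf.gz",
    f"SSM.Primary tumour - solid tissue.{donor}.consensus.20160830.somatic.snv_mnv.vcf.gz",
    f"SSM.Primary tumour - solid tissue.{donor}.MUSE_1-0rc-vcf.20150918.somatic.snv_mnv.vcf.gz",
    f"SSM.Primary tumour - solid tissue.{donor}.svcp_1-0-4.20150212.somatic.snv_mnv.vcf.gz",
]

oxog_files = [name for name in export_files if mentions_oxog(name)]
print(oxog_files)  # [] for the names above: none of them mentions OxoG
```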

Best

Miguel

Ah, I think I misunderstood. I didn’t consider the OxoG output files to be “intermediate” files.

OK, I can now confirm that the OxoG VCF files are in GNOS. There are no plans that I am aware of to move them to the main portal, so you will need GNOS access to get the files.

Hi Miguel,

The latest OxoG filtered result in GNOS is listed here: http://pancancer.info/gnos_metadata/latest/reports/QC_reports/variant_call_entries_with_broad_oxog_filter_applied.txt

For Donor: DO52739, the corresponding GNOS entry is: https://gtrepo-osdc-icgc.annailabs.com/cghub/metadata/analysisFull/1e2c518c-13cd-11e6-bd39-9b5c9a19caef

If you are using other donors, you can find their associated OxoG entries by project code and donor submitter ID.
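
A rough Python sketch of that lookup, under some assumptions: the report is taken to be tab-separated with a header row using the column names quoted earlier in this thread, and the metadata URL is built by following the pattern of the single GNOS link above. `find_oxog_entries` and `metadata_url` are hypothetical helper names, and the sample row is the one quoted earlier in the thread, not data fetched from GNOS.

```python
import csv
import io


def find_oxog_entries(handle, project_code, submitter_donor_id):
    """Return the rows matching a project code and donor submitter ID."""
    reader = csv.DictReader(handle, delimiter="\t")  # assumes a header row
    return [
        row for row in reader
        if row["dcc_project_code"] == project_code
        and row["submitter_donor_id"] == submitter_donor_id
    ]


def metadata_url(entry):
    """Build a metadata URL; the path is inferred from the one link above."""
    repo = entry["gnos_repo"].rstrip("/")
    return repo + "/cghub/metadata/analysisFull/" + entry["gnos_id"]


# Stand-in for the downloaded report, with one row quoted in this thread.
header = "\t".join([
    "donor_unique_id", "submitter_donor_id", "dcc_project_code",
    "tumor_aliquot_ids", "gnos_repo", "gnos_id",
    "gnos_last_modified", "vcf_workflow_result_version",
])
row = "\t".join([
    "BLCA-US::096b4f32-10c1-4737-a0dd-cae04c54ee33",
    "096b4f32-10c1-4737-a0dd-cae04c54ee33",
    "BLCA-US",
    "301d6ce3-4099-4c1d-8e50-c04b7ce91450",
    "https://gtrepo-osdc-tcga.annailabs.com/",
    "7e19048c-f6ea-11e5-8b75-ba53aaf623c4",
    "2016-03-31T19:48:02+00:00",
    "v1",
])
report = io.StringIO(header + "\n" + row + "\n")

matches = find_oxog_entries(
    report, "BLCA-US", "096b4f32-10c1-4737-a0dd-cae04c54ee33"
)
for entry in matches:
    print(metadata_url(entry))
```

To use it on the real report, open the downloaded file in text mode and pass the file object in place of `report`.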

Hope this helps!

Junjun