GDC Spreadsheet Download Tool

Quick Start:

  1. Download a manifest from https://portal.gdc.cancer.gov/
  2. Run: python gdc-tsv-tool.py <manifest_file>

The GDC Spreadsheet Download Tool downloads clinical and/or biospecimen metadata for a given set of files in tab-delimited format. The file set can be passed to the tool either as a manifest downloaded from the GDC Portal (https://portal.gdc.cancer.gov/) or as a plain text list of file UUIDs. The tab-delimited output can be opened in Microsoft Excel or any other spreadsheet program.
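
As an illustration of the two input formats, the sketch below pulls file UUIDs out of either a manifest or a plain UUID list. It is not the tool's own parsing code: the function name read_file_uuids is hypothetical, and it assumes the manifest is tab-delimited with a header row containing an "id" column, as in manifests exported from the GDC Portal.

```python
import csv
import io
import re

# Standard 8-4-4-4-12 hexadecimal UUID layout.
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)


def read_file_uuids(text):
    """Return file UUIDs from manifest text or from a plain list of UUIDs.

    Hypothetical helper: assumes a manifest has a tab-delimited header
    row with an "id" column; anything else is treated as one UUID per line.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        return []

    header = lines[0].split("\t")
    if "id" in header:
        # Manifest: read the "id" column of every data row.
        reader = csv.DictReader(io.StringIO("\n".join(lines)), delimiter="\t")
        return [row["id"] for row in reader if row["id"]]

    # Plain list: keep only the lines that look like UUIDs.
    return [line for line in lines if UUID_RE.match(line)]
```

To use it on a file, read the file's contents first, for example read_file_uuids(open(path).read()), where path is whichever input file was passed on the command line.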

The GDC Spreadsheet Download Tool produces TSVs in which each row represents one file and each column represents a clinical or biospecimen field. Because of the structure of the GDC Data Model, a file can be associated with more than one value for the same field (e.g. a VCF associated with both a tumor sample and a normal sample), in which case that field occupies more than one column. To handle this, the tool divides the output into smaller TSVs, each with a uniform number of columns.
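
The splitting step can be pictured with a minimal sketch that groups rows by how many columns they have and serializes each group separately. This is one reading of "equal column number", not the tool's actual implementation, and the function names split_by_column_count and to_tsv are hypothetical.

```python
import csv
import io
from collections import defaultdict


def split_by_column_count(rows):
    """Group rows (lists of cell values) by their number of columns."""
    groups = defaultdict(list)
    for row in rows:
        groups[len(row)].append(row)
    return dict(groups)


def to_tsv(rows):
    """Serialize rows as tab-delimited text, one row per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up rows: the second file is linked to two samples, so it is wider.
example_rows = [
    ["file_1", "tumor"],
    ["file_2", "tumor", "normal"],
    ["file_3", "normal"],
]
example_groups = split_by_column_count(example_rows)
```

Each entry in example_groups could then be written to its own file with to_tsv, so that every output TSV is rectangular and opens cleanly in a spreadsheet program.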

Usage: python gdc-tsv-tool.py [options] <manifest_file>

Options:

Notes: