Data Sources V180819

 

                              Tissue Type
Database     Normal  Adjacent  Primary  Recurrent  Metastatic  Sample Size
St. Jude’s        0         0       66          0           0           66
GTEx           7412         0        0          0           0         7412
MET500            0         0        0          0         387          387
TARGET            0        11      602        120           1          734
TCGA              0       726     9366         44         393        10529
TOTAL          7412       737    10034        164         781        19128
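
The row and column totals in the table can be cross-checked with a few lines of Python. This is only a minimal sketch: the counts are copied from the table, while the names COUNTS, row_totals, and column_totals are illustrative and not part of any pipeline described here.

```python
# Per-database sample counts by tissue type, copied from the table above.
TISSUE_TYPES = ["Normal", "Adjacent", "Primary", "Recurrent", "Metastatic"]

COUNTS = {
    "St. Jude's": [0, 0, 66, 0, 0],
    "GTEx": [7412, 0, 0, 0, 0],
    "MET500": [0, 0, 0, 0, 387],
    "TARGET": [0, 11, 602, 120, 1],
    "TCGA": [0, 726, 9366, 44, 393],
}


def row_totals(counts):
    """Sample size per database (sum across tissue types)."""
    return {db: sum(row) for db, row in counts.items()}


def column_totals(counts):
    """Sample count per tissue type (sum across databases)."""
    return [sum(col) for col in zip(*counts.values())]


sample_sizes = row_totals(COUNTS)
tissue_totals = dict(zip(TISSUE_TYPES, column_totals(COUNTS)))
grand_total = sum(sample_sizes.values())

print(sample_sizes)
print(tissue_totals)
print(grand_total)
```

Both the per-database sums and the per-tissue sums should add up to the same grand total, which is a quick consistency check whenever the table is updated.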

 

 

GTEx (Genotype-Tissue Expression) n = 7412 (2741 F, 4671 M)

TARGET (Therapeutically Applicable Research to Generate Effective Treatments) n = 734 (359 F, 375 M)

TCGA (The Cancer Genome Atlas) n = 10529 (5282 F, 5247 M)

  • For info, visit https://cancergenome.nih.gov/
  • For stats, visit https://portal.gdc.cancer.gov/exploration
  • TCGA is a collaboration between the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI), involving 20 institutions in the US and Canada, to collect cancerous tissues and adjacent tissues.
  • Sample IDs begin with “TCGA”.
  • Sample collection ended in 2013 after obtaining 20,000 tissue samples across 33 cancer types from over 11,000 patients.
  • We retained only samples with complete phenotype metadata, e.g. tissue origin or cancer type.
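
The metadata filter described in the last bullet can be sketched as follows. This is an illustration under stated assumptions, not the code actually used: the field names "tissue_origin" and "cancer_type", the record layout, and the sample IDs are all hypothetical placeholders.

```python
# Hypothetical names for the phenotype fields that must be present.
REQUIRED_FIELDS = ("tissue_origin", "cancer_type")


def has_complete_metadata(record, required=REQUIRED_FIELDS):
    """Return True if every required field is present and non-empty."""
    for field in required:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            return False
    return True


def filter_complete(records, required=REQUIRED_FIELDS):
    """Keep only the records whose required metadata is complete."""
    return [r for r in records if has_complete_metadata(r, required)]


# Made-up records for illustration only (not real sample IDs).
example_records = [
    {"sample_id": "TCGA-example-1", "tissue_origin": "Breast",
     "cancer_type": "Breast invasive carcinoma"},
    {"sample_id": "TCGA-example-2", "tissue_origin": "",
     "cancer_type": "Prostate adenocarcinoma"},
    {"sample_id": "TCGA-example-3", "tissue_origin": "Lung"},
]

kept = filter_complete(example_records)
print([r["sample_id"] for r in kept])
```

Treating empty strings the same as missing values is a design choice in this sketch; the actual criterion for "complete" may have been different.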

MET500 n = 387 (187 F, 200 M)

  • For info, visit https://www.nature.com/articles/nature23306.pdf
  • Dan R. Robinson et al. Integrative Clinical Genomics of Metastatic Cancer. Nature. 2017. 548:297-303.
  • Sample IDs begin with “SRR”.
  • Metastatic cancer samples from 500 adult patients
  • The data contain 101 unique cancers. The two most common metastatic cancers were prostate adenocarcinoma and breast invasive carcinoma.
  • Counts were computed using UC Santa Cruz TOIL.
  • The data included both polyA and hybrid RNA-seq runs. To ensure a consistent comparison, we retained only the polyA counts.
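
Restricting the MET500 runs to polyA libraries could look roughly like the sketch below. It assumes each run carries a "library_type" annotation with values such as "polyA" or "hybrid"; that field name, the value spellings, and the run IDs are hypothetical and not taken from the MET500 release.

```python
def keep_polya(runs, key="library_type"):
    """Keep only runs whose library type is polyA (case-insensitive)."""
    return [
        run for run in runs
        if str(run.get(key, "")).strip().lower() == "polya"
    ]


# Made-up run annotations for illustration only.
example_runs = [
    {"run_id": "SRR-example-1", "library_type": "polyA"},
    {"run_id": "SRR-example-2", "library_type": "hybrid"},
    {"run_id": "SRR-example-3", "library_type": "PolyA"},
    {"run_id": "SRR-example-4"},  # no annotation: dropped
]

polya_runs = keep_polya(example_runs)
print([run["run_id"] for run in polya_runs])
```

Runs without a library annotation are dropped here rather than guessed at, which keeps the retained set strictly polyA.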

St. Jude’s Hospital n = 66 (29 F, 37 M)

  • These data are High Grade Glioma samples: 22 samples of Diffuse Intrinsic Pontine Glioma and 44 samples of Non-Brainstem High Grade Glioma.
  • Sample IDs begin with “SJHGG”.
  • Counts were computed using UC Santa Cruz TOIL.
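
Because the sample ID prefixes listed above differ by source, a sample's database can be inferred from its ID. The sketch below covers only the three prefixes stated in this document ("TCGA", "SRR", "SJHGG"); GTEx and TARGET prefixes are not given here, so such IDs return None. The function name and the example IDs are illustrative.

```python
# Sample ID prefix -> source database, as stated in the sections above.
ID_PREFIXES = {
    "TCGA": "TCGA",
    "SRR": "MET500",
    "SJHGG": "St. Jude's",
}


def source_from_sample_id(sample_id):
    """Return the source database for a sample ID, or None if unknown."""
    for prefix, database in ID_PREFIXES.items():
        if sample_id.startswith(prefix):
            return database
    return None


# Made-up IDs for illustration only.
for example_id in ["TCGA-example", "SRR0000000", "SJHGG000", "other-id"]:
    print(example_id, "->", source_from_sample_id(example_id))
```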