TCGA Kidney Cancers
About
The TCGA Kidney Cancers Dataset is a bulk RNA-seq dataset containing transcriptome profiles of patients diagnosed with one of three subtypes of kidney cancer. It can be used to predict the kidney cancer subtype from a normalized transcriptome profile, and it offers hands-on experience with large, sparse genomic data.
Preprocessing description:
Fragments Per Kilobase of transcript per Million mapped fragments (FPKM) normalization.
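As a rough illustration of what this normalization computes, the sketch below applies the standard FPKM formula to a single sample. The function name `fpkm`, its arguments, and the toy numbers are hypothetical. Using the sum of the sample's counts as the total number of mapped fragments is a simplifying assumption, and the pipeline that produced this dataset may define that denominator differently.

```python
def fpkm(fragment_counts, gene_lengths_bp):
    """Return FPKM values for one sample.

    FPKM_g = count_g * 1e9 / (length_g_in_bp * total_mapped_fragments)

    Assumption: total_mapped_fragments is taken to be the sum of the
    per-gene counts passed in, which is a simplification.
    """
    if len(fragment_counts) != len(gene_lengths_bp):
        raise ValueError("counts and lengths must have the same length")
    total = sum(fragment_counts)
    if total == 0:
        # No mapped fragments: every gene gets an expression value of zero.
        return [0.0] * len(fragment_counts)
    return [
        count * 1e9 / (length * total)
        for count, length in zip(fragment_counts, gene_lengths_bp)
    ]


# Toy example (invented numbers, not taken from the dataset):
# three genes with raw fragment counts and lengths in base pairs.
values = fpkm([10, 20, 70], [1000, 2000, 1000])
print(values)
```

The first two genes end up with the same FPKM even though the second has twice the raw count, because it is also twice as long. That length correction is the point of the method.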
Does this dataset contain sensitive information?:
This dataset contains the variables age, race, and ethnicity.
Variables Info:
Bulk RNA-seq expression values normalized using the FPKM (Fragments Per Kilobase of transcript per Million mapped fragments) method.
Class labels:
- TCGA-KICH
- TCGA-KIRC
- TCGA-KIRP
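To make the subtype prediction task concrete, here is a minimal sketch of a nearest-centroid classifier over log-transformed expression profiles. This is only one simple baseline, chosen for illustration, and not a method prescribed by the dataset. The helper names (`log_transform`, `fit_centroids`, `predict`) are hypothetical, and the three-gene profiles are invented toy values, not real samples. The actual data has 60,660 features per instance.

```python
import math


def log_transform(profile):
    """Apply log2(x + 1) to compress the wide dynamic range of FPKM values."""
    return [math.log2(x + 1.0) for x in profile]


def fit_centroids(profiles, labels):
    """Compute the mean log-transformed profile for each class label."""
    sums, counts = {}, {}
    for profile, label in zip(profiles, labels):
        vec = log_transform(profile)
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in sums[label]]
        for label in sums
    }


def predict(centroids, profile):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    vec = log_transform(profile)

    def squared_distance(label):
        return sum((a - b) ** 2 for a, b in zip(vec, centroids[label]))

    return min(centroids, key=squared_distance)


# Invented toy profiles with three "genes" each, two per class.
train_profiles = [
    [50.0, 0.0, 1.0], [40.0, 0.0, 2.0],
    [0.0, 60.0, 1.0], [1.0, 45.0, 0.0],
    [0.0, 2.0, 70.0], [1.0, 0.0, 55.0],
]
train_labels = [
    "TCGA-KICH", "TCGA-KICH",
    "TCGA-KIRC", "TCGA-KIRC",
    "TCGA-KIRP", "TCGA-KIRP",
]

centroids = fit_centroids(train_profiles, train_labels)
prediction = predict(centroids, [0.0, 52.0, 3.0])
print(prediction)
```

On the real data, a held-out test split and some form of feature selection would likely be needed before a baseline like this is meaningful, since the number of features far exceeds the number of instances.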
Subject Area
Health and Medicine
Instances
1,024
Features
60,660
Data Types
Tabular, Multivariate
Tasks
Classification, Clustering
Feature Types
Continuous
Introductory Paper
The Cancer Genome Atlas Pan-Cancer analysis project
J. Weinstein, E. Collisson, G. Mills, K. Shaw, B. Ozenberger, K. Ellrott, I. Shmulevich, C. Sander, J. Stuart. 2013.
Nature Genetics