UC Irvine
ML Repository

TCGA Kidney Cancers

About

The TCGA Kidney Cancers Dataset is a bulk RNA-seq dataset containing transcriptome profiles of patients diagnosed with one of three subtypes of kidney cancer. It can be used to predict the kidney cancer subtype from a normalized transcriptome profile, and it offers hands-on experience with large, sparse genomic data.

Preprocessing description: Fragments Per Kilobase of transcript per Million mapped fragments (FPKM) normalization.

Does this dataset contain sensitive information?: This dataset contains the variables age, race, and ethnicity.

Variables Info: Bulk RNA-seq expression values normalized using the FPKM method.

Class labels:
- TCGA-KICH (kidney chromophobe)
- TCGA-KIRC (kidney renal clear cell carcinoma)
- TCGA-KIRP (kidney renal papillary cell carcinoma)
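The FPKM normalization named in the preprocessing description can be illustrated with a minimal sketch. It assumes the common definition FPKM = fragments × 10^9 / (gene length in bp × total mapped fragments), and it assumes the total is simply the sum of the per-gene counts for the sample. The function name `fpkm` and the toy counts and lengths are hypothetical; the dataset ships values that are already normalized, so this is not the pipeline that produced them.

```python
def fpkm(fragment_counts, gene_lengths_bp):
    """Return FPKM values for one sample.

    Assumed definition: FPKM_i = count_i * 1e9 / (length_i * total_fragments),
    where total_fragments is taken as the sum of all per-gene counts.
    """
    if len(fragment_counts) != len(gene_lengths_bp):
        raise ValueError("counts and lengths must have the same number of genes")
    total = sum(fragment_counts)
    if total == 0:
        # No mapped fragments: every gene gets an expression value of zero.
        return [0.0 for _ in fragment_counts]
    return [
        count * 1e9 / (length * total)
        for count, length in zip(fragment_counts, gene_lengths_bp)
    ]


# Toy sample with four hypothetical genes (not taken from the dataset).
counts = [100, 300, 0, 600]          # fragments mapped to each gene
lengths = [1000, 2000, 1500, 3000]   # gene lengths in base pairs
values = fpkm(counts, lengths)
```

Dividing by gene length and by sequencing depth makes values comparable across genes of different lengths and across samples sequenced to different depths, which is why unexpressed genes remain exactly zero after normalization.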
Subject Area
Health and Medicine
Instances
1,024
Features
60,660
Data Types
Tabular, Multivariate
Tasks
Classification, Clustering
Feature Types
Continuous
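
The instance and feature counts above imply a wide matrix, which the following sketch puts into numbers. It estimates the memory needed to hold all values densely and shows one simple way to drop features that are zero in every sample. The helper names `dense_size_mib` and `drop_all_zero_columns` are hypothetical, the 8-byte value width assumes 64-bit floats, and the toy matrix is illustrative rather than drawn from the dataset.

```python
N_INSTANCES = 1024   # instances listed on this page
N_FEATURES = 60660   # features listed on this page


def dense_size_mib(n_rows, n_cols, bytes_per_value=8):
    """Memory in MiB for a dense n_rows x n_cols matrix of fixed-width values."""
    return n_rows * n_cols * bytes_per_value / 2**20


def drop_all_zero_columns(rows):
    """Return (filtered_rows, kept_indices), keeping columns with any nonzero value."""
    if not rows:
        return [], []
    kept = [j for j in range(len(rows[0])) if any(row[j] != 0 for row in rows)]
    return [[row[j] for j in kept] for row in rows], kept


full_size = dense_size_mib(N_INSTANCES, N_FEATURES)  # roughly 474 MiB as float64

# Toy 2 x 3 expression matrix: the first gene is zero in both samples.
toy = [[0.0, 1.2, 0.0],
       [0.0, 0.0, 3.4]]
filtered, kept = drop_all_zero_columns(toy)
```

With far more features than instances, removing uninformative columns before classification or clustering is a common first step, though the right filter depends on the analysis.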

Introductory Paper

The Cancer Genome Atlas Pan-Cancer analysis project
J. Weinstein, E. Collisson, G. Mills, K. Shaw, B. Ozenberger, K. Ellrott, I. Shmulevich, C. Sander, Joshua M. Stuart. 2013.
Nature Genetics

Additional Metadata

Authors
J. Weinstein
E. Collisson
G. Mills
K. Shaw
B. Ozenberger
K. Ellrott
I. Shmulevich
C. Sander
Joshua M. Stuart
Year Created
2013
License
CC BY 4.0