TCGAbiolinks provides a few functions to download mutation data from GDC. The following options are available:
- GDCquery, GDCdownload and GDCprepare to download MAF files aligned against hg38
- GDCquery, GDCdownload and GDCprepare to download MAF files aligned against hg19
- getMC3MAF() to download the MC3 MAF from https://gdc.cancer.gov/about-data/publications/mc3-2017

This example will download Aggregate GDC MAFs. For more information, see https://github.com/NCI-GDC/gdc-maf-tool and the GDC docs.
This will download the MC3 MAF file from https://gdc.cancer.gov/about-data/publications/mc3-2017 and add the project each sample belongs to.
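A minimal sketch of that call, assuming TCGAbiolinks is installed and the machine has network access. The function is called without arguments, the variable name `maf_mc3` is only illustrative, and the download may take a while.

```r
library(TCGAbiolinks)

# Download and read the MC3 MAF (requires network access)
maf_mc3 <- getMC3MAF()

# Inspect the size of the table and the first few column names
dim(maf_mc3)
head(colnames(maf_mc3))
```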
To visualize the data you can use the Bioconductor package maftools. For more information, please check its vignette.
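library(TCGAbiolinks)
library(maftools)
library(dplyr)

# Query the open-access masked somatic mutation files for TCGA-CHOL
query <- GDCquery(
    project = "TCGA-CHOL",
    data.category = "Simple Nucleotide Variation",
    access = "open",
    data.type = "Masked Somatic Mutation",
    workflow.type = "Aliquot Ensemble Somatic Variant Merging and Masking"
)
# Download the files matched by the query, then read them into R
GDCdownload(query)
maf <- GDCprepare(query)
# Convert the mutation table into a maftools MAF object
maf <- maf %>% maftools::read.maf()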
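Once a MAF object exists, maftools can summarize and plot it. The sketch below is illustrative only: so that it runs without a GDC download, it assumes the example file `tcga_laml.maf.gz` bundled with maftools rather than the TCGA-CHOL data, and the name `laml` is hypothetical. A MAF object built from GDC data could be passed to the same functions.

```r
library(maftools)

# Path to the example MAF shipped with maftools (assumption: used here
# in place of the GDC data so the example needs no download)
laml_path <- system.file("extdata", "tcga_laml.maf.gz", package = "maftools")

# Read the file into a MAF object
laml <- read.maf(maf = laml_path)

# Per-sample counts of each variant classification
head(getSampleSummary(laml))

# Dashboard-style summary of the mutation data
plotmafSummary(maf = laml, rmOutlier = TRUE, addStat = "median", dashboard = TRUE)

# Oncoplot of the 10 most frequently mutated genes
oncoplot(maf = laml, top = 10)
```

See the maftools vignette for the full set of plotting options.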