TimeScape is a visualization tool for temporal clonal evolution.
To install TimeScape, type the following commands in R:
Run the examples by:
The following visualizations will appear in your browser (optimized for Chrome):
The first visualization is of the acute myeloid leukemia patient from Ding et al., 2012. The second visualization is of the metastatic ovarian cancer patient 7 from McPherson and Roth et al., 2016.
The required parameters for TimeScape are as follows:
\(clonal\_prev\) is a data frame consisting of clonal prevalences for each clone at each time point. Its columns give the time point, the clone ID, and the clonal prevalence.
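As a rough illustration of the shape of this table, the sketch below flattens some made-up prevalences into one row per time point and clone and writes them as tab-separated text that could later be read into an R data frame. The column names timepoint, clone_id and clonal_prev, the sample values, and the helper names are assumptions for illustration only; check the package documentation for the exact names expected.

```python
import csv
import io

# Hypothetical prevalences: {time point: {clone ID: prevalence}}.
prevalences = {
    "T1": {"A": 0.7, "B": 0.3, "C": 0.0},
    "T2": {"A": 0.2, "B": 0.3, "C": 0.5},
}

# Assumed column names for the long-format table.
FIELDS = ["timepoint", "clone_id", "clonal_prev"]


def to_long_rows(prev_by_timepoint):
    """Flatten nested prevalences into one row per (time point, clone)."""
    rows = []
    for timepoint, clones in prev_by_timepoint.items():
        for clone_id, prev in clones.items():
            rows.append(
                {"timepoint": timepoint, "clone_id": clone_id, "clonal_prev": prev}
            )
    return rows


def write_tsv(rows, handle):
    """Write the rows as tab-separated text with a header line."""
    writer = csv.DictWriter(
        handle, fieldnames=FIELDS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)


rows = to_long_rows(prevalences)
buffer = io.StringIO()
write_tsv(rows, buffer)
print(buffer.getvalue())
```

Every clone appears once per time point, including clones with a prevalence of zero, so each time point contributes the same number of rows.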
\(tree\_edges\) is a data frame describing the edges of a rooted clonal phylogeny. Its columns give the source and the target clone of each edge.
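Because the phylogeny must be rooted, it can help to check an edge list before passing it on. The following is a minimal sketch, not part of TimeScape itself: the function name find_root and the example edges are hypothetical, and the check only confirms that the edges form a single tree with exactly one root.

```python
def find_root(edges):
    """Return the root of a tree given as (source, target) pairs.

    Raises ValueError if a clone has two parents, if there is not exactly
    one root, or if some clone cannot be reached from the root.
    """
    parents = {}
    children = {}
    for source, target in edges:
        if target in parents:
            raise ValueError(f"clone {target!r} has more than one parent")
        parents[target] = source
        children.setdefault(source, []).append(target)

    nodes = set(parents) | set(children)
    roots = nodes - set(parents)
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root, found {sorted(roots)}")
    root = next(iter(roots))

    # Walk down from the root; every clone should be visited exactly once.
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        seen.add(node)
        stack.extend(children.get(node, []))
    if seen != nodes:
        raise ValueError("some clones are not reachable from the root")
    return root


# Hypothetical phylogeny: A is the ancestor of B and C, and C of D.
example_edges = [("A", "B"), ("A", "C"), ("C", "D")]
print(find_root(example_edges))
```

The reachability walk catches edge lists that contain a cycle disconnected from the root, which the parent and root counts alone would miss.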
\(mutations\) is a data frame consisting of the mutations originating in each clone. The columns in this data frame are:
If this parameter is provided, a mutation table will appear at the bottom of the view.
Clone colours may be changed using the \(clone\_colours\) parameter. For instance, compare the default colours:
with specified custom colours:
The alpha value of each colour may be adjusted using the \(alpha\) parameter (a numeric value in the range [0, 100]). Compare an alpha of 10:
with an alpha of 90:
The x-axis, y-axis and phylogeny titles may be changed using the \(xaxis\_title\), \(yaxis\_title\) and \(phylogeny\_title\) parameters, each of which takes a character string.
Here are some custom titles:
The position of each genotype with respect to its ancestor can be altered. The “stack” layout is the default. It stacks genotypes one on top of another to clearly display genotype prevalences at each time point. The “space” layout uses the same stacking method while maintaining (where possible) a minimum amount of space between genotypes. The “centre” layout centres genotypes with respect to their ancestors. Here is an example of each:
Perturbation events may be added to the TimeScape view using the \(perturbations\) parameter. Adding a perturbation simply adds a label along the x-axis where the perturbation occurs. The \(perturbations\) parameter is a data frame consisting of the following columns:
E-scape takes as input a clonal phylogeny and clonal prevalences per clone per sample. At the time of submission many methods have been proposed for obtaining these values, and accurate estimation of these quantities is the focus of ongoing research. We describe a method for estimating clonal phylogenies and clonal prevalences using PyClone (Roth et al., 2014; source code available at https://bitbucket.org/aroth85/pyclone/wiki/Home) and citup (Malikic et al., 2015; source code available at https://github.com/sfu-compbio/citup).
In brief, PyClone inputs are prepared by processing fastq files resulting from a targeted deep sequencing experiment. Using samtools mpileup (http://samtools.sourceforge.net/mpileup.shtml), the numbers of nucleotides matching and not matching the reference are counted for each targeted SNV. Copy number is also required for each SNV. We recommend inferring copy number from whole genome or whole exome sequencing of samples taken from the same anatomic location / timepoint as the samples to which targeted deep sequencing was applied. Copy number can be inferred using Titan (Ha et al., 2014; source code available at https://github.com/gavinha/TitanCNA). Sample-specific SNV information is compiled into a set of TSV files, one per sample. The tables include mutation ID, reference and variant read counts, normal copy number, and major and minor tumour copy number (see the PyClone readme). PyClone is run on these files using the PyClone run_analysis_pipeline subcommand, which produces the tables/cluster.tsv file in the working directory.
Citup can be used to infer a clonal phylogeny and clone prevalences from the cellular prevalences produced by PyClone. The tables/cluster.tsv file contains per-sample, per-SNV-cluster estimates of cellular prevalence. The table is reshaped into a TSV file of cellular prevalences with rows as clusters and columns as samples, using the mean value of each cluster from tables/cluster.tsv as the table values. The iterative version of citup is run on this table of cellular prevalences, producing an hdf5 results file. Within the hdf5 results, the /results/optimal entry can be used to identify the ID of the optimal tree solution. The clonal phylogeny, as an adjacency list, is then the /trees/{tree_solution}/adjacency_list entry, and the clone frequencies are the /trees/{tree_solution}/clone_freq entry.
The adjacency list can be written as a TSV with the column names source and target to be input into E-scape. The clone frequencies should be reshaped such that each row represents the clonal frequency of a specific clone in a specific sample, with the columns representing the time or space ID, the clone ID, and the clonal prevalence.
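The two reshaping steps described above can be sketched with the Python standard library. This is an illustration under stated assumptions rather than the authors' scripts: the cluster table is assumed to have columns named sample_id, cluster_id and mean; the citup arrays are assumed to have already been loaded from the hdf5 file into plain nested lists, with clone_freq holding one row per sample and one column per clone; and the output column names timepoint, clone_id and clonal_prev, like all function names and values, are hypothetical.

```python
import csv
import io
from collections import defaultdict


def cluster_means_to_matrix(handle):
    """Reshape long cluster rows into (cluster IDs, sample IDs, matrix).

    Matrix rows are clusters and columns are samples, in first-seen order.
    """
    means = defaultdict(dict)
    samples = []
    for row in csv.DictReader(handle, delimiter="\t"):
        if row["sample_id"] not in samples:
            samples.append(row["sample_id"])
        means[row["cluster_id"]][row["sample_id"]] = float(row["mean"])
    clusters = list(means)
    matrix = [[means[c][s] for s in samples] for c in clusters]
    return clusters, samples, matrix


def edges_to_rows(adjacency_list):
    """Turn (source, target) pairs into rows with source/target columns."""
    return [{"source": str(s), "target": str(t)} for s, t in adjacency_list]


def clone_freq_to_long(clone_freq, sample_ids):
    """One row per (sample, clone); assumes clone_freq[i][j] is the
    frequency of clone j in sample i."""
    rows = []
    for sample_id, freqs in zip(sample_ids, clone_freq):
        for clone_id, freq in enumerate(freqs):
            rows.append({"timepoint": sample_id, "clone_id": str(clone_id),
                         "clonal_prev": float(freq)})
    return rows


def to_tsv(rows, fieldnames):
    """Serialise rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up stand-in for tables/cluster.tsv (two samples, two clusters).
cluster_tsv = ("sample_id\tcluster_id\tmean\n"
               "S1\t0\t0.9\nS2\t0\t0.8\nS1\t1\t0.4\nS2\t1\t0.1\n")
clusters, samples, matrix = cluster_means_to_matrix(io.StringIO(cluster_tsv))

# Made-up stand-ins for the adjacency_list and clone_freq entries.
adjacency_list = [[0, 1], [0, 2]]
clone_freq = [[0.5, 0.5, 0.0], [0.2, 0.3, 0.5]]

edge_rows = edges_to_rows(adjacency_list)
prev_rows = clone_freq_to_long(clone_freq, samples)
print(to_tsv(edge_rows, ["source", "target"]))
print(to_tsv(prev_rows, ["timepoint", "clone_id", "clonal_prev"]))
```

Keeping the results in long format, with one row per sample and clone, matches the row layout described above; whether the citup input table needs a header row or index column is not covered here and should be checked against the citup documentation.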
Interactive components:
To view the documentation for TimeScape, type the following command in R:
or:
TimeScape was developed at the Shah Lab for Computational Cancer Biology at the BC Cancer Research Centre.
References:
Ding, Li, et al. “Clonal evolution in relapsed acute myeloid leukaemia revealed by whole-genome sequencing.” Nature 481.7382 (2012): 506-510.
Ha, Gavin, et al. “TITAN: inference of copy number architectures in clonal cell populations from tumor whole-genome sequence data.” Genome Research 24.11 (2014): 1881-1893.
Malikic, Salem, et al. “Clonality inference in multiple tumor samples using phylogeny.” Bioinformatics 31.9 (2015): 1349-1356.
McPherson, Andrew, et al. “Divergent modes of clonal spread and intraperitoneal mixing in high-grade serous ovarian cancer.” Nature Genetics (2016).
Roth, Andrew, et al. “PyClone: statistical inference of clonal population structure in cancer.” Nature Methods 11.4 (2014): 396-398.