preciseTAD

preciseTAD: A machine learning framework for precise TAD boundary prediction


Bioconductor version: Release (3.20)

preciseTAD provides functions to predict the location of boundaries of topologically associating domains (TADs) and chromatin loops at base-level resolution. As input, it takes BED-formatted genomic coordinates of domain boundaries detected from low-resolution Hi-C data, and coordinates of high-resolution genomic annotations from ENCODE or other consortia. preciseTAD employs several feature engineering strategies and resampling techniques to address class imbalance, and trains an optimized random forest model for predicting low-resolution domain boundaries. Applied at base-level resolution, the model predicts the probability that each base is a boundary. Density-based clustering and scalable partitioning techniques are then used to detect precise boundary regions and summit points. Compared with low-resolution boundaries, preciseTAD boundaries are highly enriched for CTCF, RAD21, SMC3, and ZNF143 signals and are more conserved across cell lines. The pre-trained model can accurately predict boundaries in another cell line using CTCF, RAD21, SMC3, and ZNF143 annotation data for that cell line.
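The last step described above, turning per-base probabilities into boundary regions and summit points, can be illustrated with a small base-R sketch. This is not the package's implementation: the probabilities are made-up toy values, `threshold` and `max_gap` are hypothetical parameters, and a simple gap-based grouping stands in for the density-based clustering and partitioning that preciseTAD actually uses.

```r
# Toy per-base boundary probabilities (made-up values, not preciseTAD output).
pos  <- 1:200
prob <- rep(0.05, 200)
prob[40:48]   <- 0.99
prob[150:155] <- 0.97

threshold <- 0.9   # hypothetical probability cutoff
max_gap   <- 5     # hypothetical maximum gap (in bases) within one cluster

# Keep only bases whose predicted probability meets the cutoff.
hits <- pos[prob >= threshold]

# 1-D stand-in for density-based clustering: start a new cluster
# whenever consecutive retained bases are more than max_gap apart.
cluster_id <- cumsum(c(TRUE, diff(hits) > max_gap))

# Summarise each cluster as a region plus a medoid-like summit,
# i.e. the member with the smallest total distance to all other members.
regions <- do.call(rbind, lapply(split(hits, cluster_id), function(x) {
  total_dist <- rowSums(abs(outer(x, x, "-")))
  data.frame(start = min(x), end = max(x), summit = x[which.min(total_dist)])
}))

print(regions)
```

Each row of `regions` is one candidate boundary region with a single representative base. In the toy data the two high-probability stretches are far enough apart to form separate clusters.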

Author: Spiro Stilianoudakis [aut], Mikhail Dozmorov [aut, cre]

Maintainer: Mikhail Dozmorov <mikhail.dozmorov at gmail.com>

Citation: from within R, enter citation("preciseTAD").

Installation

To install this package, start R (version "4.4") and enter:


if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("preciseTAD")

For older versions of R, please refer to the appropriate Bioconductor release.
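Before or after installing, it can be useful to check the running R version and whether the package is available. The following is a minimal sketch using only base R; the variable names are illustrative, and the 4.1 minimum is taken from the package's Depends field, R (>= 4.1).

```r
# Minimum R version, taken from the package's "Depends" field.
required_r <- "4.1"

# getRversion() returns the running R version as a comparable object.
r_ok <- getRversion() >= required_r

# requireNamespace() reports whether the package can be loaded,
# without attaching it to the search path.
installed <- requireNamespace("preciseTAD", quietly = TRUE)

message("R >= ", required_r, ": ", r_ok)
message("preciseTAD installed: ", installed)
```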

Documentation

To view documentation for the version of this package installed in your system, start R and enter:

browseVignettes("preciseTAD")
preciseTAD: HTML, R Script
Reference Manual: PDF
NEWS: Text
LICENSE: Text

Details

biocViews: Classification, Clustering, FeatureExtraction, FunctionalGenomics, HiC, Sequencing, Software
Version: 1.16.0
In Bioconductor since: BioC 3.12 (R-4.0) (4 years)
License: MIT + file LICENSE
Depends: R (>= 4.1)
Imports: S4Vectors, IRanges, GenomicRanges, randomForest, ModelMetrics, e1071, PRROC, pROC, caret, utils, cluster, dbscan, doSNOW, foreach, pbapply, stats, parallel, gtools, rCGH
System Requirements:
URL: https://github.com/dozmorovlab/preciseTAD
Bug Reports: https://github.com/dozmorovlab/preciseTAD/issues
Suggests: knitr, rmarkdown, testthat, BiocCheck, BiocManager, BiocStyle
Linking To:
Enhances:
Depends On Me:
Imports Me:
Suggests Me: preciseTADhub
Links To Me:
Build Report

Package Archives

Follow Installation instructions to use this package in your R session.

Source Package: preciseTAD_1.16.0.tar.gz
Windows Binary (x86_64): preciseTAD_1.16.0.zip
macOS Binary (x86_64): preciseTAD_1.16.0.tgz
macOS Binary (arm64): preciseTAD_1.16.0.tgz
Source Repository: git clone https://git.bioconductor.org/packages/preciseTAD
Source Repository (Developer Access): git clone git@git.bioconductor.org:packages/preciseTAD
Bioc Package Browser: https://code.bioconductor.org/browse/preciseTAD/
Package Short Url: https://bioconductor.org/packages/preciseTAD/
Package Downloads Report: Download Stats