TOP results by CONfident efFECT Size. Topconfects is an R package intended for RNA-seq or microarray differential expression analysis and similar, where we are interested in placing confidence bounds on many effect sizes (one per gene) from few samples, and ranking genes by these confident effect sizes.
Topconfects builds on TREAT p-values offered by the limma and edgeR packages, or the “greaterAbs” test p-values offered by DESeq2. It tries a range of fold changes, and uses this to rank genes by effect size while maintaining a given FDR. This also produces confidence bounds on the fold changes, with adjustment for multiple testing.
A principled way to avoid using p-values as a proxy for effect size. The difference between a p-value of 1e-6 and 1e-9 has no practical meaning in terms of significance; however, tiny p-values are often used as a proxy for effect size. This is a misuse, as they might simply reflect greater quality of evidence (for example, a higher RNA-seq average read count or microarray average spot intensity). It is better to reject a broader set of hypotheses, while maintaining a sensible significance level.
No need to guess the best fold change cutoff. TREAT requires a fold change cutoff to be specified. Topconfects instead asks you to specify a False Discovery Rate appropriate to your purpose. You can then read down the resulting ranked list of genes as far as you wish. The “confect” value given in the last row that you use is the fold change cutoff required for TREAT to produce that set of genes at the given FDR.
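To make the "read down the list" idea concrete, here is a small Python sketch. The gene names and confect values are made up for illustration, and it assumes confect values are signed log fold changes ranked by magnitude; the helper names cutoff_for_top and genes_passing are hypothetical and not part of the package.

```python
# Made-up ranked list for illustration only: (gene, confect), sorted by
# descending |confect|. Names and values are hypothetical.
ranked = [
    ("geneA", 2.4),
    ("geneB", -1.9),
    ("geneC", 1.2),
    ("geneD", 0.6),
    ("geneE", -0.3),
]


def cutoff_for_top(ranked, k):
    """Fold change cutoff implied by stopping at row k (1-based)."""
    return abs(ranked[k - 1][1])


def genes_passing(ranked, cutoff):
    """Genes whose confident effect size is at least `cutoff` in magnitude."""
    return [gene for gene, confect in ranked if abs(confect) >= cutoff]


# Stop reading after the third row: its confect is the implied cutoff,
# and applying that cutoff recovers the same three genes (no ties here).
top3_cutoff = cutoff_for_top(ranked, 3)
top3_genes = genes_passing(ranked, top3_cutoff)
print(top3_cutoff, top3_genes)
```

If several rows share the same confect value, the cutoff selects all of them together, so the list is best cut between distinct values.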
The method is described in:
Use limma_confects, edger_confects, or deseq2_confects as an alternative final step in your limma, edgeR, or DESeq2 analysis. The limma method is currently much faster than the other methods.
For examples, see the vignette “Confident fold change”.
If you have a collection of effect sizes of some sort, with associated standard errors, and possibly associated degrees of freedom, use normal_confects. Errors are assumed to be normally distributed, or t-distributed if degrees of freedom are given.
This is a re-implementation of limma’s TREAT method, which is then supplied to nest_confects (described next). (Alternatively, if the effect sizes are all positive, there is an option to use a one-sided t-test as the underlying hypothesis test.)
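As a rough illustration of the kind of test involved, the Python sketch below computes a TREAT-style p-value for the null hypothesis that the absolute effect is no more than a given magnitude. It covers the normal-error case only and uses just the standard library. It is not the package's code: the function name treat_style_p_values is made up for this example, the two-tail sum is the form usually given for TREAT and should be treated as an assumption here, and the t-distributed case is omitted.

```python
from statistics import NormalDist


def treat_style_p_values(effects, std_errors, magnitude):
    """p-values for H0: |true effect| <= magnitude, assuming normal errors.

    Hypothetical helper for illustration; not the topconfects API.
    """
    std_normal = NormalDist()
    p_values = []
    for effect, se in zip(effects, std_errors):
        # Distance of the estimate from the nearer and farther edge of the
        # interval [-magnitude, +magnitude], in standard-error units.
        z_right = (abs(effect) - magnitude) / se
        z_left = (abs(effect) + magnitude) / se
        # Upper-tail probabilities, written as cdf(-z) for numerical stability.
        p = std_normal.cdf(-z_right) + std_normal.cdf(-z_left)
        p_values.append(min(1.0, p))
    return p_values


# With magnitude 0 this reduces to the ordinary two-sided test.
p_at_zero = treat_style_p_values([1.96], [1.0], 0.0)[0]
# Asking for a larger minimum effect makes the same estimate less significant.
p_at_one = treat_style_p_values([1.96], [1.0], 1.0)[0]
print(p_at_zero, p_at_one)
```

The point of the sketch is that the p-value depends on the chosen magnitude: the same estimate can be convincing evidence of a nonzero effect while being weak evidence of an effect larger than 1.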
The core algorithm of topconfects is implemented in the function nest_confects. You may supply any function that can calculate p-values for the null hypothesis that an effect size is no more than a specified amount. Testing is performed for n items, and the function should be able to perform this calculation for a subset of these n items and a given amount.
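The general shape of such a procedure can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the package's implementation: it assumes Benjamini-Hochberg selection with the total item count n, a fixed step size, and unsigned confident effect sizes. The names bh_select, nested_confects_sketch and toy_pfunc, and the toy effects and standard errors, are all hypothetical.

```python
from statistics import NormalDist


def bh_select(p_values, n_total, fdr):
    """Positions in p_values selected by Benjamini-Hochberg with n_total tests."""
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    n_selected = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= fdr * rank / n_total:
            n_selected = rank  # step-up: keep the largest rank that passes
    return order[:n_selected]


def nested_confects_sketch(n, pfunc, fdr=0.05, step=0.01, max_steps=10000):
    """Raise the tested magnitude step by step, keeping only items still
    significant; an item's confect is the last magnitude at which it was kept."""
    confects = [None] * n
    active = list(range(n))
    for k in range(max_steps):
        if not active:
            break
        magnitude = k * step
        p_values = pfunc(active, magnitude)  # p-values for the subset only
        active = [active[j] for j in bh_select(p_values, n, fdr)]
        for i in active:
            confects[i] = magnitude
    ranked = sorted((i for i in range(n) if confects[i] is not None),
                    key=lambda i: -confects[i])
    return confects, ranked


# Toy data (made up): estimated effects and their standard errors.
effects = [3.0, 0.2, -2.0, 0.0]
std_errors = [0.5, 0.5, 0.2, 0.5]


def toy_pfunc(indices, magnitude):
    """p-values for H0: |effect| <= magnitude, normal errors, for a subset."""
    nd = NormalDist()
    return [min(1.0, nd.cdf(-(abs(effects[i]) - magnitude) / std_errors[i])
                + nd.cdf(-(abs(effects[i]) + magnitude) / std_errors[i]))
            for i in indices]


confects, ranked = nested_confects_sketch(len(effects), toy_pfunc)
print(confects, ranked)
```

Because each step only tests items that survived the previous step, the selected sets are nested by construction, which is what allows a single ranked list to serve every cutoff.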
Use confects_plot to plot confident effect sizes of top genes. The estimated effect size (e.g. log fold change) is shown as a dot, and the confidence bound is shown as a line.
Use confects_plot_me2 to gain a global overview. Similar to an MD or MA plot, the x axis is average expression and the y axis is estimated effect size. The confident effect size is shown using colors. Effect sizes confidently “> 0” correspond to traditional differential expression testing, and effect sizes confidently greater than larger values correspond to the TREAT test at that threshold.
There is also an older confects_plot_me, which I no longer recommend as it is hard to explain and can easily mislead. The y axis is effect size. Estimated effect sizes are shown in grey and confident effect sizes in red or blue (i.e. a gene with a non-NA confident effect size is shown with both a grey and a colored dot).
Use rank_rank_plot to compare two rankings.
For examples, see the vignette “Confident fold change”.