Contents

1 Introduction

LipidTrend is an R package designed to identify significant differences in lipidomic feature tendencies between groups. The package includes 3 main functions:

  1. analyzeLipidRegion – Performs statistical analysis to examine lipid feature tendencies.
  2. plotRegion1D – Visualizes results for one-dimensional lipid features.
  3. plotRegion2D – Visualizes results for two-dimensional lipid features.

Additionally, several helper functions are available to facilitate the viewing of result data in the return object. For more details, please refer to helper function section.

2 Installation

To install LipidTrend, ensure that you have R 4.5.0 or later installed (see the R Project at http://www.r-project.org) and are familiar with its usage.

LipidTrend package is available on Bioconductor repository http://www.bioconductor.org. Before installing LipidTrend, you must first install the core Bioconductor packages. If you have already installed them, you can skip the following step.

if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install()

Once the core Bioconductor packages are installed, you can proceed with installing LipidTrend.

if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install("LipidTrend")

After the installation is complete, you’re ready to start using LipidTrend. Now, let’s load the package first.

library(LipidTrend)

3 Data preparation

LipidTrend requires a SummarizedExperiment object as input data. It must contain the following components:

  1. Assay: A matrix containing lipid abundance data, where rows represent lipids and columns represent samples.

  2. RowData: A data frame of lipid features (e.g., double bond count, chain length), where rows are lipids and columns are lipid features (limited to 1 or 2 columns). The order of lipids must match the abundance data.

    • If RowData contains one column, a one-dimensional analysis will be performed.
    • If RowData includes two columns, a two-dimensional analysis will be conducted.
  3. ColData: A data frame containing group information, where rows represent sample names and columns must include sample name, label name, and group, arranged accordingly.

If you are already familiar with constructing a SummarizedExperiment object, you can skip the following section. Otherwise, refer to the example in the rest of this section to learn how to build a SummarizedExperiment object.

3.1 Abundance data

The abundance data is a matrix containing lipid abundance values across lipids and samples, where rows represent lipids and columns represent samples.

# load example abundance data
data("abundance_2D")

# view abundance data
head(abundance_2D, 5)
#>         HSD17B12KO01 HSD17B12KO02 HSD17B12KO03    sgCtrl01   sgCtrl02
#> TG 33:1    0.7591750    0.3109753    0.4456624 0.008158902 0.04569340
#> TG 36:1    3.2885277    2.7623366    3.0669865 0.193283659 0.25024117
#> TG 36:2    1.9974341    1.3529295    1.5173635 0.187196921 0.28231711
#> TG 37:2    1.7893400    0.8872451    1.0449693 0.121982450 0.71770893
#> TG 38:0    0.3413125    0.3504082    0.3784324 0.031840554 0.04336764
#>           sgCtrl03
#> TG 33:1 0.08986397
#> TG 36:1 0.36918016
#> TG 36:2 0.44319480
#> TG 37:2 1.13574059
#> TG 38:0 0.05876135

3.2 Lipid characteristic table

The lipid characteristic table is a data frame containing information about each lipid’s characteristics, such as the number of double bonds and chain length. The order of the lipids in this table must align with the abundance data.

A one-dimensional analysis will be conducted if the table has only one column, and a two-dimensional analysis will be performed if it contains two columns. The table can only have a maximum of two columns.

In this example, we use data suitable for a two-dimensional analysis.

# load example lipid characteristic table (2D)
data("char_table_2D")

# view lipid characteristic table
head(char_table_2D, 5)
#>         Total.C Total.DB
#> TG 33:1      33        1
#> TG 36:1      36        1
#> TG 36:2      36        2
#> TG 37:2      37        2
#> TG 38:0      38        0

3.3 Group information table

The group information table is a data frame containing grouping details corresponding to the samples in the lipid abundance data. It must adhere to the following requirements:

  1. The columns must be arranged in the following order: sample_name, label_name, and group.
  2. All sample names must be unique.
  3. The sample names in the sample_name column must match those in the lipid abundance data.
  4. The columns sample_name, label_name, and group must not contain NA values.

For example:

# load example group information table
data("group_info")

# view group information table
group_info
#>    sample_name   label_name      group
#> 1 HSD17B12KO01 HSD17B12KO01 HSD17B12KO
#> 2 HSD17B12KO02 HSD17B12KO02 HSD17B12KO
#> 3 HSD17B12KO03 HSD17B12KO03 HSD17B12KO
#> 4     sgCtrl01     sgCtrl01     sgCtrl
#> 5     sgCtrl02     sgCtrl02     sgCtrl
#> 6     sgCtrl03     sgCtrl03     sgCtrl

3.4 Constructing SummarizedExperiment object

Once the abundance data, lipid characteristic table, and group information table are prepared, we can construct the input SummarizedExperiment object. We will use the SummarizedExperiment function from SummarizedExperiment.

Follow the command below to create this object.

se_2D <- SummarizedExperiment::SummarizedExperiment(
    assays=list(abundance=abundance_2D),
    rowData=S4Vectors::DataFrame(char_table_2D),
    colData=S4Vectors::DataFrame(group_info))

4 Initiate LipidTrend workflow

The workflow of LipidTrend is primarily divided into one-dimensional and two-dimensional lipid feature analyses. The process begins with a SummarizedExperiment object as input data and, after computation and plotting, respectively, returns a statistical results data frame and plot.

We recommend using the set.seed() function before starting to ensure stability in the permutation process during computation.

set.seed(1234)

4.1 One-dimensional analysis

The one-dimensional analysis is applicable when the input data contains only a single lipid characteristic. Let’s briefly view the structure of the example input data.

# load example data
data("lipid_se_CL")

# quick look of SE structure
show(lipid_se_CL)
#> class: SummarizedExperiment 
#> dim: 29 6 
#> metadata(0):
#> assays(1): abundance
#> rownames(29): 33 36 ... 62 64
#> rowData names(1): chain
#> colnames: NULL
#> colData names(3): sample_name label_name group

4.1.1 Analyze lipid region

This section calculate p-value of lipidomic features by permutation test. The test of region statistics smoothing with Gaussian kernel integrates neighbor’s information and provides a stable testing result under small sample size.

You can obtain separate results for even-chain and odd-chain lipids by adequately configuring the variables split_chain and chain_col.

If split_chain is set to TRUE, the results will be divided into even-chain and odd-chain lipids. By default, split_chain is set to FALSE; however, we recommend setting it to TRUE for more meaningful biological insights.

Note:
- When split_chain=TRUE, you must specify the column name for chain length in the chain_col variable.
- If split_chain=FALSE, then chain_col should be NULL.

# run analyzeLipidRegion
res1D <- analyzeLipidRegion(
    lipid_se_CL, ref_group="sgCtrl", split_chain=TRUE,
    chain_col="chain", radius=3, own_contri=0.5, permute_time=1000)

# view result summary 
show(res1D)
#> class: LipidTrendSE 
#> dim: 29 6 
#> metadata(0):
#> assays(1): abundance
#> rownames(29): 33 36 ... 62 64
#> rowData names(1): chain
#> colnames: NULL
#> colData names(3): sample_name label_name group
#> 
#> LipidTrend Results:
#> ------------------------
#> Split chain analysis: Yes 
#> Even chain result: 15 features
#> Odd chain result: 14 features

The analyzeLipidRegion function produces an extended SummarizedExperiment object called LipidTrendSE. To facilitate result extraction, we offer several helper functions that make viewing the resulting data frame easier. For more information on these helper functions, please refer to helper function section.

# view even chain result (first 5 lines)
head(even_chain_result(res1D), 5)
#>    chain avg.abund.ctrl avg.abund.case direction smoothing.pval.BH
#> 36    36      0.5751379       4.661859         +             0.001
#> 38    38      0.7665390       3.708691         +             0.001
#> 40    40      0.9500241       6.091883         +             0.001
#> 42    42      4.4355047      20.377929         +             0.001
#> 44    44     19.1928487      61.133996         +             0.001
#>    marginal.pval.BH  log2.FC
#> 36     4.088307e-04 3.018926
#> 38     3.011445e-04 2.274479
#> 40     1.205712e-05 2.680852
#> 42     2.397807e-05 2.199837
#> 44     6.578614e-05 1.671406

# view odd chain result (first 5 lines)
head(odd_chain_result(res1D), 5)
#>    chain avg.abund.ctrl avg.abund.case direction smoothing.pval.BH
#> 33    33     0.04790542      0.5052709         +       0.001076923
#> 37    37     0.65847733      1.2405181         +       0.001076923
#> 39    39     0.11565136      1.2085836         +       0.001076923
#> 41    41     0.41424681      4.3259329         +       0.001076923
#> 43    43     1.80730482     14.2470104         +       0.001076923
#>    marginal.pval.BH   log2.FC
#> 33     3.207342e-02 3.3987963
#> 37     2.238917e-01 0.9137372
#> 39     2.236234e-05 3.3854631
#> 41     2.225308e-05 3.3844488
#> 43     2.237207e-05 2.9787475

4.1.2 Result visualization

This section utilizes the LipidTrendSE output from the analyzeLipidRegion function to visualize the results.

Note: - If split_chain is set to TRUE, two plots will be generated: one for even-chain lipids and another for odd-chain lipids. - If split_chain is set to FALSE, only a single plot will be returned.

# plot result
plots <- plotRegion1D(res1D, p_cutoff=0.05, y_scale='identity')

# even chain result
plots$even_result


# odd chain result
plots$odd_result

The visualization illustrates lipid trends and includes the following components:

  1. Blue/Red Ribbons – Highlight significant regions of difference between groups.
  2. Points – Display average abundance of each lipid across all samples.

Color Interpretation:

  • Red areas indicate higher abundance in case samples.
  • Blue areas indicate higher abundance in control samples.

4.2 Two-dimensional analysis

The main difference between one-dimensional and two-dimensional analysis lies in the provided lipid characteristics table. One-dimensional analysis focuses on a single lipid feature, whereas two-dimensional analysis examines two lipid features simultaneously. The two-dimensional approach is applicable when the input data includes two lipid characteristics.

Now, let’s take a quick look at the structure of the example input data.

# load example data
data("lipid_se_2D")

# quick look of SE structure
show(lipid_se_2D)
#> class: SummarizedExperiment 
#> dim: 137 6 
#> metadata(0):
#> assays(1): abundance
#> rownames(137): TG 33:1 TG 36:1 ... TG 64:5 TG 64:8
#> rowData names(2): Total.C Total.DB
#> colnames: NULL
#> colData names(3): sample_name label_name group

4.2.1 Analyze lipid region

This section calculate p-value of lipidomic features by permutation test. The test of region statistics smoothing with Gaussian kernel integrates neighbor’s information and provides a stable testing result under small sample size.

You can obtain separate results for even-chain and odd-chain lipids by adequately configuring the variables split_chain and chain_col.

If split_chain is set to TRUE, the results will be divided into even-chain and odd-chain lipids. By default, split_chain is set to FALSE; however, we recommend setting it to TRUE for more meaningful biological insights.

Note:
- When split_chain=TRUE, you must specify the column name for chain length in the chain_col variable.
- If split_chain=FALSE, then chain_col should be NULL.

# run analyzeLipidRegion
res2D <- analyzeLipidRegion(
    lipid_se_2D, ref_group="sgCtrl", split_chain=TRUE,
    chain_col="Total.C", radius=3, own_contri=0.5, permute_time=1000,
    abund_weight=TRUE)

# view result summary 
show(res2D)
#> class: LipidTrendSE 
#> dim: 137 6 
#> metadata(0):
#> assays(1): abundance
#> rownames(137): TG 33:1 TG 36:1 ... TG 64:5 TG 64:8
#> rowData names(2): Total.C Total.DB
#> colnames: NULL
#> colData names(3): sample_name label_name group
#> 
#> LipidTrend Results:
#> ------------------------
#> Split chain analysis: Yes 
#> Even chain result: 81 features
#> Odd chain result: 56 features

The analyzeLipidRegion function produces an extended SummarizedExperiment object called LipidTrendSE. To facilitate result extraction, we offer several helper functions that make viewing the resulting data frame easier. For more information on these helper functions, please refer to helper function section.

# view even chain result (first 5 lines)
head(even_chain_result(res2D), 5)
#>         Total.C Total.DB avg.abund direction smoothing.pval.BH marginal.pval.BH
#> TG 36:1      36        1 1.6550926         +       0.001038462     1.816118e-04
#> TG 36:2      36        2 0.9634060         +       0.001038462     3.624586e-03
#> TG 38:0      38        0 0.2006871         +       0.001038462     7.922194e-05
#> TG 38:1      38        1 0.8449176         +       0.001038462     1.444844e-04
#> TG 38:2      38        2 0.9041542         +       0.001038462     1.696836e-03
#>          log2.FC
#> TG 36:1 3.487890
#> TG 36:2 2.415022
#> TG 38:0 2.997840
#> TG 38:1 2.970089
#> TG 38:2 1.708606

# view odd chain result (first 5 lines)
head(odd_chain_result(res2D), 5)
#>         Total.C Total.DB avg.abund direction smoothing.pval.BH marginal.pval.BH
#> TG 33:1      33        1 0.2765882         +       0.001056604     2.904763e-02
#> TG 37:2      37        2 0.9494977         +       0.105777778     2.279624e-01
#> TG 39:0      39        0 0.1750303         +       0.001056604     1.278400e-04
#> TG 39:1      39        1 0.4870872         +       0.001056604     6.936546e-05
#> TG 41:0      41        0 0.5079328         +       0.001056604     9.512632e-05
#>           log2.FC
#> TG 33:1 3.3987963
#> TG 37:2 0.9137372
#> TG 39:0 3.8873738
#> TG 39:1 3.2356822
#> TG 41:0 2.8007824

4.2.2 Result visualization

This section utilizes the LipidTrendSE output from the analyzeLipidRegion function to visualize the results.

Note: - If split_chain is set to TRUE, two plots will be generated: one for even-chain lipids and another for odd-chain lipids. - If split_chain is set to FALSE, only a single plot will be returned.

# plot result
plot2D <- plotRegion2D(res2D, p_cutoff=0.05)

# even chain result
plot2D$even_result


# odd chain result
plot2D$odd_result

This plot visualizes two-dimensional lipid features, highlighting trends and significant differences between case and control groups.

The X-axis represents the total carbon chain length, while the Y-axis shows the total double bond count. The color scale indicates the log2 fold-change (FC) between the case and control groups: red regions indicate higher abundance in case samples, whereas blue regions indicate higher abundance in control samples.

Circle size represents the average abundance level, with larger circles indicating higher abundance. Red and blue outlines highlight regions with significant differences between the case and control groups. Asterisks (*, **, ***) signify levels of statistical significance, with more asterisks representing greater significance.

Through this visualization, you can identify lipidomic patterns and differences across various lipid characteristics.

5 Helper Functions – enhancing result viewing

LipidTrend provides 4 helper functions to enhance the viewing of the LipidTrendSE object returned by analyzeLipidRegion:

  1. result – Returns the result data frame.
  2. even_chain_result – Returns the even-chain result data frame.
  3. odd_chain_result – Returns the odd-chain result data frame.
  4. show – Displays a summary of the LipidTrendSE object.

Notes:

6 Session info

#> R version 4.5.0 (2025-04-11)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.2 LTS
#> 
#> Matrix products: default
#> BLAS:   /home/biocbuild/bbs-3.22-bioc/R/lib/libRblas.so 
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_GB              LC_COLLATE=C              
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: America/New_York
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] LipidTrend_0.99.1 BiocStyle_2.37.0 
#> 
#> loaded via a namespace (and not attached):
#>  [1] sass_0.4.10                 generics_0.1.4             
#>  [3] SparseArray_1.9.0           lattice_0.22-7             
#>  [5] digest_0.6.37               magrittr_2.0.3             
#>  [7] RColorBrewer_1.1-3          evaluate_1.0.3             
#>  [9] grid_4.5.0                  bookdown_0.43              
#> [11] fastmap_1.2.0               jsonlite_2.0.0             
#> [13] Matrix_1.7-3                ggnewscale_0.5.1           
#> [15] GenomeInfoDb_1.45.3         tinytex_0.57               
#> [17] BiocManager_1.30.25         httr_1.4.7                 
#> [19] scales_1.4.0                UCSC.utils_1.5.0           
#> [21] jquerylib_0.1.4             abind_1.4-8                
#> [23] cli_3.6.5                   rlang_1.1.6                
#> [25] crayon_1.5.3                XVector_0.49.0             
#> [27] Biobase_2.69.0              withr_3.0.2                
#> [29] cachem_1.1.0                DelayedArray_0.35.1        
#> [31] yaml_2.3.10                 S4Arrays_1.9.0             
#> [33] tools_4.5.0                 dplyr_1.1.4                
#> [35] ggplot2_3.5.2               SummarizedExperiment_1.39.0
#> [37] BiocGenerics_0.55.0         vctrs_0.6.5                
#> [39] R6_2.6.1                    magick_2.8.6               
#> [41] matrixStats_1.5.0           stats4_4.5.0               
#> [43] lifecycle_1.0.4             S4Vectors_0.47.0           
#> [45] IRanges_2.43.0              pkgconfig_2.0.3            
#> [47] bslib_0.9.0                 pillar_1.10.2              
#> [49] gtable_0.3.6                Rcpp_1.0.14                
#> [51] glue_1.8.0                  xfun_0.52                  
#> [53] tibble_3.2.1                GenomicRanges_1.61.0       
#> [55] tidyselect_1.2.1            dichromat_2.0-0.1          
#> [57] MatrixGenerics_1.21.0       knitr_1.50                 
#> [59] farver_2.1.2                htmltools_0.5.8.1          
#> [61] labeling_0.4.3              rmarkdown_2.29             
#> [63] compiler_4.5.0