Kuijjer Lab

New publication

October 09, 2023

We recently published new work, led by Ping-Han, in Bioinformatics. Ping-Han developed a new normalization method, called SNAIL, which he specifically designed for the pre-processing of large-scale heterogeneous RNA-Seq data prior to network modeling.

    Most normalization methods are designed for downstream analysis of expression data, not co-expression data. We found that normalization can introduce false-positive associations between genes. Filtering can mitigate this issue, but it also discards information, which becomes problematic for heterogeneous datasets. This led us to develop SNAIL.
    SNAIL is specifically designed to normalize RNA-Seq data prior to estimating associations or inferring networks. The method removes technical variability while maintaining global differences in expression between samples with different biological attributes. It effectively removes false-positive associations without the need for filtering. Analyzing data from GTEx and ENCODE, we demonstrate that SNAIL aids in, for example, the detection of sample-specific network edges and hub genes, and in gene function prediction.
    Heterogeneous datasets with increasing numbers of samples and conditions are continuously being generated. We hope that SNAIL will contribute to more precise analyses of such data with association or network-based approaches.
(a) Schematic overview of SNAIL and the analyses performed. (b) Details on the SNAIL algorithm. In short, SNAIL is based on smooth quantile normalization but uses the trimmed mean to derive the quantile distribution for all samples as well as for every biological group of samples. In addition, SNAIL uses the median of the quantiles to normalize the expression for genes with the same read count in one sample.
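
To make the two ingredients named in the caption more concrete, here is a minimal Python sketch. It is not the SNAIL implementation: it applies plain quantile normalization to a genes-by-samples matrix, builds the reference quantile distribution with a trimmed mean across samples, and gives genes that share a read count within a sample the median of their reference quantiles. The per-group distributions and the smoothing step of smooth quantile normalization are left out, and the function names and the `cut` parameter are hypothetical.

```python
import numpy as np


def trimmed_mean(values, cut=0.15, axis=1):
    """Mean along `axis` after dropping the lowest and highest `cut` fraction."""
    values = np.sort(np.asarray(values, dtype=float), axis=axis)
    n = values.shape[axis]
    k = int(np.floor(cut * n))  # number of entries trimmed from each end
    kept = np.take(values, np.arange(k, n - k), axis=axis)
    return kept.mean(axis=axis)


def quantile_normalize_sketch(counts, cut=0.15):
    """Simplified illustration only (assumption: rows are genes, columns are samples)."""
    counts = np.asarray(counts, dtype=float)
    n_genes, n_samples = counts.shape

    # Empirical quantiles of every sample: sort each column independently.
    sorted_counts = np.sort(counts, axis=0)

    # Reference distribution: trimmed mean across samples at each rank.
    reference = trimmed_mean(sorted_counts, cut=cut, axis=1)

    normalized = np.empty_like(counts)
    for j in range(n_samples):
        col = counts[:, j]

        # Rank of every gene within this sample (ties get consecutive ranks).
        order = np.argsort(col, kind="stable")
        ranks = np.empty(n_genes, dtype=int)
        ranks[order] = np.arange(n_genes)

        # Genes with the same read count share one value: the median of the
        # reference quantiles at the ranks they occupy.
        _, group = np.unique(col, return_inverse=True)
        for g in range(group.max() + 1):
            tied = group == g
            normalized[tied, j] = np.median(reference[ranks[tied]])
    return normalized


if __name__ == "__main__":
    toy = np.array([[0, 0, 5],
                    [0, 3, 5],
                    [2, 4, 9],
                    [10, 8, 20]])
    print(quantile_normalize_sketch(toy, cut=0.0))
```

In the toy matrix, the first two genes both have a count of 0 in the first sample, so they receive the same normalized value there instead of two different values determined by an arbitrary tie-breaking order.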