GTestimate: improving relative gene expression estimation in scRNA-seq using the Good–Turing estimator
Background: Single-cell RNA-seq suffers from unwanted technical variation between cells, caused by its complex experiments and shallow sequencing depths. Many conventional normalization methods try to remove this variation by calculating the relative gene expression per cell. However, their choice of the maximum likelihood estimator is not ideal for this application.

Results: We present GTestimate, a new normalization method based on the Good–Turing estimator, which improves upon conventional normalization methods by accounting for unobserved genes. To validate GTestimate, we developed a novel cell-targeted PCR amplification approach (cta-seq), which enables ultra-deep sequencing of single cells. Based on these data, we show that the Good–Turing estimator improves relative gene expression estimation and cell–cell distance estimation. Finally, we use GTestimate’s compatibility with Seurat workflows to explore 4 example datasets and show how it can improve downstream results.

Conclusion: By choosing a more suitable estimator for the relative gene expression per cell, we were able to improve scRNA-seq normalization, with potentially large implications for downstream results. GTestimate is available as an easy-to-use R-package and compatible with a variety of workflows, which should enable widespread adoption.
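To illustrate the contrast the abstract draws between the maximum likelihood estimator and a Good–Turing estimator, the following is a minimal sketch for a single cell's count vector. It is not GTestimate's implementation, which is not described here. The function names, the toy counts, and the fallback to the raw count when no gene has count r + 1 are all assumptions made for illustration. The sketch uses the basic Good–Turing adjusted count r* = (r + 1) N_{r+1} / N_r, where N_r is the number of genes observed exactly r times, and reserves the probability mass N_1 / N for unobserved genes.

```python
from collections import Counter


def mle_relative_expression(counts):
    """Maximum likelihood estimate: each gene's count divided by the cell total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cell has no counts")
    return {gene: c / total for gene, c in counts.items() if c > 0}


def good_turing_relative_expression(counts):
    """Basic Good-Turing estimate (illustrative sketch, not GTestimate itself).

    Returns (estimates for observed genes, probability mass reserved
    for unobserved genes).
    """
    observed = {gene: c for gene, c in counts.items() if c > 0}
    total = sum(observed.values())
    if total == 0:
        raise ValueError("cell has no counts")

    # N_r: number of genes observed exactly r times.
    freq_of_freq = Counter(observed.values())

    # Mass reserved for genes that were not observed at all: N_1 / N.
    p_unseen = freq_of_freq.get(1, 0) / total

    # Adjusted counts r* = (r + 1) * N_{r+1} / N_r.
    # Assumption of this sketch: fall back to r when N_{r+1} is zero.
    adjusted = {}
    for gene, r in observed.items():
        n_next = freq_of_freq.get(r + 1, 0)
        adjusted[gene] = (r + 1) * n_next / freq_of_freq[r] if n_next > 0 else float(r)

    # Rescale so the observed genes share the remaining mass 1 - p_unseen.
    scale = (1.0 - p_unseen) / sum(adjusted.values())
    return {gene: a * scale for gene, a in adjusted.items()}, p_unseen


# Hypothetical toy cell with four observed genes.
toy_counts = {"geneA": 1, "geneB": 1, "geneC": 2, "geneD": 6}
mle = mle_relative_expression(toy_counts)
gt, p0 = good_turing_relative_expression(toy_counts)
```

In this toy cell the maximum likelihood estimates sum to 1 over the observed genes, whereas the Good–Turing estimates sum to 1 minus the mass reserved for unobserved genes, so every observed gene's relative expression is shrunk accordingly.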
- Fahrenberger, Martin
- von Haeseler, Arndt
- Esk, Christopher
- Knoblich, Jürgen A.
Category | Journal Paper
Divisions | Bioinformatics and Computational Biology
Journal or Publication Title | GigaScience
ISSN | 2047-217X
Publisher | Oxford University Press
Place of Publication | GigaScience Press, BGI Hong Kong Tech Co Ltd.
Volume | 14
Date | 8 October 2025
