Lizhong Chen
I am a research officer in Prof. Gordon K. Smyth's lab in the Bioinformatics & Computational Biology Division at WEHI. My main work is developing new statistical and computational methods for the analysis of sequencing data. I completed my PhD in Statistics at the University of Melbourne in Jul. 2022 under the supervision of A/Prof. Guoqi Qian, Prof. Yuriy Kuleshov and Dr. Tingjin Chu, focusing on feature (variable) selection and model averaging for generalized linear models. I received my M.S. and B.S. in Mathematics from Peking University in Jun. 2016 and Jul. 2013, respectively, where my main interest was the homotopy theory of spheres and Lie groups.
Ph.D. in Statistics, School of Mathematics and Statistics, The University of Melbourne, Victoria, Australia, Feb. 2017 - Feb. 2022
M.S. in Mathematics, School of Mathematical Sciences, Peking University, Beijing, China, Sep. 2013 - Jun. 2016
B.S. in Mathematics, Yuanpei College, Peking University, Beijing, China, Sep. 2009 - Jul. 2013
Generalized linear models (GLM), quasi-likelihood and deviance statistics
Feature (variable) selection methods and model averaging
Empirical Bayes, prior information estimation and multiple hypothesis testing
Differential analysis of gene expression and applications
Spatial transcriptomics and image analysis
Journal articles
Wang J, Chen L, Brown DV, Chiu C, Speed TP. CMDdemux: an efficient single cell demultiplexing method. Nucleic Acids Research, 54(8), 2026. [DOI]
Baldoni PL#, Chen L#, Li M, Chen Y, Smyth GK. Dividing out quantification uncertainty enables assessment of differential transcript usage with limma and edgeR. Nucleic Acids Research, 53(22), 2025. [DOI]
Chen Y, Chen L, Lun AT, Baldoni PL, Smyth GK. edgeR v4: powerful differential analysis of sequencing data with expanded functionality and improved support for small counts and larger datasets. Nucleic Acids Research, 53(2), 2025. [DOI]
Baldoni PL, Chen L, Smyth GK. Faster and more accurate assessment of differential transcript expression with Gibbs sampling and edgeR v4. NAR Genomics and Bioinformatics, 6(4), 2024. [DOI]
Qian G#, Chen L#, Kuleshov Y. Improving Methodology for Tropical Cyclone Seasonal Forecasting in the Australian and the South Pacific Ocean Regions by Selecting and Averaging Models via Metropolis–Gibbs Sampling. Remote Sensing, 14(22), 2022. [DOI]
Preprints
Wang J, Chen L, Brown DV, Chiu C, Speed TP. CMDdemux: an efficient single cell demultiplexing method. bioRxiv, 2025. [DOI]
Baldoni PL#, Chen L#, Li M, Chen Y, Smyth GK. Dividing out quantification uncertainty enables assessment of differential transcript usage with diffSplice. bioRxiv, 2025. [DOI]
Baldoni PL, Chen L, Smyth GK. Faster and more accurate assessment of differential transcript expression with Gibbs sampling and edgeR 4.0. bioRxiv, 2024. [DOI]
Wang J, Chen L, Thijssen R, Phipson B, Speed TP. GLMsim: a GLM-based single cell RNA-seq simulator incorporating batch and biological effects. bioRxiv, 2024. [DOI]
Chen Y, Chen L, Lun AT, Baldoni PL, Smyth GK. edgeR 4.0: powerful differential analysis of sequencing data with expanded functionality and improved support for small counts and larger datasets. bioRxiv, 2024. [DOI]
Theses
(Ph.D.) Model selection and averaging by Gibbs sampler with a tropical cyclone seasonal forecasting application, Feb. 2022. [Minerva Access]
(M.S.) A Report on Computations of Cohomology Rings and Homotopy Groups of Lie Groups (in Chinese), Jun. 2016. [PDF]
# These authors contributed equally
edgeR quasi-likelihood: correcting deviance for bias, 2022, WEHI Bioinformatics Division seminar
Extending edgeR for small counts and large samples, 2023, WEHI Bioinformatics Division seminar
Applying edgeR for alternative splicing analysis, 2024, WEHI Bioinformatics Division seminar [Slides]
edgeRv4 with expanded functionality and improved support for small counts and larger datasets, 2025, EuroBioC2025 [Slides]
Analyzing paired count data using edgeR, 2025, WEHI Bioinformatics Division seminar [Slides]
R package: edgeR (Empirical Analysis of Digital Gene Expression Data in R)
edgeR is a Bioconductor R package for differential expression analysis of sequencing data such as RNA-seq, ChIP-seq and ATAC-seq.
In edgeR v4, we developed a new quasi-likelihood method based on a quadratic mean-variance relationship, which allows biological and technical variation to be separated.
We also improved quasi-dispersion estimation by using adjusted deviance statistics, which gives better support for low-count data.
It is available on R-Bioconductor.
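As a rough illustration of the quadratic mean-variance relationship mentioned above, the sketch below uses the standard negative binomial form var(y) = mu + phi * mu^2 and splits the squared coefficient of variation into a technical part (1/mu) and a biological part (phi). The function names and the dispersion value are assumptions chosen for illustration. This is not edgeR code, and it leaves out the quasi-dispersion and empirical Bayes steps.

```python
import math

def nb_variance(mu, phi):
    """Negative binomial variance: Poisson part mu plus quadratic part phi * mu^2."""
    return mu + phi * mu ** 2

def squared_cv_parts(mu, phi):
    """Split the squared coefficient of variation into technical and biological parts."""
    technical = 1.0 / mu   # Poisson (sequencing) noise, shrinks as counts grow
    biological = phi       # does not shrink with sequencing depth
    return technical, biological

phi = 0.04                 # assumed dispersion, i.e. a biological CV of 0.2
for mu in (5.0, 50.0, 5000.0):
    tech, bio = squared_cv_parts(mu, phi)
    total_cv = math.sqrt(nb_variance(mu, phi)) / mu
    print(f"mu={mu:>7.1f}  technical CV^2={tech:.4f}  "
          f"biological CV^2={bio:.4f}  total CV={total_cv:.4f}")
```

For large counts the technical part becomes negligible and the total coefficient of variation approaches the square root of the dispersion.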
R package: limma (Linear Models for Microarray and Omics Data)
limma is a Bioconductor R package for differential expression analysis of microarray, sequencing and other omics data.
We updated the estimation of the prior distribution in the empirical Bayes approach with a two-step method that combines local moment estimates and profile likelihood estimates.
This improves performance when the degrees of freedom from the fitted models are unequal and relatively small.
It is available on R-Bioconductor.
R package: statmod (Statistical Modeling)
statmod is a CRAN R package that provides a collection of statistical modelling tools.
We introduced a new function, expectedDeviance, which approximates the expected unit deviance and degrees of freedom for the Poisson, binomial and negative binomial distributions, correcting the bias of the chi-square approximation on 1 degree of freedom.
It is available on R-CRAN.
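To show why such a correction matters, the sketch below computes the expected unit deviance of a Poisson variable by brute-force summation over its probability mass function. It is an illustrative calculation only, not the algorithm used by expectedDeviance, and the function names are assumptions. If the unit deviance really followed a chi-square distribution on 1 degree of freedom, the expected value would be 1 for every mean.

```python
import math

def poisson_unit_deviance(y, mu):
    """Unit deviance d(y, mu) = 2 * [y * log(y / mu) - (y - mu)], with 0 * log(0) = 0."""
    term = y * math.log(y / mu) if y > 0 else 0.0
    return 2.0 * (term - (y - mu))

def expected_poisson_deviance(mu):
    """E[d(Y, mu)] for Y ~ Poisson(mu), mu > 0, by summing over the mass function."""
    upper = int(mu + 20.0 * math.sqrt(mu) + 50)   # truncation point far in the right tail
    total = 0.0
    for y in range(upper + 1):
        # Work on the log scale so large y does not overflow the factorial.
        log_pmf = -mu + y * math.log(mu) - math.lgamma(y + 1)
        total += math.exp(log_pmf) * poisson_unit_deviance(y, mu)
    return total

for mu in (0.1, 1.0, 10.0, 100.0):
    print(f"mu={mu:>6.1f}  expected unit deviance={expected_poisson_deviance(mu):.4f}")
```

The printed values move away from 1 as the mean gets small, which is the bias that an adjusted deviance has to account for with low counts.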
R package: IBGS (Iterated Blockwise Gibbs Sampler)
IBGS performs variable selection for generalized linear models and the Cox proportional-hazards model when the number of predictors is very large.
The sampler is implemented in C with parallel block screening through OpenMP, and returns a small set of high-scoring models together with marginal inclusion probabilities and model-averaged predictions.
It is available on GitHub.
R package: fastLISA (Fast Local Indicators of Spatial Association)
fastLISA computes seven families of LISA statistics with a plain-C backend, optional OpenMP multi-threading, and a modern xoshiro256 random number generator for permutation-based inference.
It accepts any spdep listw spatial weights object, including custom and non-contiguity (e.g. distance-decay) weights, and returns compact, inspectable, spdep-compatible matrices.
It is available on GitHub.
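For readers unfamiliar with LISA statistics, here is a minimal sketch of one of the best-known examples, local Moran's I, with conditional permutation pseudo p-values. It assumes row-standardised binary neighbour weights and uses Python's built-in random number generator rather than xoshiro256. The function names and the toy data are hypothetical, and nothing here reflects the fastLISA interface or its C implementation.

```python
import random

def local_moran(x, neighbours):
    """Local Moran's I with row-standardised binary weights.

    neighbours[i] lists the indices of the sites adjacent to site i;
    every site is assumed to have at least one neighbour.
    """
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]
    m2 = sum(v * v for v in z) / n
    stats = []
    for i in range(n):
        lag = sum(z[j] for j in neighbours[i]) / len(neighbours[i])
        stats.append(z[i] * lag / m2)
    return stats

def conditional_permutation_pvalues(x, neighbours, n_perm=999, seed=1):
    """Two-sided pseudo p-values: keep x[i] fixed and reshuffle the other values."""
    rng = random.Random(seed)
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]
    m2 = sum(v * v for v in z) / n
    observed = local_moran(x, neighbours)
    pvalues = []
    for i in range(n):
        others = z[:i] + z[i + 1:]
        k = len(neighbours[i])
        extreme = 0
        for _ in range(n_perm):
            # Drawing k values without replacement fills the neighbour slots at random.
            lag = sum(rng.sample(others, k)) / k
            if abs(z[i] * lag / m2) >= abs(observed[i]):
                extreme += 1
        pvalues.append((extreme + 1) / (n_perm + 1))
    return observed, pvalues

# Toy example: four sites on a line, each adjacent to its immediate neighbours.
values = [1.0, 2.0, 3.0, 4.0]
chain = [[1], [0, 2], [1, 3], [2]]
stats, pvals = conditional_permutation_pvalues(values, chain, n_perm=199)
print(stats)
print(pvals)
```

The permutation loop is the expensive part of this kind of analysis, which is why a compiled and multi-threaded backend is attractive for large spatial datasets.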
R package: DGQ (Dynamic Geometric Quantiles for Multivariate Time Series)
DGQ computes a quantile-like trajectory for a collection of multivariate time series.
The main function, DGQ(), returns both a constrained empirical DGQ that is an observed series and an unconstrained time-wise geometric quantile.
It is available on GitHub.
Hasini Herath, PhD (Medical Biology), University of Melbourne, co-supervised with Pedro Baldoni and Gordon Smyth, 2026-
Lei Qin, PhD (Medical Biology), University of Melbourne, co-supervised with Yunshun Chen and Gordon Smyth, 2024-
Xueming Li, PhD (Mathematics & Statistics), University of Melbourne, co-supervised with Guoqi Qian, 2025-
Shama Deb, Master (Mathematics & Statistics), University of Melbourne, co-supervised with Gordon Smyth, 2025-2026
Yuchen Liu, Master (Mathematics & Statistics), University of Melbourne, co-supervised with Tingjin Chu, 2025-2026