
Lizhong Chen
I am a research officer in Prof. Gordon K. Smyth's lab in the Bioinformatics & Computational Biology Division, WEHI. My main work is developing new statistical and computational methods for the analysis of sequencing data. Previously, I completed my PhD in Statistics at the University of Melbourne under the supervision of A/Prof. Guoqi Qian, Prof. Yuriy Kuleshov and Dr. Tingjin Chu in Jul. 2022, where I focused on feature (variable) selection and model averaging for generalized linear models. I received my M.S. and B.S. in Mathematics from Peking University in Jun. 2016 and Jul. 2013, respectively, where my main interest was the homotopy theory of spheres and Lie groups.
Ph.D. in Statistics, School of Mathematics and Statistics, The University of Melbourne, Victoria, Australia, Feb. 2017 - Feb. 2022
M.S. in Mathematics, School of Mathematical Sciences, Peking University, Beijing, China, Sep. 2013 - Jun. 2016
B.S. in Mathematics, Yuanpei College, Peking University, Beijing, China, Sep. 2009 - Jul. 2013
Generalized linear models (GLM), quasi-likelihood and deviance statistics
Feature selection (variable selection) methods and model averaging
Empirical Bayes, prior information estimation and multiple hypothesis testing
Differential analysis of gene expression and applications
Spatial transcriptomics and image analysis
Journal articles
Wang J, Chen L, Brown DV, Chiu C, Speed TP. CMDdemux: an efficient single cell demultiplexing method. Nucleic Acids Research, 54(8), 2026. doi
Baldoni PL#, Chen L#, Li M, Chen Y, Smyth GK. Dividing out quantification uncertainty enables assessment of differential transcript usage with limma and edgeR. Nucleic Acids Research, 53(22), 2025. doi
Chen Y, Chen L, Lun AT, Baldoni PL, Smyth GK. edgeR v4: powerful differential analysis of sequencing data with expanded functionality and improved support for small counts and larger datasets. Nucleic Acids Research, 53(2), 2025. doi
Baldoni PL, Chen L, Smyth GK. Faster and more accurate assessment of differential transcript expression with Gibbs sampling and edgeR v4. NAR Genomics and Bioinformatics, 6(4), 2024. doi
Qian G#, Chen L#, Kuleshov Y. Improving Methodology for Tropical Cyclone Seasonal Forecasting in the Australian and the South Pacific Ocean Regions by Selecting and Averaging Models via Metropolis–Gibbs Sampling. Remote Sensing, 14(22), 2022. doi
Preprint
Wang J, Chen L, Brown DV, Chiu C, Speed TP. CMDdemux: an efficient single cell demultiplexing method. bioRxiv, 2025. doi
Baldoni PL#, Chen L#, Li M, Chen Y, Smyth GK. Dividing out quantification uncertainty enables assessment of differential transcript usage with diffSplice. bioRxiv, 2025. doi
Baldoni PL, Chen L, Smyth GK. Faster and more accurate assessment of differential transcript expression with Gibbs sampling and edgeR 4.0. bioRxiv, 2024. doi
Wang J, Chen L, Thijssen R, Phipson B, Speed TP. GLMsim: a GLM-based single cell RNA-seq simulator incorporating batch and biological effects. bioRxiv, 2024. doi
Chen Y, Chen L, Lun AT, Baldoni PL, Smyth GK. edgeR 4.0: powerful differential analysis of sequencing data with expanded functionality and improved support for small counts and larger datasets. bioRxiv, 2024. doi
Thesis
(Ph.D.) Model selection and averaging by Gibbs sampler with a tropical cyclone seasonal forecasting application, Feb. 2022. [Minerva Access]
(M.S.) A Report on Computations of Cohomology Rings and Homotopy Groups of Lie Groups (in Chinese), Jun. 2016. [PDF]
# These authors contributed equally.
edgeR quasi-likelihood: correcting deviance for bias, 2022, WEHI Bioinformatics Division seminar
Extending edgeR for small counts and large samples, 2023, WEHI Bioinformatics Division seminar
Applying edgeR for alternative splicing analysis, 2024, WEHI Bioinformatics Division seminar [Slides]
edgeRv4 with expanded functionality and improved support for small counts and larger datasets, 2025, EuroBioC2025 [Slides]
Analyzing paired count data using edgeR, 2025, WEHI Bioinformatics Division seminar [Slides]
R package: edgeR (Empirical Analysis of Digital Gene Expression Data in R)
edgeR is a Bioconductor R package for differential expression analysis of sequencing data such as RNA-seq, ChIP-seq and ATAC-seq.
In edgeR v4, we developed a new quasi-likelihood method based on a quadratic mean-variance relationship, which allows biological and technical variation to be separated.
We improved quasi-dispersion estimation by using adjusted deviance statistics, which also gives better support for lowly expressed (small-count) data.
It is available on R-Bioconductor.
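As a rough illustration of the quadratic mean-variance relationship mentioned above, the sketch below assumes the form described in the edgeR documentation, var(y) = sigma^2 * (mu + phi * mu^2), where phi is the negative binomial dispersion and sigma^2 is the quasi-dispersion. It is written in Python purely for illustration and is not edgeR code; the function names `ql_variance` and `variance_components` are hypothetical.

```python
def variance_components(mu, nb_dispersion):
    """Split the negative binomial variance into its two parts.

    technical:  Poisson-like sampling variance, linear in the mean.
    biological: over-dispersion between replicates, quadratic in the mean.
    """
    technical = mu
    biological = nb_dispersion * mu ** 2
    return technical, biological


def ql_variance(mu, nb_dispersion, quasi_dispersion=1.0):
    """Quasi-likelihood variance: quasi_dispersion * (mu + nb_dispersion * mu^2)."""
    technical, biological = variance_components(mu, nb_dispersion)
    return quasi_dispersion * (technical + biological)


# Illustrative values only: at a mean of 10 and dispersion 0.1 the technical
# and biological parts are about equal; at a mean of 1000 the biological
# part dominates.
for mu in (10.0, 1000.0):
    tech, bio = variance_components(mu, nb_dispersion=0.1)
    print(mu, tech, bio, ql_variance(mu, 0.1, quasi_dispersion=1.5))
```

For small counts the linear term dominates, which is one reason low-expression genes need separate care when estimating the quasi-dispersion.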
R package: limma (Linear Models for Microarray and Omics Data)
limma is a Bioconductor R package for differential expression analysis of microarray, sequencing and other omics data.
We updated the estimation of the prior distribution in the empirical Bayes approach with a two-step method that combines local moment estimates with profile likelihood estimates.
This improves performance when the models' degrees of freedom are unequal and relatively small.
It is available on R-Bioconductor.
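To show why unequal degrees of freedom matter in the empirical Bayes step, here is a minimal Python sketch of the standard moderated (posterior) variance used in limma-style analyses, a weighted average of the prior variance and the observed variance. It does not implement the two-step prior estimation described above; the prior values are made up for illustration, and the function name `moderated_variance` is hypothetical.

```python
def moderated_variance(s2, df, s2_prior, df_prior):
    """Posterior variance: (df_prior * s2_prior + df * s2) / (df_prior + df)."""
    return (df_prior * s2_prior + df * s2) / (df_prior + df)


# Two genes with the same observed variance but different residual df.
# Assumed prior: s2_prior = 1.0 with df_prior = 4 (illustrative values only).
low_df = moderated_variance(s2=4.0, df=2, s2_prior=1.0, df_prior=4)
high_df = moderated_variance(s2=4.0, df=10, s2_prior=1.0, df_prior=4)
print(low_df, high_df)
```

The gene with fewer degrees of freedom is pulled more strongly towards the prior, so errors in the estimated prior have the largest effect exactly when the degrees of freedom are small.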
R package: statmod (Statistical Modeling)
statmod is a CRAN R package that provides a collection of statistical modeling tools.
We introduced a new function, expectedDeviance, that approximates the expected unit deviance and degrees of freedom for the Poisson, binomial and negative binomial distributions,
which corrects the bias of the chi-square approximation with 1 degree of freedom.
It is available on R-CRAN.
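The bias of the chi-square approximation can be seen with a small numerical check. The Python sketch below computes the expected Poisson unit deviance by direct summation over the probability mass function; it is an illustration only and not the method used by expectedDeviance. The function names are hypothetical.

```python
import math


def poisson_unit_deviance(y, mu):
    """Poisson unit deviance d(y, mu) = 2 * (y * log(y / mu) - (y - mu))."""
    if y == 0:
        return 2.0 * mu  # limit of y * log(y / mu) as y -> 0 is 0
    return 2.0 * (y * math.log(y / mu) - (y - mu))


def expected_poisson_unit_deviance(mu):
    """E[d(Y, mu)] for Y ~ Poisson(mu), summed over a wide range of counts."""
    upper = int(mu + 12.0 * math.sqrt(mu) + 30)  # far into the upper tail
    total = 0.0
    for y in range(upper + 1):
        log_pmf = y * math.log(mu) - mu - math.lgamma(y + 1)
        total += math.exp(log_pmf) * poisson_unit_deviance(y, mu)
    return total


# A chi-square variable with 1 degree of freedom has mean 1. The expected
# unit deviance is close to 1 for large means but well below 1 for small ones.
for mu in (0.1, 1.0, 10.0, 100.0):
    print(mu, expected_poisson_unit_deviance(mu))
```

Treating every unit deviance as chi-square with 1 degree of freedom therefore overstates the expected deviance for lowly expressed genes.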
R package: IBGS (Iterated Blockwise Gibbs Sampler)
IBGS is an MCMC search algorithm that finds the best model for high-dimensional data under a given model selection criterion, such as AIC or BIC.
It can also identify the most important covariates, those with a significant influence on the response.
It is available on GitHub.
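The general idea of searching model space with a Gibbs-type sampler can be sketched in a few lines of Python. This is a plain single-site sampler over inclusion indicators with BIC as the criterion, run on simulated data. It is not the iterated blockwise scheme of IBGS, and all function names, data and settings are assumptions made for illustration.

```python
import math
import random


def rss(X, y, subset):
    """Residual sum of squares of OLS on an intercept plus columns in `subset`."""
    n = len(y)
    cols = [[1.0] * n] + [[row[j] for row in X] for j in subset]
    k = len(cols)
    # Augmented normal equations, solved by Gauss-Jordan with partial pivoting.
    A = [
        [sum(u * v for u, v in zip(cols[i], cols[j])) for j in range(k)]
        + [sum(u * v for u, v in zip(cols[i], y))]
        for i in range(k)
    ]
    for i in range(k):
        pivot = max(range(i, k), key=lambda r: abs(A[r][i]))
        A[i], A[pivot] = A[pivot], A[i]
        for r in range(k):
            if r != i:
                factor = A[r][i] / A[i][i]
                A[r] = [a - factor * b for a, b in zip(A[r], A[i])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    resid = [y[t] - sum(b * c[t] for b, c in zip(beta, cols)) for t in range(n)]
    return sum(e * e for e in resid)


def bic(X, y, subset):
    """Gaussian BIC up to a constant: n * log(RSS / n) + (#coefficients) * log(n)."""
    n = len(y)
    return n * math.log(rss(X, y, subset) / n) + (len(subset) + 1) * math.log(n)


def gibbs_search(X, y, sweeps=30, seed=1):
    """Single-site Gibbs sampler targeting weights proportional to exp(-BIC / 2)."""
    rng = random.Random(seed)
    p = len(X[0])
    gamma = [False] * p
    best = (bic(X, y, ()), ())
    for _ in range(sweeps):
        for j in range(p):
            scores = []
            for state in (False, True):
                gamma[j] = state
                subset = tuple(i for i in range(p) if gamma[i])
                scores.append((bic(X, y, subset), subset))
            # Adding a column never increases RSS, so diff <= log(n) / 2
            # and math.exp cannot overflow here.
            diff = (scores[1][0] - scores[0][0]) / 2.0
            gamma[j] = rng.random() < 1.0 / (1.0 + math.exp(diff))
            best = min(best, *scores)  # keep the lowest BIC seen so far
    return best


# Simulated data: only columns 0 and 2 affect the response.
data_rng = random.Random(0)
n, p = 80, 5
X = [[data_rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [2.0 * row[0] - 3.0 * row[2] + data_rng.gauss(0, 0.5) for row in X]
best_bic, best_subset = gibbs_search(X, y)
print(best_subset, best_bic)
```

With a signal this strong, the best model found should contain both informative columns. How often each covariate is included along the chain gives a simple measure of its importance.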
Hasini Herath, PhD (Medical Biology), University of Melbourne, co-supervised with Pedro Baldoni and Gordon Smyth
Lei Qin, PhD (Medical Biology), University of Melbourne, co-supervised with Yunshun Chen and Gordon Smyth
Xueming Li, PhD (Mathematics & Statistics), University of Melbourne, co-supervised with Guoqi Qian
Shama Deb, Master (Mathematics & Statistics), University of Melbourne, co-supervised with Gordon Smyth