Top 94 Statistics and Computing papers published in 2025

TL;DR: A novel p-value-based multiple testing approach is introduced for generalized linear models, addressing FDR control amidst dependent test statistics, with efficient algorithms and theoretical analysis affirming its performance across diverse simulation settings.

Abstract: This study introduces a novel p-value-based multiple testing approach tailored for generalized linear models. Despite the crucial role of generalized linear models in statistics, existing methodologies face obstacles arising from the heterogeneous variance of response variables and complex dependencies among estimated parameters. Our aim is to address the challenge of controlling the false discovery rate (FDR) amidst arbitrarily dependent test statistics. Through the development of efficient computational algorithms, we present a versatile statistical framework for multiple testing. The proposed framework accommodates a range of tools developed for constructing a new model matrix in regression-type analysis, including random row permutations and Model-X knockoffs. We devise efficient computing techniques to solve the encountered non-trivial quadratic matrix equations, enabling the construction of paired p-values suitable for the two-step multiple testing procedure proposed by Sarkar and Tang (Biometrika 109(4): 1149–1155, 2022). Theoretical analysis affirms the properties of our approach, demonstrating its capability to control the FDR at a given level. Empirical evaluations further substantiate its promising performance across diverse simulation settings.

Journal Article•10.1007/s11222-025-10819-z•

Removal of redundant candidate points for the exact D-optimal design problem

Radoslav Harman¹, Samuel Colón De La Rosa•Institutions (1)

Comenius University in Bratislava¹

31 Aug 2025-Statistics and Computing

TL;DR: Researchers propose a method to efficiently eliminate redundant candidate points for D-optimal exact design problems, reducing memory and runtime requirements, and enabling computation of optimal designs for large-scale problems via mixed-integer second-order cone programming.

Abstract: One of the most common problems in statistical experimentation is computing D-optimal designs for linear or locally linearized models on large finite candidate sets. While optimal approximate designs can be efficiently computed using convex methods, constructing optimal exact designs with a prespecified total number of trials is a substantially more difficult integer optimization problem. In this paper, we propose necessary conditions, based on approximate designs, that must be satisfied by any support point of a D-optimal exact design. These conditions enable rapid elimination of redundant candidate points without loss of optimality, thereby reducing memory requirements and runtime of subsequent exact-design algorithms. We also prove that, for a sufficiently large number of trials, the support of every D-optimal exact design is contained in a set that typically coincides with the support of a D-optimal approximate design. We demonstrate the approach on randomly generated benchmark models with candidate sets of up to 100 million points and on commonly used constrained mixture models with up to 1 million points. The proposed approach reduces the initial candidate sets by several orders of magnitude, thereby making it possible to compute D-optimal exact designs for these problems via mixed-integer second-order cone programming, which provides optimality guarantees.

Journal Article•10.1007/s11222-025-10790-9•

Penalized distributed lag non-linear models for small area data using Laplacian-P-splines

S. Rutten, Bryan P. Sumalinab, Oswaldo Gressani¹, Thomas Neyens², Elisa Duarte, Niel Hens, C. Faes•Institutions (2)

Université catholique de Louvain¹, University of Hasselt²

05 Dec 2025-Statistics and Computing

TL;DR: This study proposes a novel Bayesian DLNM-Laplacian-P-splines approach to model nonlinear lagged relationships in spatially referenced data, incorporating spatial dependence using CAR priors and Laplace approximation for improved computational efficiency.

Abstract: Distributed lag non-linear models (DLNMs) have gained popularity for modeling nonlinear lagged relationships between exposures and outcomes. When applied to spatially referenced data, these models must account for spatial dependence, a challenge that has yet to be thoroughly explored within the penalized DLNM framework. This gap is mainly due to the complex model structure and high computational demands, particularly when dealing with large spatio-temporal datasets. To address this, we propose a novel Bayesian DLNM-Laplacian-P-splines (DLNM-LPS) approach that incorporates spatial dependence using conditional autoregressive (CAR) priors, a method commonly applied in disease mapping. Our approach offers a flexible framework for capturing nonlinear associations while accounting for spatial dependence. It uses the Laplace approximation to approximate the conditional posterior distribution of the regression parameters, eliminating the need for Markov chain Monte Carlo (MCMC) sampling, often used in Bayesian inference, thus improving computational efficiency. The methodology is evaluated through simulation studies and applied to analyze the relationship between temperature and mortality in London.

Journal Article•10.1007/s11222-025-10794-5•

Properties and Estimation of a Novel Skewed Generalized Normal Distribution

Xue Wang, Weihu Cheng, Minghan Wang, Xusheng Zhao¹•Institutions (1)

Huazhong University of Science and Technology¹

16 Dec 2025-Statistics and Computing

Journal Article•10.1007/s11222-025-10772-x•

Sparse Bayesian learning for label efficiency in cardiac real-time MRI

Anja Bach, Achim Basermann, Darius A. Gerlach, Philipp Knechtges, Jens Tank, Raúl Tempone, Felix Terhag

27 Mar 2025-Statistics and Computing

Journal Article•10.1007/s11222-025-10768-7•

Unbiased parameter estimation for Bayesian inverse problems

Neil K. Chada¹, Ajay Jasra, Mohamed Maama, Raúl Tempone•Institutions (1)

King Abdullah University of Science and Technology¹

06 Feb 2025-Statistics and Computing

Journal Article•10.1007/s11222-025-10602-0•

Sparse and debiased Lasso estimation and statistical inference for long time series via divide-and-conquer

Heng Lian

21 Mar 2025-Statistics and Computing

Journal Article•10.1007/s11222-025-10607-9•

Debiased transfer learning estimation and inference for multinomial regression

Jichen Yang, Lei Wang, Heng Lian

26 Mar 2025-Statistics and Computing

Journal Article•10.1007/s11222-025-10779-4•

Optimal sparse phase retrieval via a quasi-Bayesian approach

T. Tien Mai¹•Institutions (1)

University of Oslo¹

13 Apr 2025-Statistics and Computing

Journal Article•10.1007/s11222-025-10773-w•

Relative error model average for multiplicative models

Xiaochao Xia, Hao Ming, Jialiang Li

12 Nov 2025-Statistics and Computing

Journal Article•10.1007/s11222-025-10778-5•

Valid asymptotic inference after sufficient dimension reduction in a single-index framework

Kyongwon Kim, Jun Song, Jae Keun Yoo

20 Nov 2025-Statistics and Computing

Journal Article•10.1007/s11222-025-10596-9•

Estimation and model selection for finite mixtures of Tukey’s g- &-h distributions

Tingting Zhan¹, Amy R. Peck², Hallgeir Rui, Inna Chervoneva•Institutions (2)

Thomas Jefferson University¹, Medical College of Wisconsin²

15 Mar 2025-Statistics and Computing

Journal Article•10.1007/s11222-025-10777-6•

Novel Bayesian algorithms for ARFIMA long-memory processes: a comparison between MCMC and ABC approaches

James Cohen Gabor, Clara Grazian

20 Nov 2025-Statistics and Computing

Journal Article•10.1007/s11222-025-10789-2•

Scalable variational inference for multinomial probit models under large choice sets and sample sizes

Gyeongjun Kim, Yeseul Kang, Lucas Kock, P. Bansal, Keemin Sohn¹•Institutions (1)

Chung-Ang University¹

15 Jul 2025-Statistics and Computing

TL;DR: This study introduces a scalable conditional variational inference approach for multinomial probit models, using neural embeddings and a reparameterization trick to efficiently estimate model parameters in high-dimensional choice settings with large samples and choice sets.

Abstract: The multinomial probit (MNP) model is widely used to analyze categorical outcomes due to its ability to capture flexible substitution patterns among alternatives. Conventional likelihood-based and Markov chain Monte Carlo (MCMC) estimators become computationally prohibitive in high-dimensional choice settings. This study introduces a fast and accurate conditional variational inference (CVI) approach to calibrate MNP model parameters, which is scalable to large samples and large choice sets. A flexible variational distribution on correlated latent utilities is defined using neural embeddings, and a reparameterization trick is used to ensure the positive definiteness of the resulting covariance matrix. The resulting CVI estimator is similar to a variational autoencoder, with the variational model being the encoder and the MNP’s data generating process being the decoder. Straight-through-estimation and Gumbel-SoftMax approximation are adopted for the ‘argmax’ operation to select an alternative with the highest latent utility. This eliminates the need to sample from high-dimensional truncated Gaussian distributions, significantly reducing computational costs as the number of alternatives grows. The point estimates from the proposed method align closely with the posterior mean estimates of MCMC. It can calibrate MNP parameters with 20 alternatives and one million observations in approximately 28 minutes - roughly 36 times faster while recovering point estimates with accuracy comparable to the existing benchmarks. Although the proposed approach is primarily designed for efficient point estimation, our experimental results confirm that valid statistical inference can be derived through bootstrapping.
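
The Gumbel-SoftMax relaxation of the 'argmax' step mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is a generic forward pass only and is not taken from the paper: the function name `gumbel_softmax`, the `temperature` parameter, and the example utilities are assumptions made for illustration, and the gradient handling that a straight-through estimator needs would require an autodiff framework.

```python
import numpy as np

def gumbel_softmax(utilities, temperature, rng):
    """Return (soft, hard): a relaxed choice vector and its one-hot argmax."""
    # Perturb the utilities with standard Gumbel noise, then apply a
    # temperature-scaled softmax as a smooth surrogate for argmax.
    gumbel = rng.gumbel(loc=0.0, scale=1.0, size=utilities.shape)
    logits = (utilities + gumbel) / temperature
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    soft = np.exp(logits)
    soft = soft / soft.sum(axis=-1, keepdims=True)
    # Discrete choice used in the forward pass of a straight-through estimator.
    hard = np.zeros_like(soft)
    chosen = np.expand_dims(soft.argmax(axis=-1), axis=-1)
    np.put_along_axis(hard, chosen, 1.0, axis=-1)
    return soft, hard

# Hypothetical latent utilities: 2 observations, 3 alternatives.
rng = np.random.default_rng(0)
utilities = np.array([[0.2, 1.5, -0.3], [2.0, 0.1, 0.4]])
soft, hard = gumbel_softmax(utilities, temperature=0.5, rng=rng)
```

Lower temperatures push `soft` closer to the one-hot `hard` vector, at the cost of noisier gradients when the relaxation is used for training.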

Journal Article•10.1007/s11222-025-10802-8•

Bayesian inference of longitudinal count data with informative dropouts using a zero-inflated negative binomial mixed model

Miaojie Xia, Li Guan, Jiang Du

21 Dec 2025-Statistics and Computing

Journal Article•10.1007/s11222-025-10686-8•

Nonparametric empirical Bayes prediction in mixed models

Trambak Banerjee¹, Padma Sharma•Institutions (1)

University of Southern California¹

07 Jul 2025-Statistics and Computing

Journal Article•10.1007/s11222-025-10586-x•

On approximations of subordinators in $L^p$ and the simulation of tempered stable distributions

Michael Grabchak, Sina Saba

21 Feb 2025-Statistics and Computing

TL;DR: This paper approximates subordinators in $L^p$ using scaled Poisson mixtures, deriving a rate of convergence and developing an approach for simulating tempered stable distributions through randomly stopped Lévy processes.

Abstract: Subordinators are infinitely divisible distributions on the positive half-line. They are often used as mixing distributions in Poisson mixtures. We show that appropriately scaled Poisson mixtures can approximate the mixing subordinator and we derive a rate of convergence in $L^p$ for each $p\in [1,\infty]$. This includes the Kolmogorov and Wasserstein metrics as special cases. As an application, we develop an approach for approximate simulation of the underlying subordinator. In the interest of generality, we present our results in the context of more general mixtures, specifically those that can be represented as differences of randomly stopped Lévy processes. Particular focus is given to the case where the subordinator belongs to the class of tempered stable distributions.
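
The idea that a scaled Poisson mixture approximates its mixing distribution can be checked numerically with a small sketch. The assumptions here are not from the abstract: a gamma law stands in for the subordinator, the scaling is taken to be N/a with N | X ~ Poisson(aX), and the function name `scaled_poisson_mixture` is hypothetical. The printed quantity is the mean absolute gap under this particular coupling, which upper-bounds the Wasserstein-1 distance; it is not the paper's convergence rate.

```python
import numpy as np

def scaled_poisson_mixture(mixing_draws, scale, rng):
    """Draw N | X ~ Poisson(scale * X) and return N / scale."""
    counts = rng.poisson(lam=scale * mixing_draws)
    return counts / scale

rng = np.random.default_rng(1)
# Assumed stand-in for the mixing law: Gamma(shape=2, scale=1.5), mean 3.
x = rng.gamma(shape=2.0, scale=1.5, size=50_000)

gaps = {}
for a in (1.0, 10.0, 100.0):
    approx = scaled_poisson_mixture(x, a, rng)
    # Mean absolute difference between each draw and its approximation.
    gaps[a] = float(np.mean(np.abs(approx - x)))
    print(f"a = {a:6.1f}  mean |N/a - X| = {gaps[a]:.4f}")
```

The gap shrinks as the scale `a` grows, consistent with the conditional standard deviation of N/a given X being sqrt(X/a).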

Journal Article•10.1007/s11222-025-10677-9•

Robust feature screening via Grothendieck’s correlation with FDR control

Yunlu Jiang, Xiaowen Huang

03 Jul 2025-Statistics and Computing

Journal Article•10.1007/s11222-025-10608-8•

Fast Bayesian variable screening using correlation thresholds

Roberta Paroli¹, Dimitris Fouskakis², Ioannis Ntzoufras•Institutions (2)

Catholic University of the Sacred Heart¹, National Technical University of Athens²

01 Apr 2025-Statistics and Computing

Journal Article•10.1007/s11222-025-10651-5•

Annealing Strategies for Variance Reduction in Balance Heuristic Estimators

F. Medina-Aguayo, Richard G. Everitt¹•Institutions (1)

University of Warwick¹

09 Jun 2025-Statistics and Computing

Journal Article•10.1007/s11222-025-10623-9•

High order expansion method for Kuiper’s $V_n$ statistic in goodness-of-fit test

Hong-Yan Zhang¹, Zhi-Qiang Feng, Yu Zhou•Institutions (1)

Hainan Normal University¹

09 May 2025-Statistics and Computing

Abstract: Kuiper’s $V_n$ statistic, a measure for comparing the difference of ideal distribution and empirical distribution, is of great significance in the goodness-of-fit test. However, Kuiper’s formulae for computing the cumulative distribution function, false positive probability, and the upper tail quantile of $V_n$ cannot be applied to the case of small sample capacity n since the approximation error is $\mathcal{O}(n^{-1})$. In this work, our contributions lie in three perspectives: firstly the approximation error is reduced to $\mathcal{O}(n^{-(k+1)/2})$ where k is the expansion order with the high order expansion for the exponent of the differential operator; secondly, a novel high order formula with approximation error $\mathcal{O}(n^{-3})$ is obtained by massive calculations; thirdly, the fixed-point algorithms are designed for solving the Kuiper pair of critical values and upper tail quantiles based on the novel formula. The high order expansion method for Kuiper’s $V_n$ statistic is applicable for various applications where there are more than five samples of data. The principles, algorithms, and code for the high order expansion method are attractive for the goodness-of-fit test.
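
For orientation, the statistic itself (not the paper's high order expansion of its distribution) can be computed in a few lines. The sketch below uses the standard definition $V_n = D^+ + D^-$ from the empirical and reference CDFs; the function name `kuiper_statistic` and the toy sample are illustrative assumptions.

```python
import numpy as np

def kuiper_statistic(sample, cdf):
    """Kuiper's V_n = D+ + D- for a 1-D sample against a reference CDF."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    f = cdf(x)                          # reference CDF at the order statistics
    i = np.arange(1, n + 1)
    d_plus = np.max(i / n - f)          # largest excess of the empirical CDF
    d_minus = np.max(f - (i - 1) / n)   # largest excess of the reference CDF
    return d_plus + d_minus

# Toy sample tested against the Uniform(0, 1) CDF.
uniform_cdf = lambda t: np.clip(t, 0.0, 1.0)
v_n = kuiper_statistic([0.7, 0.1, 0.4], uniform_cdf)
print(v_n)  # D+ = 0.3 and D- = 0.1, so about 0.4
```

Turning `v_n` into a p-value or critical value is where the approximation formulae discussed in the abstract come in; that step is not attempted here.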
