Top 47 Statistical Modelling papers published in 2021

Journal Article•10.1177/1471082X19890935•

Multiple scaled contaminated normal distribution and its application in clustering

Antonio Punzo¹, Cristina Tortora²•Institutions (2)

University of Catania¹, San Jose State University²

01 Aug 2021-Statistical Modelling

TL;DR: The multivariate contaminated normal (MCN) distribution represents a simple heavy-tailed generalization of the multivariate normal (MN) distribution to model elliptical contoured scatters.

Abstract: The multivariate contaminated normal (MCN) distribution represents a simple heavy-tailed generalization of the multivariate normal (MN) distribution to model elliptical contoured scatters i...

27 citations

Journal Article•10.1177/1471082X20929881•

Predicting match outcomes in association football using team ratings and player ratings

Halvard Arntzen¹, Lars Magnus Hvattum¹•Institutions (1)

Molde University College¹

01 Oct 2021-Statistical Modelling

TL;DR: This article compares the performance of team ratings and individual player ratings when forecasting match outcomes in association football, with the well-known Elo rating among the methods considered.

Abstract: The main goal of this article is to compare the performance of team ratings and individual player ratings when trying to forecast match outcomes in association football. The well-known Elo rating s...

21 citations

Journal Article•10.1177/1471082X20916088•

Sequential Monte Carlo methods in Bayesian joint models for longitudinal and time-to-event data

Danilo Alvares¹, Carmen Armero², Anabel Forte², Nicolas Chopin³•Institutions (3)

Pontifical Catholic University of Chile¹, University of Valencia², ENSAE ParisTech³

01 Feb 2021-Statistical Modelling

TL;DR: This article proposes the use of sequential Monte Carlo methods for static parameter joint models in order to update the posterior distribution of the parameters, hyperparameters, and random effects, with the intention of reducing computation time in each update of the inferential process.

Abstract: The statistical analysis of the information generated by medical follow-up is a very important challenge in the field of personalized medicine. As the evolutionary course of a patient's disease pro...

14 citations

Journal Article•10.1177/1471082X20944620•

Joint modelling of longitudinal and survival data in the presence of competing risks with applications to prostate cancer data.

Tuhin Sheikh¹, Joseph G. Ibrahim², Jonathan Gelfond³, Wei Sun⁴, Ming-Hui Chen¹•Institutions (4)

University of Connecticut¹, University of North Carolina at Chapel Hill², University of Texas at Austin³, Fred Hutchinson Cancer Research Center⁴

01 Feb 2021-Statistical Modelling

TL;DR: A joint model for simultaneously analysing longitudinal and time-to-event data in the presence of multiple causes of failure is proposed within the Bayesian framework, with an efficient Markov chain Monte Carlo sampling algorithm that samples from the posterior distribution via a novel reparameterization technique.

Abstract: This research is motivated from the data from a large Selenium and Vitamin E Cancer Prevention Trial (SELECT). The prostate specific antigens (PSAs) were collected longitudinally, and the survival endpoint was the time to low-grade cancer or the time to high-grade cancer (competing risks). In this article, the goal is to model the longitudinal PSA data and the time-to-prostate cancer (PC) due to low- or high-grade. We consider the low-grade and high-grade as two competing causes of developing PC. A joint model for simultaneously analysing longitudinal and time-to-event data in the presence of multiple causes of failure (or competing risk) is proposed within the Bayesian framework. The proposed model allows for handling the missing causes of failure in the SELECT data and implementing an efficient Markov chain Monte Carlo sampling algorithm to sample from the posterior distribution via a novel reparameterization technique. Bayesian criteria, ΔDICSurv, and ΔWAICSurv, are introduced to quantify the gain in fit in the survival sub-model due to the inclusion of longitudinal data. A simulation study is conducted to examine the empirical performance of the posterior estimates as well as ΔDICSurv and ΔWAICSurv and a detailed analysis of the SELECT data is also carried out to further demonstrate the proposed methodology.

12 citations

Journal Article•10.1177/1471082X20917586•

Boosting functional response models for location, scale and shape with an application to bacterial competition

Almond Stöcker¹, Sarah Brockhaus², Sophia Anna Schaffer², Benedikt von Bronk², Madeleine Opitz², Sonja Greven¹•Institutions (2)

Humboldt University of Berlin¹, Ludwig Maximilian University of Munich²

01 Oct 2021-Statistical Modelling

TL;DR: The model is fitted via gradient boosting, which offers inherent model selection and is shown to be suitable for both complex model structures and highly auto-correlated response curves. It enables the analysis of bacterial growth in Escherichia coli in a complex interaction scenario, fruitfully extending usual growth models.

Abstract: We extend generalized additive models for location, scale and shape (GAMLSS) to regression with functional response. This allows us to simultaneously model point-wise mean curves, variances...

11 citations

Journal Article•10.1177/1471082X211007308•

Interactively visualizing distributional regression models with distreg.vis

Stanislaus Stadlmann¹, Thomas Kneib¹•Institutions (1)

University of Göttingen¹

27 May 2021-Statistical Modelling

TL;DR: A newly emerging field in statistics is distributional regression, where not only the mean but each parameter of a parametric response distribution can be modelled using a set of predictors.

Abstract: A newly emerging field in statistics is distributional regression, where not only the mean but each parameter of a parametric response distribution can be modelled using a set of predictors. As an ...

10 citations

Journal Article•10.1177/1471082X211008014•

A regularized hidden Markov model for analyzing the ‘hot shoe’ in football

Marius Ötting¹, Andreas Groll²•Institutions (2)

Bielefeld University¹, Technical University of Dortmund²

19 May 2021-Statistical Modelling

TL;DR: This work investigates the potential occurrence of a 'hot shoe' effect for the performance of penalty takers in football based on data from the German Bundesliga, and considers hidden Markov models (HMMs) to model the (latent) forms of players.

Abstract: We propose a penalized likelihood approach in hidden Markov models (HMMs) to perform automated variable selection. To account for a potential large number of covariates, which also may be substanti...

9 citations

Journal Article•10.1177/1471082X211041033•

Canonical correlation analysis in high dimensions with structured regularization

Elena Tuzhilina, Leonardo Tozzi, Trevor Hastie

03 Oct 2021-Statistical Modelling

TL;DR: Canonical correlation analysis (CCA) is a technique for measuring the association between two multivariate data matrices; this article concerns a regularized modification of canonical correlation analysis (RCCA).

Abstract: Canonical correlation analysis (CCA) is a technique for measuring the association between two multivariate data matrices. A regularized modification of canonical correlation analysis (RCCA) which i...

8 citations

Journal Article•10.1177/1471082X20940156•

Quantile foliation for modelling performance across body mass and age in Olympic weightlifting

Aris Perperoglou¹, Marianne Huebner²•Institutions (2)

Newcastle University¹, Michigan State University²

01 Dec 2021-Statistical Modelling

TL;DR: In this paper, the authors developed a quantile foliation model to predict outcomes for one explanatory variable based on two covariates and varying quantiles, which is an extension of quantile sheets.

Abstract: In this work, we develop ‘quantile foliation’ to predict outcomes for one explanatory variable based on two covariates and varying quantiles. This is an extension of quantile sheets. Data from World...

8 citations

Journal Article•10.1177/1471082X20930894•

Streamlined Variational Inference for Higher Level Group-Specific Curve Models

Marianne Menictas¹, T. H. Nolan¹,², Douglas G. Simpson³, Matt P. Wand¹,²•Institutions (3)

University of Technology, Sydney¹, University of Melbourne², University of Illinois at Urbana–Champaign³

02 Nov 2021-Statistical Modelling

TL;DR: By systematically working through the two-level and then three-level cases, the article establishes a pattern that sheds light on what is required for even higher level group-specific curve models.

Abstract: A two-level group-specific curve model is such that the mean response of each member of a group is a separate smooth function of a predictor of interest. The three-level extension is such that one ...

7 citations

Journal Article•10.1177/1471082X211048660•

Robust clustering based on finite mixture of multivariate fragmental distributions

Mohsen Maleki, Geoffrey J. McLachlan, Sharon X. Lee

01 Jan 2021-Statistical Modelling

TL;DR: In this article, a flexible class of multivariate distributions called scale mixtures of fragmental normal (SMFN) distributions is introduced and its extension to the case of a finite mixture of SMFN distributions is also proposed.

Abstract: A flexible class of multivariate distributions called scale mixtures of fragmental normal (SMFN) distributions, is introduced. Its extension to the case of a finite mixture of SMFN (FM-SMFN) distributions is also proposed. The SMFN family of distributions is convenient and effective for modelling data with skewness, discrepant observations and population heterogeneity. It also possesses some other desirable properties, including an analytically tractable density and ease of computation for simulation and estimation of parameters. A stochastic representation of the SMFN distribution is given and then a hierarchical representation is described, the latter aids in parameter estimation, derivation of statistical properties and simulations. Maximum likelihood estimation of the FM-SMFN distribution via the expectation–maximization (EM) algorithm is outlined before the clustering performance of the proposed mixture model is illustrated using simulated and real datasets. In particular, the ability of FM-SMFN distributions to model data generated from various well-known families is demonstrated.

Journal Article•10.1177/1471082X20936017•

Poisson-Tweedie mixed-effects model: a flexible approach for the analysis of longitudinal RNA-seq data

Mirko Signorelli¹, Pietro Spitali¹, Roula Tsonaka¹•Institutions (1)

Leiden University Medical Center¹

01 Dec 2021-Statistical Modelling

TL;DR: A generalized linear mixed model based on the Poisson–Tweedie distribution that can flexibly handle each of the aforementioned features of longitudinal overdispersed counts is proposed.

Abstract: We present a new modelling approach for longitudinal overdispersed counts that is motivated by the increasing availability of longitudinal RNA-sequencing experiments. The distribution of RNA-seq co...

Journal Article•10.1177/1471082X211059233•

Bayesian analysis of two-part nonlinear latent variable model: Semiparametric method

Jian-Wei Gou¹, Ye-Mao Xia¹, De-Peng Jiang²•Institutions (2)

Nanjing Forestry University¹, University of Manitoba²

24 Nov 2021-Statistical Modelling

Journal Article•10.1177/1471082X211043946•

Outlier accommodation with semiparametric density processes: A study of Antarctic snow density modelling

Daniel Sheanshang¹, Philip A. White¹, Durban G. Keeler²•Institutions (2)

Brigham Young University¹, University of Utah²

29 Sep 2021-Statistical Modelling

TL;DR: A two-component mixture model is proposed, using a physically motivated snow density model and an outlier model, both of which evolve over depth; it outperforms alternatives and can be used for various inferential tasks.

Abstract: In many settings, data acquisition generates outliers that can obscure inference. Therefore, practitioners often either identify and remove outliers or accommodate outliers using robust models. How...

Journal Article•10.1177/1471082X19896867•

A general framework for prediction in penalized regression

Alba Carballo¹, María Durbán¹, Göran Kauermann², Dae-Jin Lee³•Institutions (3)

Charles III University of Madrid¹, Ludwig Maximilian University of Munich², Basque Center for Applied Mathematics³

01 Aug 2021-Statistical Modelling

TL;DR: There are two main approaches to carrying out prediction in the context of penalized regression: with low-rank basis and penalties, or through smooth mixed models.

Abstract: There are two main approaches to carrying out prediction in the context of penalized regression: with low-rank basis and penalties or through the smooth mixed models. In this article, we gi...

Journal Article•10.1177/1471082X20927114•

An alternative characterization of MAR in shared parameter models for incomplete longitudinal data and its utilization for sensitivity analysis

Grigorios Papageorgiou¹, Dimitris Rizopoulos¹•Institutions (1)

Erasmus University Rotterdam¹

01 Feb 2021-Statistical Modelling

TL;DR: This article proposes an alternative characterization of MAR by exploiting the conditional independence assumption, under which outcome and missingness are independent given a set of random effects and offers flexibility over the assumption for the missing data generating mechanism that governs dropout by allowing subject-specific perturbations of the censoring distribution.

Abstract: Dropout is a common complication in longitudinal studies, especially since the distinction between missing not at random (MNAR) and missing at random (MAR) dropout is intractable. Consequen...

Journal Article•10.1177/1471082X211022980•

Principal component regression in GAMLSS applied to Greek–German government bond yield spreads

Mikis D. Stasinopoulos¹, Robert A. Rigby¹, Nikolaos Georgikopoulos², Fernanda De Bastiani³•Institutions (3)

London Metropolitan University¹, New York University², Federal University of Pernambuco³

21 Jun 2021-Statistical Modelling

TL;DR: A solution to the problem of having to deal with a large number of interrelated explanatory variables within a generalized additive model for location, scale and shape (GAMLSS) is given in this article.

Abstract: A solution to the problem of having to deal with a large number of interrelated explanatory variables within a generalized additive model for location, scale and shape (GAMLSS) is given here using ...

Journal Article•10.1177/1471082X211036517•

Parametric estimation of non-crossing quantile functions

Gianluca Sottile¹, Paolo Frumento²•Institutions (2)

University of Palermo¹, University of Pisa²

01 Sep 2021-Statistical Modelling

TL;DR: Quantile regression (QR) has gained popularity during the last decades and is now considered a standard method by applied statisticians and practitioners in various fields; this article concerns the parametric estimation of non-crossing quantile functions.

Abstract: Quantile regression (QR) has gained popularity during the last decades, and is now considered a standard method by applied statisticians and practitioners in various fields. In this work, we applie...

Journal Article•10.1177/1471082X21993603•

Two-part quantile regression models for semi-continuous longitudinal data: A finite mixture approach

Luca Merlo¹, Antonello Maruotti²,³, Lea Petrella¹•Institutions (3)

Sapienza University of Rome¹, University of Bergen², Libera Università Maria SS. Assunta³

07 Apr 2021-Statistical Modelling

TL;DR: In this article, a two-part finite mixture quantile regression model for semi-continuous longitudinal data is proposed. The proposed methodology allows heterogeneity sources that influence the model for t...

Abstract: This article develops a two-part finite mixture quantile regression model for semi-continuous longitudinal data. The proposed methodology allows heterogeneity sources that influence the model for t...

Journal Article•10.1177/1471082X211008011•

Bayesian adjustment for measurement error in an offset variable in a Poisson regression model

Kangjie Zhang, Juxin Liu¹, Yang Liu, Peng Zhang², Raymond J. Carroll³,⁴•Institutions (4)

University of Saskatchewan¹, Zhejiang University², University of Technology, Sydney³, Texas A&M University⁴

24 May 2021-Statistical Modelling

TL;DR: The Graduated Driver Licensing programme is one effective policy for reducing the number of teen fatal car crashes in the USA.

Abstract: Fatal car crashes are the leading cause of death among teenagers in the USA. The Graduated Driver Licensing (GDL) programme is one effective policy for reducing the number of teen fatal car crashes...

Journal Article•10.1177/1471082X20943312•

Modelling multiple outcomes in repeated measures studies: Comparing aesthetic eyelid surgery techniques

Wagner Hugo Bonat, Ricardo Rasmussen Petterle, Priscilla Balbinot, Alexandre Elias Contin Mansur, Ruth Graf

01 Dec 2021-Statistical Modelling

TL;DR: The presented multivariate model provided a better fit than its univariate counterpart and showed that the three surgery techniques tend to increase all considered outcomes in a long-term perspective, that is, from preoperative to 10 years postoperative evaluations.

Abstract: We propose a multivariate regression model to deal with multiple outcomes along with repeated measures in the context of longitudinal data analysis. Our model allows for flexible and interpretable ...

Journal Article•10.1177/1471082X20967158•

Spatial survival modelling of business re-opening after Katrina: Survival modelling compared to spatial probit modelling of re-opening within 3, 6 or 12 months

Roger Bivand¹, Virgilio Gómez-Rubio²•Institutions (2)

Norwegian School of Economics¹, University of Castilla–La Mancha²

01 Feb 2021-Statistical Modelling

TL;DR: This article compares spatial survival modelling of business re-opening after Katrina with spatial probit modelling of re-opening within 3, 6 or 12 months.

Abstract: Zhou and Hanson (2015, Nonparametric Bayesian Inference in Biostatistics, pages 215–46. Cham: Springer; 2018, Journal of the American Statistical Association, 113,...

Journal Article•10.1177/1471082X211015452•

Alleviating confounding in spatio-temporal areal models with an application on crimes against women in India

Aritz Adin¹, Tomás Goicoa¹, James S. Hodges², Patrick Schnell³, María Dolores Ugarte¹•Institutions (3)

Universidad Pública de Navarra¹, University of Minnesota², Ohio State University³

31 May 2021-Statistical Modelling

TL;DR: In this paper, the authors assess associations between a response of interest and a set of covariates in spatial areal models, and the presence of spatially correlated random variables is considered.

Abstract: Assessing associations between a response of interest and a set of covariates in spatial areal models is the leitmotiv of ecological regression. However, the presence of spatially correlated random...

Journal Article•10.1177/1471082X20920972•

Bayesian variable selection and shrinkage strategies in a complicated modelling setting with missing data: A case study using multistate models

Lauren J. Beesley¹, Jeremy M. G. Taylor¹•Institutions (1)

University of Michigan¹

01 Feb 2021-Statistical Modelling

TL;DR: This article discusses how to modify and implement several existing Bayesian variable selection and shrinkage methods in a general multistate modelling setting, and compares the performance of these methods in terms of parameter estimation and model selection in a multistate cure model of recurrence and death in patients treated for head and neck cancer.

Abstract: Multistate modelling is a strategy for jointly modelling related time-to-event outcomes that can handle complicated outcome relationships, has appealing interpretations, can provide insight into di...

Journal Article•10.1177/1471082X19896688•

Bayesian joint analysis using a semiparametric latent variable model with non-ignorable missing covariates for CHNS data

Zhihua Ma¹, Guanghui Chen²•Institutions (2)

Shenzhen University¹, Jinan University²

01 Aug 2021-Statistical Modelling

TL;DR: A semiparametric latent variable model with a Dirichlet process (DP) mixtures prior on the latent variable is proposed, and a Bayesian index of local sensitivity to non-ignorability (ISNI) is extended to explore the local sensitivity of the parameters in the model.

Abstract: Motivated by the China Health and Nutrition Survey (CHNS) data, a semiparametric latent variable model with a Dirichlet process (DP) mixtures prior on the latent variable is proposed to joi...

Journal Article•10.1177/1471082X211015454•

Quantile regression for longitudinal data via the multivariate generalized hyperbolic distribution

Alvaro J. Flórez¹, Ingrid Van Keilegom², Geert Molenberghs², Anneleen Verhasselt³•Institutions (3)

University of Valle¹, Katholieke Universiteit Leuven², University of Hasselt³

07 Jun 2021-Statistical Modelling

TL;DR: While extensive research has been devoted to univariate quantile regression, this is considerably less the case for the multivariate (longitudinal) version; this article addresses quantile regression for longitudinal data via the multivariate generalized hyperbolic distribution.

Abstract: While extensive research has been devoted to univariate quantile regression, this is considerably less the case for the multivariate (longitudinal) version, even though there are many potential app...

Journal Article•10.1177/1471082X211037405•

Random effect models for multivariate mixed data: A Parafac-based finite mixture approach

Marco Alfò¹, Paolo Giordani¹•Institutions (1)

Sapienza University of Rome¹

01 Sep 2021-Statistical Modelling

TL;DR: In this article, a flexible regression model for multivariate mixed responses is discussed, where dependence between outcomes is introduced via the joint distribution of discrete outcome- and individual-specific random effects.

Abstract: We discuss a flexible regression model for multivariate mixed responses. Dependence between outcomes is introduced via the joint distribution of discrete outcome- and individual-specific random eff...

Journal Article•10.1177/1471082X211049278•

Bayesian clustered coefficients regression with auxiliary covariates assistant random effects

Guanyu Hu¹, Yishu Xue², Zhihua Ma³•Institutions (3)

University of Missouri¹, University of Connecticut², Shenzhen University³

28 Oct 2021-Statistical Modelling

TL;DR: In this paper, a mixture of finite mixtures (MFM) clustered regression model with auxiliary covariates that account for similarities in demographic or economic characteristics over a spatial domain is proposed.

Abstract: In regional economics research, a problem of interest is to detect similarities between regions, and estimate their shared coefficients in economics models. In this article, we propose a mixture of finite mixtures (MFM) clustered regression model with auxiliary covariates that account for similarities in demographic or economic characteristics over a spatial domain. Our Bayesian construction provides both inference for number of clusters and clustering configurations, and estimation for parameters for each cluster. Empirical performance of the proposed model is illustrated through simulation experiments, and further applied to a study of influential factors for monthly housing cost in Georgia.

Journal Article•10.1177/1471082X20981312•

Reflections on Murray Aitkin's contributions to nonparametric mixture models and Bayes factors

Alan Agresti¹, Francesco Bartolucci², Antonietta Mira³,⁴•Institutions (4)

University of Florida¹, University of Perugia², University of Lugano³, University of Insubria⁴

08 Feb 2021-Statistical Modelling

TL;DR: This article describes two interesting and innovative strands of Murray Aitkin's research publications, dealing with mixture models and with Bayesian inference.

Abstract: We describe two interesting and innovative strands of Murray Aitkin's research publications, dealing with mixture models and with Bayesian inference. Of his considerable publications on mixture mod...

Journal Article•10.1177/1471082X211020872•

A spatially explicit N-mixture model for the estimation of disease prevalence

Ben J Brintz¹, Lisa Madsen², Claudio Fuentes²•Institutions (2)

University of Utah¹, Oregon State University²

20 Jun 2021-Statistical Modelling

TL;DR: An approximate N-mixture model for infectious disease counts that accounts for under-reporting as well as spatial dependence induced by person-to-person spread of disease is developed.

Abstract: This article develops an approximate N-mixture model for infectious disease counts that accounts for under-reporting as well as spatial dependence induced by person-to-person spread of disease. We ...