TL;DR: The multivariate contaminated normal (MCN) distribution is a simple heavy-tailed generalization of the multivariate normal (MN) distribution for modelling elliptically contoured scatters.
Abstract: The multivariate contaminated normal (MCN) distribution represents a simple heavy-tailed generalization of the multivariate normal (MN) distribution to model elliptical contoured scatters i...
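For orientation, the MCN density is usually written as a two-component mixture with a shared mean, f(x) = α φ(x; μ, Σ) + (1 − α) φ(x; μ, ηΣ), where α is the proportion of typical points and η > 1 inflates the covariance for the remainder. The sketch below evaluates that form in the bivariate case only. It assumes this common parameterization, since the truncated abstract does not show the one the article uses, and the function names are hypothetical.

```python
import math


def bivariate_normal_pdf(x, mu, sigma):
    """Density of a bivariate normal with mean mu and 2x2 covariance sigma."""
    dx0, dx1 = x[0] - mu[0], x[1] - mu[1]
    (a, b), (c, d) = sigma
    det = a * d - b * c
    # Quadratic form dx' * inverse(sigma) * dx, using the closed-form 2x2 inverse.
    quad = (d * dx0 * dx0 - (b + c) * dx0 * dx1 + a * dx1 * dx1) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


def mcn_pdf(x, mu, sigma, alpha, eta):
    """Bivariate contaminated normal density (assumed parameterization).

    alpha: mixing weight of the uncontaminated component, 0 < alpha < 1.
    eta:   covariance inflation factor of the contaminated component, eta > 1.
    """
    inflated = [[eta * v for v in row] for row in sigma]
    return (alpha * bivariate_normal_pdf(x, mu, sigma)
            + (1.0 - alpha) * bivariate_normal_pdf(x, mu, inflated))


identity = [[1.0, 0.0], [0.0, 1.0]]
density_at_mean = mcn_pdf((0.0, 0.0), (0.0, 0.0), identity, alpha=0.9, eta=4.0)
```

Because both components share μ and differ only in scale, the contours stay elliptical while the inflated component adds mass in the tails.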
TL;DR: This article compares the performance of team ratings and individual player ratings in forecasting match outcomes in association football, including the well-known Elo rating.
Abstract: The main goal of this article is to compare the performance of team ratings and individual player ratings when trying to forecast match outcomes in association football. The well-known Elo rating s...
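As background on the Elo rating mentioned here, the standard update can be sketched in a few lines. This is the textbook form, not necessarily the variant used in the article, whose abstract is truncated. The scale of 400 and the K-factor of 20 are illustrative assumptions, and the function names are hypothetical.

```python
def elo_expected(rating_a, rating_b, scale=400.0):
    """Expected score of side A against side B under the standard Elo formula."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def elo_update(rating_a, rating_b, score_a, k=20.0):
    """Return updated ratings for A and B after one match.

    score_a is 1.0 for a win by A, 0.5 for a draw, and 0.0 for a loss.
    k (assumed value) controls how strongly one result moves the ratings.
    """
    expected_a = elo_expected(rating_a, rating_b)
    delta = k * (score_a - expected_a)
    # The update is zero-sum: what A gains, B loses.
    return rating_a + delta, rating_b - delta


new_home, new_away = elo_update(1500.0, 1500.0, score_a=1.0)
```

The same update applies to teams or to individual players; only the unit that carries the rating changes.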
TL;DR: This thesis proposes the use of sequential Monte Carlo methods for static parameter joint models in order to update the posterior distribution of the parameters, hyperparameters, and random effects with the intention of reducing computation time in each update of the inferential process.
Abstract: The statistical analysis of the information generated by medical follow-up is a very important challenge in the field of personalized medicine. As the evolutionary course of a patient's disease pro...
TL;DR: A joint model for simultaneously analysing longitudinal and time-to-event data in the presence of multiple causes of failure is proposed within the Bayesian framework, together with an efficient Markov chain Monte Carlo sampling algorithm that samples from the posterior distribution via a novel reparameterization technique.
Abstract: This research is motivated by data from the large Selenium and Vitamin E Cancer Prevention Trial (SELECT). Prostate-specific antigen (PSA) levels were collected longitudinally, and the survival endpoint was the time to low-grade cancer or the time to high-grade cancer (competing risks). In this article, the goal is to model the longitudinal PSA data and the time to prostate cancer (PC) of low or high grade. We consider low-grade and high-grade cancer as two competing causes of developing PC. A joint model for simultaneously analysing longitudinal and time-to-event data in the presence of multiple causes of failure (or competing risks) is proposed within the Bayesian framework. The proposed model allows for handling the missing causes of failure in the SELECT data and for implementing an efficient Markov chain Monte Carlo sampling algorithm to sample from the posterior distribution via a novel reparameterization technique. Bayesian criteria, ΔDICSurv and ΔWAICSurv, are introduced to quantify the gain in fit in the survival sub-model due to the inclusion of longitudinal data. A simulation study is conducted to examine the empirical performance of the posterior estimates as well as ΔDICSurv and ΔWAICSurv, and a detailed analysis of the SELECT data is also carried out to further demonstrate the proposed methodology.
TL;DR: The model is fitted via gradient boosting, which offers inherent model selection and is shown to be suitable for both complex model structures and highly auto-correlated response curves; it makes it possible to analyse bacterial growth in Escherichia coli in a complex interaction scenario, fruitfully extending usual growth models.
Abstract: We extend generalized additive models for location, scale and shape (GAMLSS) to regression with functional response. This allows us to simultaneously model point-wise mean curves, variances...
TL;DR: A newly emerging field in statistics is distributional regression, where not only the mean but each parameter of a parametric response distribution can be modelled using a set of predictors.
Abstract: A newly emerging field in statistics is distributional regression, where not only the mean but each parameter of a parametric response distribution can be modelled using a set of predictors. As an ...
TL;DR: This work investigates the potential occurrence of a 'hot shoe' effect for the performance of penalty takers in football based on data from the German Bundesliga, and considers hidden Markov models (HMMs) to model the (latent) forms of players.
Abstract: We propose a penalized likelihood approach in hidden Markov models (HMMs) to perform automated variable selection. To account for a potential large number of covariates, which also may be substanti...
TL;DR: Canonical correlation analysis (CCA) is a technique for measuring the association between two multivariate data matrices; this article concerns a regularized modification of CCA (RCCA).
Abstract: Canonical correlation analysis (CCA) is a technique for measuring the association between two multivariate data matrices. A regularized modification of canonical correlation analysis (RCCA) which i...
TL;DR: In this paper, the authors developed a quantile foliation model to predict outcomes for one explanatory variable based on two covariates and varying quantiles, which is an extension of quantile sheets.
Abstract: In this work, we develop ‘quantile foliation’ to predict outcomes for one explanatory variable based on two covariates and varying quantiles. This is an extension of quantile sheets. Data from World...
TL;DR: The pattern established by the systematic approach sheds light on what is required for even higher level group-specific curve models by systematically working through two-level and then three-level cases.
Abstract: A two-level group-specific curve model is such that the mean response of each member of a group is a separate smooth function of a predictor of interest. The three-level extension is such that one ...
TL;DR: In this article, a flexible class of multivariate distributions called scale mixtures of fragmental normal (SMFN) distributions is introduced and its extension to the case of a finite mixture of SMFN distributions is also proposed.
Abstract: A flexible class of multivariate distributions called scale mixtures of fragmental normal (SMFN) distributions is introduced. Its extension to the case of a finite mixture of SMFN (FM-SMFN) distributions is also proposed. The SMFN family of distributions is convenient and effective for modelling data with skewness, discrepant observations and population heterogeneity. It also possesses some other desirable properties, including an analytically tractable density and ease of computation for simulation and estimation of parameters. A stochastic representation of the SMFN distribution is given and then a hierarchical representation is described; the latter aids in parameter estimation, derivation of statistical properties and simulations. Maximum likelihood estimation of the FM-SMFN distribution via the expectation–maximization (EM) algorithm is outlined before the clustering performance of the proposed mixture model is illustrated using simulated and real datasets. In particular, the ability of FM-SMFN distributions to model data generated from various well-known families is demonstrated.
TL;DR: A generalized linear mixed model based on the Poisson–Tweedie distribution that can flexibly handle each of the aforementioned features of longitudinal overdispersed counts is proposed.
Abstract: We present a new modelling approach for longitudinal overdispersed counts that is motivated by the increasing availability of longitudinal RNA-sequencing experiments. The distribution of RNA-seq co...
TL;DR: A two-component mixture model using a physically motivated snow density model and an outlier model, both of which evolve over depth, which outperforms alternatives and can be used for various inferential tasks.
Abstract: In many settings, data acquisition generates outliers that can obscure inference. Therefore, practitioners often either identify and remove outliers or accommodate outliers using robust models. How...
TL;DR: This article considers the two main approaches to carrying out prediction in the context of penalized regression: with low-rank basis and penalties, or through the smooth mixed models.
Abstract: There are two main approaches to carrying out prediction in the context of penalized regression: with low-rank basis and penalties or through the smooth mixed models. In this article, we gi
TL;DR: This article proposes an alternative characterization of MAR by exploiting the conditional independence assumption, under which outcome and missingness are independent given a set of random effects and offers flexibility over the assumption for the missing data generating mechanism that governs dropout by allowing subject-specific perturbations of the censoring distribution.
Abstract: Dropout is a common complication in longitudinal studies, especially since the distinction between missing not at random (MNAR) and missing at random (MAR) dropout is intractable. Consequen...
TL;DR: A solution to the problem of having to deal with a large number of interrelated explanatory variables within a generalized additive model for location, scale and shape (GAMLSS) is given in this article.
Abstract: A solution to the problem of having to deal with a large number of interrelated explanatory variables within a generalized additive model for location, scale and shape (GAMLSS) is given here using ...
TL;DR: Quantile regression (QR), which has gained popularity during the last decades and is now considered a standard method by applied statisticians and practitioners in various fields, is applied in this work.
Abstract: Quantile regression (QR) has gained popularity during the last decades, and is now considered a standard method by applied statisticians and practitioners in various fields. In this work, we applie...
TL;DR: In this article, a two-part finite mixture quantile regression model for semi-continuous longitudinal data is proposed. The proposed methodology allows heterogeneity sources that influence the model for t...
Abstract: This article develops a two-part finite mixture quantile regression model for semi-continuous longitudinal data. The proposed methodology allows heterogeneity sources that influence the model for t...
TL;DR: The Graduated Driver Licensing programme is one effective policy for reducing the number of teen fatal car crashes in the USA.
Abstract: Fatal car crashes are the leading cause of death among teenagers in the USA. The Graduated Driver Licensing (GDL) programme is one effective policy for reducing the number of teen fatal car crashes...
TL;DR: The presented multivariate model provided a better fit than its univariate counterpart and showed that the three surgery techniques tend to increase all considered outcomes in a long-term perspective, that is, from preoperative to 10 years postoperative evaluations.
Abstract: We propose a multivariate regression model to deal with multiple outcomes along with repeated measures in the context of longitudinal data analysis. Our model allows for flexible and interpretable ...
TL;DR: This article refers to work by Zhou and Hanson published in Nonparametric Bayesian Inference in Biostatistics (2015) and in the Journal of the American Statistical Association (2018).
Abstract: Zhou and Hanson (2015, Nonparametric Bayesian Inference in Biostatistics, pages 215–46. Cham: Springer; 2018, Journal of the American Statistical Association, 113,...
TL;DR: In this paper, the authors assess associations between a response of interest and a set of covariates in spatial areal models, and the presence of spatially correlated random variables is considered.
Abstract: Assessing associations between a response of interest and a set of covariates in spatial areal models is the leitmotiv of ecological regression. However, the presence of spatially correlated random...
TL;DR: This article discusses how to modify and implement several existing Bayesian variable selection and shrinkage methods in a general multistate modelling setting, and compares the performance of these methods in terms of parameter estimation and model selection in a multistate cure model of recurrence and death in patients treated for head and neck cancer.
Abstract: Multistate modelling is a strategy for jointly modelling related time-to-event outcomes that can handle complicated outcome relationships, has appealing interpretations, can provide insight into di...
TL;DR: A semiparametric latent variable model with a Dirichlet process (DP) mixtures prior on the latent variable is proposed, and a Bayesian index of local sensitivity to non-ignorability (ISNI) is extended to explore the local sensitivity of the parameters in the model.
Abstract: Motivated by the China Health and Nutrition Survey (CHNS) data, a semiparametric latent variable model with a Dirichlet process (DP) mixtures prior on the latent variable is proposed to joi...
TL;DR: This article focuses on multivariate (longitudinal) quantile regression, which has received considerably less research attention than univariate quantile regression despite its many potential applications.
Abstract: While extensive research has been devoted to univariate quantile regression, this is considerably less the case for the multivariate (longitudinal) version, even though there are many potential app...
TL;DR: In this article, a flexible regression model for multivariate mixed responses is discussed, where dependencies between outcomes are introduced via the joint distribution of discrete outcome and individual-specific random variables.
Abstract: We discuss a flexible regression model for multivariate mixed responses. Dependence between outcomes is introduced via the joint distribution of discrete outcome- and individual-specific random eff...
TL;DR: In this paper, a mixture of finite mixtures (MFM) clustered regression model with auxiliary covariates that account for similarities in demographic or economic characteristics over a spatial domain is proposed.
Abstract: In regional economics research, a problem of interest is to detect similarities between regions and to estimate their shared coefficients in economic models. In this article, we propose a mixture of finite mixtures (MFM) clustered regression model with auxiliary covariates that account for similarities in demographic or economic characteristics over a spatial domain. Our Bayesian construction provides both inference for the number of clusters and the clustering configurations, and estimation of the parameters for each cluster. The empirical performance of the proposed model is illustrated through simulation experiments, and the model is further applied to a study of influential factors for monthly housing cost in Georgia.
TL;DR: This article describes two interesting and innovative strands of Murray Aitkin's research publications, dealing with mixture models and with Bayesian inference.
Abstract: We describe two interesting and innovative strands of Murray Aitkin's research publications, dealing with mixture models and with Bayesian inference. Of his considerable publications on mixture mod...
TL;DR: An approximate N-mixture model for infectious disease counts that accounts for under-reporting as well as spatial dependence induced by person-to-person spread of disease is developed.
Abstract: This article develops an approximate N-mixture model for infectious disease counts that accounts for under-reporting as well as spatial dependence induced by person-to-person spread of disease. We ...