Separation (statistics)

Papers published on a yearly basis

Papers

Journal Article•10.1002/SIM.1047•

A solution to the problem of separation in logistic regression

Georg Heinze¹, Michael Schemper¹•Institutions (1)

University of Vienna¹

30 Aug 2002-Statistics in Medicine

TL;DR: A procedure by Firth originally developed to reduce the bias of maximum likelihood estimates is shown to provide an ideal solution to separation and produces finite parameter estimates by means of penalized maximum likelihood estimation.

Abstract: The phenomenon of separation or monotone likelihood is observed in the fitting process of a logistic model if the likelihood converges while at least one parameter estimate diverges to +/- infinity. Separation primarily occurs in small samples with several unbalanced and highly predictive risk factors. A procedure by Firth originally developed to reduce the bias of maximum likelihood estimates is shown to provide an ideal solution to separation. It produces finite parameter estimates by means of penalized maximum likelihood estimation. Corresponding Wald tests and confidence intervals are available but it is shown that penalized likelihood ratio tests and profile penalized likelihood confidence intervals are often preferable. The clear advantage of the procedure over previous options of analysis is impressively demonstrated by the statistical analysis of two cancer studies.

2,026 citations
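
The penalized estimation described in the abstract above can be illustrated with a minimal sketch. It assumes a model with an intercept and a single covariate, and it uses the Firth-modified score, sum over i of (y_i - p_i + h_i (1/2 - p_i)) x_i, where h_i is the i-th diagonal element of the hat matrix, inside a Fisher scoring loop. The function name `firth_logistic`, the step cap, and the toy data are assumptions made for illustration and are not taken from the paper.

```python
import math

def firth_logistic(x, y, max_iter=100, tol=1e-10, max_step=5.0):
    """Firth-penalized logistic regression for an intercept plus one covariate.

    Illustrative sketch only: Fisher scoring on the Firth-modified score.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        w = [pi * (1.0 - pi) for pi in p]
        # Fisher information X'WX for the 2x2 case: [[a, b], [b, d]]
        a = sum(w)
        b = sum(wi * xi for wi, xi in zip(w, x))
        d = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = a * d - b * b
        # Diagonal of the hat matrix W^(1/2) X (X'WX)^(-1) X' W^(1/2)
        h = [wi * (d - 2.0 * b * xi + a * xi * xi) / det for wi, xi in zip(w, x)]
        # Firth-modified residuals and score vector
        r = [yi - pi + hi * (0.5 - pi) for yi, pi, hi in zip(y, p, h)]
        u0 = sum(r)
        u1 = sum(ri * xi for ri, xi in zip(r, x))
        # Scoring step: (X'WX)^(-1) times the modified score
        s0 = (d * u0 - b * u1) / det
        s1 = (a * u1 - b * u0) / det
        # Cap the step size to guard against overshooting (an assumption)
        biggest = max(abs(s0), abs(s1))
        if biggest > max_step:
            s0, s1 = s0 * max_step / biggest, s1 * max_step / biggest
        b0, b1 = b0 + s0, b1 + s1
        if max(abs(s0), abs(s1)) < tol:
            break
    return b0, b1

# Completely separated toy data: ordinary maximum likelihood would diverge here.
x = [1, 2, 3, 4, 5, 6]
y = [0, 0, 0, 1, 1, 1]
b0, b1 = firth_logistic(x, y)
print(b0, b1)
```

On this toy data the outcome is perfectly predicted by whether x exceeds 3.5, yet the penalized estimates stay finite. Confidence intervals, which the abstract suggests are better based on the profile penalized likelihood, are not covered by this sketch.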

Journal Article•10.1214/08-AOAS191•

A weakly informative default prior distribution for logistic and other regression models

Andrew Gelman¹, Aleks Jakulin, Maria Grazia Pittau, Yu-Sung Su•Institutions (1)

Columbia University¹

01 Dec 2008-The Annals of Applied Statistics

TL;DR: The authors propose a default prior distribution for logistic regression models, constructed by first scaling all nonbinary variables to have mean 0 and standard deviation 0.5 and then placing independent Student-t prior distributions on the coefficients, with a Cauchy distribution (center 0, scale 2.5) as the recommended default.

Abstract: We propose a new prior distribution for classical (nonhierarchical) logistic regression models, constructed by first scaling all nonbinary variables to have mean 0 and standard deviation 0.5, and then placing independent Student-t prior distributions on the coefficients. As a default choice, we recommend the Cauchy distribution with center 0 and scale 2.5, which in the simplest setting is a longer-tailed version of the distribution attained by assuming one-half additional success and one-half additional failure in a logistic regression. Cross-validation on a corpus of datasets shows the Cauchy class of prior distributions to outperform existing implementations of Gaussian and Laplace priors. We recommend this prior distribution as a default choice for routine applied use. It has the advantage of always giving answers, even when there is complete separation in logistic regression (a common problem, even when the sample size is large and the number of predictors is small), and also automatically applying more shrinkage to higher-order interactions. This can be useful in routine data analysis as well as in automated procedures such as chained equations for missing-data imputation. We implement a procedure to fit generalized linear models in R with the Student-t prior distribution by incorporating an approximate EM algorithm into the usual iteratively weighted least squares. We illustrate with several applications, including a series of logistic regressions predicting voting preferences, a small bioassay experiment, and an imputation model for a public health data set.

1,809 citations
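
The two ingredients named in the abstract above, rescaling inputs to mean 0 and standard deviation 0.5 and placing a Cauchy prior with center 0 and scale 2.5 on a coefficient, can be sketched as follows. This is a rough illustration rather than the authors' R implementation: the helper names are hypothetical, the sample standard deviation is an assumed choice, and for brevity the prior is applied to the slope only while the intercept is held at 0.

```python
import math
import statistics

def rescale(values):
    """Center to mean 0 and scale to standard deviation 0.5."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD; an assumption of this sketch
    return [(v - mean) / (2.0 * sd) for v in values]

def log_cauchy(beta, scale=2.5):
    """Log density of a Cauchy distribution with center 0 at beta."""
    return -math.log(math.pi * scale * (1.0 + (beta / scale) ** 2))

def log_posterior(slope, z, y, intercept=0.0):
    """Unnormalized log posterior: Bernoulli log-likelihood plus Cauchy log prior."""
    total = log_cauchy(slope)
    for zi, yi in zip(z, y):
        eta = intercept + slope * zi
        t = -eta if yi == 1 else eta  # each term contributes -log(1 + exp(t))
        total -= max(t, 0.0) + math.log1p(math.exp(-abs(t)))  # stable softplus
    return total

# Completely separated toy data: the likelihood alone keeps rising with the slope,
# but the prior eventually pulls the log posterior back down.
x = [1, 2, 3, 4, 5, 6]
y = [0, 0, 0, 1, 1, 1]
z = rescale(x)
for slope in (1.0, 10.0, 100.0, 1000.0):
    print(slope, log_posterior(slope, z, y))
```

Because the log posterior falls again for very large slopes, its maximum lies at a finite value, which is the sense in which such a prior "always gives answers" under separation.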

Journal Article•10.1214/08-AOAS191•

A weakly informative default prior distribution for logistic and other regression models

Andrew Gelman¹, Aleks Jakulin, Maria Grazia Pittau, Yu-Sung Su•Institutions (1)

Columbia University¹

26 Jan 2009-arXiv: Applications

TL;DR: In this article, the authors propose a new prior distribution for classical logistic regression models, constructed by first scaling all nonbinary variables to have mean 0 and standard deviation 0.5, and then placing independent Student-$t$ prior distributions on the coefficients.

Abstract: We propose a new prior distribution for classical (nonhierarchical) logistic regression models, constructed by first scaling all nonbinary variables to have mean 0 and standard deviation 0.5, and then placing independent Student-$t$ prior distributions on the coefficients. As a default choice, we recommend the Cauchy distribution with center 0 and scale 2.5, which in the simplest setting is a longer-tailed version of the distribution attained by assuming one-half additional success and one-half additional failure in a logistic regression. Cross-validation on a corpus of datasets shows the Cauchy class of prior distributions to outperform existing implementations of Gaussian and Laplace priors. We recommend this prior distribution as a default choice for routine applied use. It has the advantage of always giving answers, even when there is complete separation in logistic regression (a common problem, even when the sample size is large and the number of predictors is small), and also automatically applying more shrinkage to higher-order interactions. This can be useful in routine data analysis as well as in automated procedures such as chained equations for missing-data imputation. We implement a procedure to fit generalized linear models in R with the Student-$t$ prior distribution by incorporating an approximate EM algorithm into the usual iteratively weighted least squares. We illustrate with several applications, including a series of logistic regressions predicting voting preferences, a small bioassay experiment, and an imputation model for a public health data set.

1,465 citations

Journal Article•10.1093/BIOMET/71.1.1•

On the existence of maximum likelihood estimates in logistic regression models

Adelin Albert¹, J. A. Anderson¹•Institutions (1)

University of Liège¹

01 Apr 1984-Biometrika

TL;DR: For multinomial logistic regression models, this article proved existence theorems by considering the possible patterns of data points, which fall into three mutually exclusive and exhaustive categories: complete separation, quasicomplete separation and overlap.

Abstract: The problems of existence, uniqueness and location of maximum likelihood estimates in log linear models have received special attention in the literature (Haberman, 1974, Chapter 2; Wedderburn, 1976; Silvapulle, 1981). For multinomial logistic regression models, we prove existence theorems by considering the possible patterns of data points, which fall into three mutually exclusive and exhaustive categories: complete separation, quasicomplete separation and overlap. Our results suggest general rules for identifying infinite parameter estimates in log linear models for frequency tables.

1,160 citations
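
The three categories named in the abstract above can be illustrated for the simplest special case: a binary outcome and a single non-constant predictor, where a separating "hyperplane" is just a threshold on x. The paper treats the general multinomial case, so the function below (a hypothetical helper, `classify_separation`) is only a sketch of the idea under that restriction.

```python
def classify_separation(x, y):
    """Classify a one-predictor, binary-outcome data set.

    Returns "complete", "quasicomplete", or "overlap".
    """
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    if not x0 or not x1:
        raise ValueError("both outcome classes must be present")
    if len(set(x)) < 2:
        raise ValueError("the predictor must not be constant")
    # Complete separation: a threshold splits the classes with no ties.
    if max(x0) < min(x1) or max(x1) < min(x0):
        return "complete"
    # Quasicomplete separation: the classes touch only at the threshold value.
    if max(x0) <= min(x1) or max(x1) <= min(x0):
        return "quasicomplete"
    # Otherwise the classes overlap.
    return "overlap"

print(classify_separation([1, 2, 3, 4], [0, 0, 1, 1]))
print(classify_separation([1, 2, 2, 3], [0, 0, 1, 1]))
print(classify_separation([1, 3, 2, 4], [0, 0, 1, 1]))
```

With several predictors the same question becomes whether some linear combination of the covariates separates the outcome classes, which generally requires a linear programming check rather than a comparison of minima and maxima.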

The Locator/ID Separation Protocol (LISP)

Darrel Lewis, Vince Fuller, Dino Farinacci, Dave Meyer

1 Jan 2013

TL;DR: The Locator/ID Separation Protocol (LISP) is a network-layer-based protocol that enables the separation of IP addresses into two new numbering spaces: Endpoint Identifiers (EIDs) and Routing Locators (RLOCs).

Abstract: This document describes a network-layer-based protocol that enables separation of IP addresses into two new numbering spaces: Endpoint Identifiers (EIDs) and Routing Locators (RLOCs). No changes are required to either host protocol stacks or to the "core" of the Internet infrastructure. The Locator/ID Separation Protocol (LISP) can be incrementally deployed, without a "flag day", and offers Traffic Engineering, multihoming, and mobility benefits to early adopters, even when there are relatively few LISP-capable sites. Design and development of LISP was largely motivated by the problem statement produced by the October 2006 IAB Routing and Addressing Workshop. This document defines an Experimental Protocol for the Internet community.

607 citations
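
The split between EIDs and RLOCs described in the abstract above can be pictured as a lookup from an endpoint address to the locators of the site that hosts it. The sketch below is a toy model, not the protocol's wire format or control plane: the table `MAP_CACHE`, the function `lookup_rlocs`, and all addresses (taken from documentation-reserved ranges) are assumptions for illustration.

```python
import ipaddress

# Hypothetical EID-prefix to RLOC mappings.
MAP_CACHE = {
    ipaddress.ip_network("192.0.2.0/24"): ["198.51.100.1"],
    ipaddress.ip_network("192.0.2.128/25"): ["203.0.113.7"],
}

def lookup_rlocs(eid):
    """Return the RLOCs for the longest EID prefix containing `eid`, or None."""
    addr = ipaddress.ip_address(eid)
    matches = [net for net in MAP_CACHE if addr in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return MAP_CACHE[best]

print(lookup_rlocs("192.0.2.5"))
print(lookup_rlocs("192.0.2.200"))
print(lookup_rlocs("10.0.0.1"))
```

The endpoint keeps its EID when a site changes providers, and only the mapping to RLOCs is updated.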

Year	Papers
2026	2
2025	1,696
2024	2,825
2023	1,458
2022	1,791
2021	127
