Scoring rule

Papers published on a yearly basis

Papers

Journal Article • DOI: 10.1198/016214506000001437

Strictly Proper Scoring Rules, Prediction, and Estimation

Tilmann Gneiting, Adrian E. Raftery

01 Sep 2004-Journal of the American Statistical Association

TL;DR: The theory of proper scoring rules on general probability spaces is reviewed and developed, and the intuitively appealing interval score is proposed as a utility function in interval estimation that addresses width as well as coverage.

Abstract: Scoring rules assess the quality of probabilistic forecasts by assigning a numerical score based on the predictive distribution and on the event or value that materializes. A scoring rule is proper if the forecaster maximizes the expected score for an observation drawn from the distribution F if he or she issues the probabilistic forecast F, rather than G ≠ F. It is strictly proper if the maximum is unique. In prediction problems, proper scoring rules encourage the forecaster to make careful assessments and to be honest. In estimation problems, strictly proper scoring rules provide attractive loss and utility functions that can be tailored to the problem at hand. This article reviews and develops the theory of proper scoring rules on general probability spaces, and proposes and discusses examples thereof. Proper scoring rules derive from convex functions and relate to information measures, entropy functions, and Bregman divergences. In the case of categorical variables, we prove a rigorous version of the ...
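The definition of propriety in this abstract can be checked numerically for a binary event. The sketch below is illustrative only and is not taken from the paper: the function names, the grid search, and the true probability of 0.3 are assumptions. It compares two rules that are known to be strictly proper (the quadratic and logarithmic scores, written in positive orientation so that larger is better) with the linear score, which is not proper.

```python
import math

def quadratic_score(p, outcome):
    """Quadratic (Brier-type) score for a binary event, positively oriented.

    p is the forecast probability of the event; outcome is 1 if it occurred, else 0.
    """
    return -(outcome - p) ** 2

def log_score(p, outcome):
    """Logarithmic score: log of the probability assigned to what happened."""
    return math.log(p if outcome == 1 else 1.0 - p)

def linear_score(p, outcome):
    """Linear score: the probability assigned to what happened (not proper)."""
    return p if outcome == 1 else 1.0 - p

def expected_score(score, forecast_p, true_p):
    """Expected score when the event truly occurs with probability true_p."""
    return true_p * score(forecast_p, 1) + (1.0 - true_p) * score(forecast_p, 0)

def best_forecast(score, true_p, grid_size=999):
    """Grid-search the forecast probability that maximizes the expected score."""
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]  # 0.001 .. 0.999
    return max(grid, key=lambda p: expected_score(score, p, true_p))

true_p = 0.3  # hypothetical true event probability, chosen for illustration
for name, rule in [("quadratic", quadratic_score),
                   ("logarithmic", log_score),
                   ("linear", linear_score)]:
    print(name, best_forecast(rule, true_p))
```

Under these assumptions the search should settle on the true probability for the two proper rules, while the linear score rewards pushing the forecast to the edge of the grid, which is the kind of dishonest hedging that propriety rules out.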

5,875 citations
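
The interval score mentioned in the TL;DR of the entry above can be sketched as follows. This assumes the commonly cited form for a central (1 − α) prediction interval, written in negative orientation (smaller is better); negating it gives a utility. The function name and argument names are hypothetical.

```python
def interval_score(lower, upper, x, alpha):
    """Interval score for a central (1 - alpha) prediction interval [lower, upper].

    Negatively oriented: the score is the interval width plus a penalty of
    2 / alpha times the distance by which the observation x misses the interval.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    width = upper - lower
    below = (2.0 / alpha) * (lower - x) if x < lower else 0.0
    above = (2.0 / alpha) * (x - upper) if x > upper else 0.0
    return width + below + above

# A wide interval that covers the observation pays only for its width;
# a miss is penalized in proportion to its size and to 1 / alpha.
print(interval_score(0.0, 10.0, 5.0, alpha=0.1))
print(interval_score(0.0, 10.0, 12.0, alpha=0.1))
print(interval_score(4.0, 6.0, 5.0, alpha=0.1))
```

The width term rewards narrow intervals and the penalty terms reward coverage, which is how a single number can address both aspects.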

Journal Article • DOI: 10.1111/J.1467-9868.2007.00587.X

Probabilistic forecasts, calibration and sharpness

Tilmann Gneiting¹, Fadoua Balabdaoui², Adrian E. Raftery¹

University of Washington¹, University of Göttingen²

01 May 2005-Journal of The Royal Statistical Society Series B-statistical Methodology

TL;DR: A diagnostic approach to the evaluation of predictive performance is proposed, based on the paradigm of maximizing the sharpness of the predictive distributions subject to calibration, and is illustrated by an assessment and ranking of probabilistic forecasts of wind speed at the Stateline wind energy centre in the US Pacific Northwest.

Abstract: Probabilistic forecasts of continuous variables take the form of predictive densities or predictive cumulative distribution functions. We propose a diagnostic approach to the evaluation of predictive performance that is based on the paradigm of maximizing the sharpness of the predictive distributions subject to calibration. Calibration refers to the statistical consistency between the distributional forecasts and the observations and is a joint property of the predictions and the events that materialize. Sharpness refers to the concentration of the predictive distributions and is a property of the forecasts only. A simple theoretical framework allows us to distinguish between probabilistic calibration, exceedance calibration and marginal calibration. We propose and study tools for checking calibration and sharpness, among them the probability integral transform histogram, marginal calibration plots, the sharpness diagram and proper scoring rules. The diagnostic approach is illustrated by an assessment and ranking of probabilistic forecasts of wind speed at the Stateline wind energy centre in the US Pacific Northwest. In combination with cross-validation or in the time series context, our proposal provides very general, nonparametric alternatives to the use of information criteria for model diagnostics and model selection.
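The probability integral transform (PIT) histogram named in this abstract can be illustrated with a small simulation. The setup is an assumption made for illustration and is much simpler than the paper's wind speed study: observations are drawn from a standard normal distribution, one hypothetical forecaster always issues that same distribution, and another issues a normal distribution with twice the standard deviation.

```python
import random
from statistics import NormalDist, pvariance

def pit_values(observations, predictive):
    """Probability integral transform: predictive CDF evaluated at each observation."""
    return [predictive.cdf(y) for y in observations]

def pit_histogram(values, bins=10):
    """Counts of PIT values in equal-width bins on [0, 1]."""
    counts = [0] * bins
    for u in values:
        counts[min(int(u * bins), bins - 1)] += 1
    return counts

rng = random.Random(0)  # fixed seed so the sketch is reproducible
observations = [rng.gauss(0.0, 1.0) for _ in range(20000)]

calibrated = NormalDist(mu=0.0, sigma=1.0)     # matches the data-generating process
overdispersed = NormalDist(mu=0.0, sigma=2.0)  # too wide, hence less sharp

pit_calibrated = pit_values(observations, calibrated)
pit_overdispersed = pit_values(observations, overdispersed)

print("calibrated   ", pit_histogram(pit_calibrated))
print("overdispersed", pit_histogram(pit_overdispersed))
# For a uniform PIT the variance is 1/12; a smaller value suggests overdispersion.
print(pvariance(pit_calibrated), pvariance(pit_overdispersed))
```

A roughly flat histogram is consistent with probabilistic calibration, whereas a hump-shaped one indicates predictive distributions that are too wide. Sharpness is read off the forecasts alone, here simply the two values of `sigma`.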

1,950 citations

Journal Article • DOI: 10.1080/01621459.1971.10482346

Elicitation of Personal Probabilities and Expectations

Leonard J. Savage

01 Dec 1971-Journal of the American Statistical Association

TL;DR: Proper scoring rules, i.e., devices of a certain class for eliciting a person's probabilities and other expectations, are studied, mainly theoretically but with some speculations about application.

Abstract: Proper scoring rules, i.e., devices of a certain class for eliciting a person's probabilities and other expectations, are studied, mainly theoretically but with some speculations about application. The relation of proper scoring rules to other economic devices and to the foundations of the personalistic theory of probability is brought out. The implications of various restrictions, especially symmetry restrictions, on scoring rules are explored, usually with a minimum of regularity hypothesis.

1,296 citations

Journal Article • DOI: 10.1198/JASA.2011.R10138

Making and Evaluating Point Forecasts

Tilmann Gneiting

01 Jun 2011-Journal of the American Statistical Association

TL;DR: Point forecasting methods are typically compared by means of an error measure or scoring function, with the absolute error and the squared error being key examples. The author demonstrates that this common practice can lead to grossly misguided inferences unless the scoring function and the forecasting task are carefully matched.

Abstract: Typically, point forecasting methods are compared and assessed by means of an error measure or scoring function, with the absolute error and the squared error being key examples. The individual scores are averaged over forecast cases, to result in a summary measure of the predictive performance, such as the mean absolute error or the mean squared error. I demonstrate that this common practice can lead to grossly misguided inferences, unless the scoring function and the forecasting task are carefully matched. Effective point forecasting requires that the scoring function be specified ex ante, or that the forecaster receives a directive in the form of a statistical functional, such as the mean or a quantile of the predictive distribution. If the scoring function is specified ex ante, the forecaster can issue the optimal point forecast, namely, the Bayes rule. If the forecaster receives a directive in the form of a functional, it is critical that the scoring function be consistent for it, in the sense that t...
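The idea that a scoring function must match the requested functional can be seen in a short simulation. The sketch below relies on two standard facts: the squared error is consistent for the mean and the absolute error is consistent for the median. The skewed exponential sample, the candidate grid, and all function names are assumptions chosen for illustration.

```python
import random
from statistics import mean, median

def squared_error(forecast, observation):
    return (forecast - observation) ** 2

def absolute_error(forecast, observation):
    return abs(forecast - observation)

def mean_score(scoring_function, forecast, observations):
    """Average score of one fixed point forecast over all observations."""
    return sum(scoring_function(forecast, y) for y in observations) / len(observations)

def best_point_forecast(scoring_function, observations, candidates):
    """Candidate point forecast with the lowest mean score (lower is better)."""
    return min(candidates, key=lambda x: mean_score(scoring_function, x, observations))

rng = random.Random(1)  # fixed seed for reproducibility
observations = [rng.expovariate(1.0) for _ in range(2000)]  # skewed: mean and median differ
candidates = [i / 100 for i in range(301)]  # grid 0.00, 0.01, ..., 3.00

best_under_se = best_point_forecast(squared_error, observations, candidates)
best_under_ae = best_point_forecast(absolute_error, observations, candidates)

print("sample mean  ", mean(observations), "best under squared error ", best_under_se)
print("sample median", median(observations), "best under absolute error", best_under_ae)
```

Because the sample is skewed, the two scoring functions favor visibly different point forecasts. A forecaster asked for the mean but ranked by mean absolute error would therefore be rewarded for reporting something closer to the median.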

1,213 citations

Journal Article • DOI: 10.2307/2555752

Design competition through multidimensional auctions

Yeon-Koo Che

24 Jan 1993-The RAND Journal of Economics

TL;DR: In this article, the author develops a model of two-dimensional auctions, where firms bid on both price and quality, and bids are evaluated by a scoring rule designed by a buyer.

Abstract: This article studies design competition in government procurement by developing a model of two-dimensional auctions, where firms bid on both price and quality, and bids are evaluated by a scoring rule designed by a buyer. Three auction schemes (first score, second score, and second preferred offer) are introduced and related to actual practices. If the buyer can commit to a scoring rule in his best interest, the resulting optimal scoring rule underrewards quality relative to the buyer's utility function and implements the optimal outcome for the buyer under first- and second-score auctions. Absent the commitment power, the only feasible scoring rule is the buyer's utility function, under which (1) all three schemes yield the same expected utility to the buyer, and (2) first- and second-score auctions induce the first-best level of quality, which turns out to be excessive from the buyer's point of view.
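How a scoring rule turns two-dimensional bids into a ranking can be sketched in a few lines. This is a toy illustration and not the paper's model: the quasi-linear score (value of quality minus price), the square-root value function, the three bids, and the way the second-score payment is settled (the winner keeps its offered quality and the price is raised until its score equals the best rejected score) are all assumptions.

```python
import math
from dataclasses import dataclass

@dataclass
class Bid:
    firm: str
    price: float
    quality: float

def score(bid, quality_value):
    """Quasi-linear scoring rule: value assigned to quality minus price."""
    return quality_value(bid.quality) - bid.price

def run_auction(bids, quality_value, scheme="first"):
    """Return (winning firm, delivered quality, payment) under the given scheme."""
    if len(bids) < 2:
        raise ValueError("need at least two bids")
    ranked = sorted(bids, key=lambda b: score(b, quality_value), reverse=True)
    winner, runner_up = ranked[0], ranked[1]
    if scheme == "first":
        # First-score: the winner delivers exactly what it bid.
        payment = winner.price
    elif scheme == "second":
        # Second-score (simplified): the winner only has to match the best
        # rejected score; here it keeps its quality and the price adjusts.
        payment = quality_value(winner.quality) - score(runner_up, quality_value)
    else:
        raise ValueError("scheme must be 'first' or 'second'")
    return winner.firm, winner.quality, payment

def hypothetical_value(quality):
    """Illustrative concave value of quality; not taken from the paper."""
    return 10.0 * math.sqrt(quality)

bids = [Bid("A", 12.0, 4.0), Bid("B", 20.0, 9.0), Bid("C", 5.0, 1.0)]
print(run_auction(bids, hypothetical_value, "first"))
print(run_auction(bids, hypothetical_value, "second"))
```

In this toy setting the highest-scoring firm wins under both schemes, and only the payment differs. The abstract's point about commitment concerns which scoring rule the buyer should announce, which this sketch does not attempt to optimize.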

1,028 citations

Year	Papers
2025	2
2024	5
2023	24
2022	35
2021	48
2020	24
