TL;DR: In the sequel, measure-theoretic details are generally downplayed or ignored in proofs, but the presentation is detailed enough that anyone with a good background in probability should be able to fill in any missing details.
Abstract: Much of the theory of statistical inference can be appreciated without a detailed understanding of probability or measure theory. This book does not treat these topics with rigor. But some basic knowledge of them is quite useful. Much of the literature in statistics uses measure theory and is inaccessible to anyone unfamiliar with the basic notation. Also, the notation of measure theory allows one to merge results for discrete and continuous random variables. In addition, the notation can handle interesting and important applications involving censoring or truncation in which a random variable of interest is neither discrete nor continuous. Finally, the language of measure theory is necessary for stating many results correctly. In the sequel, measure-theoretic details are generally downplayed or ignored in proofs, but the presentation is detailed enough that anyone with a good background in probability should be able to fill in any missing details.
TL;DR: This paper clarifies the mathematical structure of the measure space defined by a model's basic random variables and its relationship to the underlying spaces associated with each of those variables.
Abstract: The basic random variables on which the random uncertainties in a given model depend can be viewed as defining a measure space with respect to which the solution to the mathematical problem can be defined. This measure space is defined on a product measure associated with the collection of basic random variables. This paper clarifies the mathematical structure of this space and its relationship to the underlying spaces associated with each of the random variables. Cases of both dependent and independent basic random variables are addressed. Bases on the product space are developed that can be viewed as generalizations of the standard polynomial chaos approximation. Moreover, two numerical constructions of approximations in this space are presented along with the associated convergence analysis.
TL;DR: In this paper, an empirical measure of the approximate accuracy of feature importance estimates in deep neural networks is proposed, and it is shown that only certain ensemble-based approaches (VarGrad and SmoothGrad-Squared) outperform a random assignment of importance.
Abstract: We propose an empirical measure of the approximate accuracy of feature importance estimates in deep neural networks. Our results across several large-scale image classification datasets show that many popular interpretability methods produce estimates of feature importance that are not better than a random designation of feature importance. Only certain ensemble-based approaches (VarGrad and SmoothGrad-Squared) outperform such a random assignment of importance. The manner of ensembling remains critical: we show that some approaches do no better than the underlying method but carry a far higher computational burden.
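As a rough illustration of what "ensembling" a gradient-based importance estimate can look like, the sketch below computes three ensemble variants on a toy model. It assumes the commonly used definitions (SmoothGrad as the mean of gradients over noisy copies of the input, SmoothGrad-Squared as the mean of squared gradients, VarGrad as their variance); the abstract itself does not spell these out. The toy model `f(x) = tanh(W . x)`, the weights `W`, and the function names are hypothetical, and this is not the authors' benchmark or code.

```python
import math
import random

# Hypothetical weights of a toy model f(x) = tanh(W . x); the third feature is unused.
W = [2.0, -1.0, 0.0]


def grad(x):
    """Analytic gradient of f(x) = tanh(W . x) with respect to the input x."""
    z = sum(w * xi for w, xi in zip(W, x))
    scale = 1.0 - math.tanh(z) ** 2
    return [scale * w for w in W]


def ensemble_importance(x, n_samples=50, sigma=0.15, seed=0):
    """Ensemble gradient estimates over Gaussian-perturbed copies of x.

    Assumed definitions (not taken from the abstract):
      smoothgrad         = mean of the noisy gradients
      smoothgrad_squared = mean of the squared noisy gradients
      vargrad            = (population) variance of the noisy gradients
    """
    rng = random.Random(seed)
    grads = [
        grad([xi + rng.gauss(0.0, sigma) for xi in x])
        for _ in range(n_samples)
    ]
    d = len(x)
    mean = [sum(g[j] for g in grads) / n_samples for j in range(d)]
    sg_sq = [sum(g[j] ** 2 for g in grads) / n_samples for j in range(d)]
    var = [sum((g[j] - mean[j]) ** 2 for g in grads) / n_samples for j in range(d)]
    return {"smoothgrad": mean, "smoothgrad_squared": sg_sq, "vargrad": var}


if __name__ == "__main__":
    for name, values in ensemble_importance([0.1, 0.2, 0.3]).items():
        print(name, values)
```

Each ensemble needs `n_samples` gradient evaluations instead of one, which is the extra computational burden the abstract refers to. Whether that cost buys a better estimate is exactly what the proposed empirical measure is meant to test; the sketch does not reproduce that evaluation.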
TL;DR: In this paper, the authors present a unified theory of both the level and sensitivity of pay in competitive market equilibrium, by embedding a moral hazard problem into a talent assignment model.
Abstract: This paper presents a unified theory of both the level and sensitivity of pay in competitive market equilibrium, by embedding a moral hazard problem into a talent assignment model. By considering multiplicative specifications for the CEO’s utility and production functions, we generate a number of results that differ from those of traditional additive models. First, both the CEO’s low fractional ownership (the Jensen–Murphy incentives measure) and its negative relationship with firm size can be quantitatively reconciled with optimal contracting, and thus need not reflect rent extraction. Second, the dollar change in wealth for a percentage change in firm value, divided by annual pay, is independent of firm size, and therefore a desirable empirical measure of incentives. Third, incentive pay is effective at solving agency problems with multiplicative impacts on firm value, such as strategy choice. However, additive issues such as perk consumption are best addressed through direct monitoring. (JEL D2, D3, G34, J3)
TL;DR: This work considers the fundamental question of how quickly the empirical measure obtained from $n$ independent samples from $\mu$ approaches $\mu$ in the Wasserstein distance of any order and proves sharp asymptotic and finite-sample results for this rate of convergence for general measures on general compact metric spaces.
Abstract: The Wasserstein distance between two probability measures on a metric space
is a measure of closeness with applications in statistics, probability, and
machine learning. In this work, we consider the fundamental question of how
quickly the empirical measure obtained from $n$ independent samples from $\mu$
approaches $\mu$ in the Wasserstein distance of any order. We prove sharp
asymptotic and finite-sample results for this rate of convergence for general
measures on general compact metric spaces. Our finite-sample results show the
existence of multi-scale behavior, where measures can exhibit radically
different rates of convergence as $n$ grows.
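
To make the question concrete, here is a minimal sketch for one very simple case that is an assumption of this example, not the paper's setting: $\mu$ is the uniform distribution on $[0, 1]$ and the order is 1. In one dimension the distance reduces to $W_1(\mu_n, \mu) = \int_0^1 |F_n^{-1}(t) - t| \, dt$, where $F_n^{-1}$ is the empirical quantile function, so it can be computed exactly from the sorted samples. The function names `w1_to_uniform` and `mean_w1` are hypothetical.

```python
import random


def w1_to_uniform(samples):
    """Exact W_1 distance between the empirical measure of `samples`
    and the uniform distribution on [0, 1].

    Uses the one-dimensional quantile formula: the empirical quantile
    function equals the i-th order statistic on ((i-1)/n, i/n], and the
    quantile function of Uniform[0, 1] is the identity.
    """
    def g(u):
        return 0.5 * u * abs(u)  # an antiderivative of |u|

    xs = sorted(samples)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs):
        a, b = i / n, (i + 1) / n
        total += g(x - a) - g(x - b)  # integral of |x - t| over [a, b]
    return total


def mean_w1(n, trials=20, seed=0):
    """Average W_1(mu_n, mu) over several independent draws of n samples."""
    rng = random.Random(seed)
    draws = ([rng.random() for _ in range(n)] for _ in range(trials))
    return sum(w1_to_uniform(d) for d in draws) / trials


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, mean_w1(n))
```

For this one-dimensional example the expected distance is known to decay on the order of $n^{-1/2}$, so the printed averages should shrink by roughly a factor of $\sqrt{10}$ per row. The example covers only this single smooth case and does not exhibit the multi-scale behavior the abstract describes for general measures.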