On the Distribution of the Two-Sample Cramer-von Mises Criterion

doi:10.1214/AOMS/1177704477

Open Access, Journal Article

T. W. Anderson

- 01 Sep 1962

- Annals of Mathematical Statistics

- Vol. 33, Iss: 3, pp 1148-1159

TL;DR: This paper studies the distribution of the two-sample Cramer-von Mises criterion $T$ for small sample sizes $N$ and $M$, presents tables for using the criterion at conventional significance levels, and finds that the limiting distribution approximates the exact distribution well for moderate sample sizes.

Abstract: The Cramer-von Mises $\omega^2$ criterion for testing that a sample, $x_1, \cdots, x_N$, has been drawn from a specified continuous distribution $F(x)$ is
\begin{equation*}\tag{1} \omega^2 = \int^\infty_{-\infty} \lbrack F_N(x) - F(x)\rbrack^2 \, dF(x), \end{equation*}
where $F_N(x)$ is the empirical distribution function of the sample; that is, $F_N(x) = k/N$ if exactly $k$ observations are less than or equal to $x$ ($k = 0, 1, \cdots, N$). If there is a second sample, $y_1, \cdots, y_M$, a test of the hypothesis that the two samples come from the same (unspecified) continuous distribution can be based on the analogue of $N\omega^2$, namely
\begin{equation*}\tag{2} T = \lbrack NM/(N + M)\rbrack \int^\infty_{-\infty} \lbrack F_N(x) - G_M(x)\rbrack^2 \, dH_{N+M}(x), \end{equation*}
where $G_M(x)$ is the empirical distribution function of the second sample and $H_{N+M}(x)$ is the empirical distribution function of the two samples together [that is, $(N + M)H_{N+M}(x) = NF_N(x) + MG_M(x)$]. The limiting distribution of $N\omega^2$ as $N \rightarrow \infty$ has been tabulated [2], and it has been shown ([3], [4a], and [7]) that $T$ has the same limiting distribution as $N \rightarrow \infty$, $M \rightarrow \infty$, and $N/M \rightarrow \lambda$, where $\lambda$ is any finite positive constant. In this note we consider the distribution of $T$ for small values of $N$ and $M$ and present tables to permit use of the criterion at some conventional significance levels for small values of $N$ and $M$. The limiting distribution seems a surprisingly good approximation to the exact distribution for moderate sample sizes (corresponding to the same feature for $N\omega^2$ [6]). The accuracy of approximation is better than in the case of the two-sample Kolmogorov-Smirnov statistic studied by Hodges [4].
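
As an illustration of equation (2), the sketch below evaluates $T$ directly from the definition and then enumerates the null distribution for small samples. It rests on two assumptions that are not spelled out in the abstract. First, because $H_{N+M}$ places mass $1/(N+M)$ on each pooled observation, the integral reduces to a sum, giving $T = \lbrack NM/(N+M)^2\rbrack \sum_z \lbrack F_N(z) - G_M(z)\rbrack^2$ over the pooled points $z$ (no ties, as holds with probability one for a continuous distribution). Second, under the null hypothesis every assignment of the $N + M$ ranks to the two samples is taken to be equally likely, so a small-sample tail probability can be found by brute-force enumeration. The function names `cvm_two_sample` and `exact_p_value` are hypothetical, and this is not necessarily how the paper's tables were computed.

```python
from bisect import bisect_right
from fractions import Fraction
from itertools import combinations


def cvm_two_sample(x, y):
    """Two-sample Cramer-von Mises criterion T of equation (2).

    Assumes no ties within or between the samples. Exact rational
    arithmetic is used so that equal values of T compare equal.
    """
    n, m = len(x), len(y)
    xs, ys = sorted(x), sorted(y)
    total = Fraction(0)
    for z in xs + ys:  # every pooled observation carries mass 1/(n+m)
        f = Fraction(bisect_right(xs, z), n)  # F_N(z)
        g = Fraction(bisect_right(ys, z), m)  # G_M(z)
        total += (f - g) ** 2
    return Fraction(n * m, (n + m) ** 2) * total


def exact_p_value(x, y):
    """P(T >= observed) under the null, by enumerating all rank splits.

    Only feasible for small samples: the loop visits C(n+m, n) splits.
    """
    n, m = len(x), len(y)
    observed = cvm_two_sample(x, y)
    ranks = range(1, n + m + 1)
    at_least, splits = 0, 0
    for first in combinations(ranks, n):
        chosen = set(first)
        second = [r for r in ranks if r not in chosen]
        splits += 1
        if cvm_two_sample(first, second) >= observed:
            at_least += 1
    return Fraction(at_least, splits)


if __name__ == "__main__":
    x, y = [1.0, 2.0], [3.0, 4.0]
    print(cvm_two_sample(x, y))  # 3/8
    print(exact_p_value(x, y))   # 1/3
```

For $N = M = 2$ with completely separated samples, the four pooled points contribute $1/4 + 1 + 1/4 + 0$ to the sum, so $T = (4/16)(3/2) = 3/8$. Only two of the six possible rank splits reach that value, which gives the tail probability $1/3$ and shows how coarse the exact distribution is at very small sample sizes.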

Citations

Journal Article•10.21314/JOR.2018.404

The implicit constraints of Fundamental Review of the Trading Book profit-and-loss-attribution testing and a possible alternative framework

Alessandro Pogliani, +2 more

- 27 Mar 2019

- Journal of Risk

TL;DR: In this article, the authors highlight the very strong, implicit constraints embedded in PLA and the generally low probability of conducting a successful PLA test; these results support industry concerns related to the proposed regulatory requirements.

Book Chapter•10.1007/978-3-319-09259-1_10

Consensus of Clusterings Based on High-Order Dissimilarities

Helena Aidos, +1 more

- 01 Jan 2015

TL;DR: A DID-based algorithm builds upon an initial data partition, different initializations producing different data partitions, and a validation criterion based on DID is presented to select the best final partition, consisting in the estimation of graph probabilities for each cluster based on the DID.

Journal Article•10.3390/jlpea13010022

Extreme Path Delay Estimation of Critical Paths in Within-Die Process Fluctuations Using Multi-Parameter Distributions

Miikka Runolinna, +4 more

- 20 Mar 2023

- Journal of Low Power Electronics and App...

TL;DR: In this article, two multi-parameter distributions, namely the Pearson type IV and metalog distributions, are discussed and suggested as alternatives to the normal distribution for modelling the path delay data that determine the maximum clock frequency (FMAX) of a microprocessor or other digital circuit.

Predictive modeling using sparse logistic regression with applications

Tapio Manninen

- 31 Jan 2014

TL;DR: It is shown that a combination of a careful model assessment scheme and automatic feature selection by means of logistic regression model and coefficient regularization create a powerful, yet simple and practical, tool chain for applications of supervised learning and classification.

Posted Content

Financial interaction analysis using best-fitted probability distribution

Vincent Ang

- 01 Jan 2015

- Research Papers in Economics

TL;DR: This article used Monte Carlo simulation on the derived distributions to generate values and impute them into a model or formula that defines the interaction between the variables, obtaining the outcome of their interactions.

Related Papers (5)

Asymptotic Theory of Certain "Goodness of Fit" Criteria Based on Stochastic Processes

The Kolmogorov-Smirnov Test for Goodness of Fit

EDF Statistics for Goodness of Fit and Some Comparisons

On a Test of Whether one of Two Random Variables is Stochastically Larger than the Other

An Analysis of Variance Test for Normality (Complete Samples)