On the Distribution of the Two-Sample Cramer-von Mises Criterion
TL;DR: This paper studies the distribution of the two-sample Cramer-von Mises criterion $T$ for small sample sizes $N$ and $M$, tabulates it at conventional significance levels, and finds that the limiting distribution approximates the exact distribution well even for moderate sample sizes.
Abstract: The Cramer-von Mises $\omega^2$ criterion for testing that a sample, $x_1, \cdots, x_N$, has been drawn from a specified continuous distribution $F(x)$ is \begin{equation*}\tag{1}\omega^2 = \int^\infty_{-\infty} \lbrack F_N(x) - F(x)\rbrack^2 dF(x),\end{equation*} where $F_N(x)$ is the empirical distribution function of the sample; that is, $F_N(x) = k/N$ if exactly $k$ observations are less than or equal to $x$ $(k = 0, 1, \cdots, N)$. If there is a second sample, $y_1, \cdots, y_M$, a test of the hypothesis that the two samples come from the same (unspecified) continuous distribution can be based on the analogue of $N\omega^2$, namely \begin{equation*}\tag{2} T = \lbrack NM/(N + M)\rbrack \int^\infty_{-\infty} \lbrack F_N(x) - G_M(x)\rbrack^2 dH_{N+M}(x),\end{equation*} where $G_M(x)$ is the empirical distribution function of the second sample and $H_{N+M}(x)$ is the empirical distribution function of the two samples together [that is, $(N + M)H_{N+M}(x) = NF_N(x) + MG_M(x)$]. The limiting distribution of $N\omega^2$ as $N \rightarrow \infty$ has been tabulated [2], and it has been shown ([3], [4a], and [7]) that $T$ has the same limiting distribution as $N \rightarrow \infty, M \rightarrow \infty$, and $N/M \rightarrow \lambda$, where $\lambda$ is any finite positive constant. In this note we consider the distribution of $T$ for small values of $N$ and $M$ and present tables to permit use of the criterion at some conventional significance levels for small values of $N$ and $M$. The limiting distribution seems a surprisingly good approximation to the exact distribution for moderate sample sizes (corresponding to the same feature for $N\omega^2$ [6]). The accuracy of approximation is better than in the case of the two-sample Kolmogorov-Smirnov statistic studied by Hodges [4].
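As an illustration of equation (2), the following minimal sketch computes $T$ directly from two samples and enumerates its exact null distribution for small $N$ and $M$. It assumes there are no ties (the abstract assumes a continuous distribution), so under the null hypothesis every assignment of $N$ of the $N + M$ pooled ranks to the first sample is equally likely. The function names `cvm_two_sample`, `exact_null_distribution`, and `exact_p_value` are hypothetical and chosen for this example; this is not the paper's tabulation procedure.

```python
from bisect import bisect_right
from itertools import combinations


def cvm_two_sample(x, y):
    """Two-sample statistic T of equation (2), assuming no ties."""
    n, m = len(x), len(y)
    xs, ys = sorted(x), sorted(y)
    total = 0.0
    # dH_{N+M} places mass 1/(N+M) on each pooled observation,
    # so the integral becomes a sum over the pooled sample.
    for z in xs + ys:
        f = bisect_right(xs, z) / n  # F_N(z)
        g = bisect_right(ys, z) / m  # G_M(z)
        total += (f - g) ** 2
    return (n * m / (n + m)) * total / (n + m)


def exact_null_distribution(n, m):
    """Sorted values of T over all C(n+m, n) rank assignments."""
    ranks = range(1, n + m + 1)
    values = []
    for first in combinations(ranks, n):
        chosen = set(first)
        second = [r for r in ranks if r not in chosen]
        values.append(cvm_two_sample(list(first), second))
    return sorted(values)


def exact_p_value(x, y):
    """Fraction of equally likely rank assignments with T >= observed T."""
    t_obs = cvm_two_sample(x, y)
    null = exact_null_distribution(len(x), len(y))
    # Small tolerance guards against floating-point rounding at equality.
    return sum(t >= t_obs - 1e-12 for t in null) / len(null)
```

Because $T$ depends on the data only through the ranks in the pooled sample, the enumeration can work with the integers $1, \cdots, N + M$ in place of the observations. The number of assignments grows as $\binom{N+M}{N}$, so this brute-force approach is practical only for small samples.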
Citations
•Posted Content
Efficiency Lower Bounds for Distribution-Free Hotelling-Type Two-Sample Tests Based on Optimal Transport
TL;DR: In this paper, the authors study the two-sample problem in the multivariate setting and propose distribution-free analogues of the Hotelling $T^2$ test (the natural multidimensional counterpart of Student's $t$-test) based on optimal transport and obtain extensions of the above celebrated results over various natural families of multivariate distributions.
•Journal Article
Do some firms persistently outperform
TL;DR: In this paper, persistence in growth rates of the entire population of Dutch manufacturing firms is analyzed, and the authors find that the existence of persistent outperformers is especially pronounced in micro firms.
Gaze shift behavior on video as composite information foraging
TL;DR: A model of gaze shift behavior which is driven by a composite foraging strategy operating over a time varying visual landscape and accounts for variability exhibited by different observers when viewing the same scene, or even by the same subject along different trials is presented.
Use of the Kolmogorov–Smirnov test for gamma process
Edith Grall-Maës
- 17 Oct 2012
TL;DR: This article discusses the use of the Kolmogorov-Smirnov test for comparing an observed gamma process with a reference process or for comparing two observed gamma processes.
Evaluation of Techniques for Univariate Normality Test Using Monte Carlo Simulation
TL;DR: This article examines the sensitivity of nine normality test statistics (W/S, Jarque-Bera, adjusted Jarque-Bera, D'Agostino, Shapiro-Wilk, Shapiro-Francia, Ryan-Joiner, Lilliefors, and Anderson-Darling) to determine how effectively each identifies whether a set of data comes from a normal distribution.