TL;DR: A point estimator and its associated confidence interval for the size of a closed population are proposed under models that incorporate heterogeneity of capture probability, and numerical results show that the proposed confidence interval performs satisfactorily in maintaining the nominal levels.
Abstract: A point estimator and its associated confidence interval for the size of a closed population are proposed under models that incorporate heterogeneity of capture probability. Real data sets provided in Edwards and Eberhardt (1967, Journal of Wildlife Management 31, 87-96) and Carothers (1973, Journal of Animal Ecology 42, 125-146) are used to illustrate this method and to compare it with other estimates. The performance of the proposed procedure is also investigated by means of Monte Carlo experiments. The method is especially useful when most of the captured individuals are caught once or twice in the sample, for which case the jackknife estimator usually does not work well. Numerical results also show that the proposed confidence interval performs satisfactorily in maintaining the nominal levels.
TL;DR: A probabilistic setting is used to obtain posterior distributions, rather than point estimates, on precision, recall and F-score; it is applied both when different methods are run on different datasets from the same source and when competing results are obtained on the same data.
Abstract: We address the problems of 1/ assessing the confidence of the standard point estimates, precision, recall and F-score, and 2/ comparing the results, in terms of precision, recall and F-score, obtained using two different methods. To do so, we use a probabilistic setting which allows us to obtain posterior distributions on these performance indicators, rather than point estimates. This framework is applied to the case where different methods are run on different datasets from the same source, as well as the standard situation where competing results are obtained on the same data.
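To make the idea of a posterior distribution over precision, recall and F-score concrete, here is a minimal Python sketch. It is not the authors' procedure: it assumes the true-positive, false-positive and false-negative counts follow a multinomial model with a symmetric Dirichlet prior (the `prior` parameter, set to 1.0 here), and it samples the posterior by normalising gamma variates. The function names `sample_prf_posterior`, `credible_interval` and `prob_greater` are hypothetical helpers introduced for illustration.

```python
import random


def sample_prf_posterior(tp, fp, fn, n_samples=10000, prior=1.0, seed=0):
    """Draw posterior samples of (precision, recall, F1).

    Assumption: Dirichlet(prior, prior, prior) prior over the
    (TP, FP, FN) cell probabilities; sampled via gamma variates.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        g_tp = rng.gammavariate(tp + prior, 1.0)
        g_fp = rng.gammavariate(fp + prior, 1.0)
        g_fn = rng.gammavariate(fn + prior, 1.0)
        precision = g_tp / (g_tp + g_fp)
        recall = g_tp / (g_tp + g_fn)
        f1 = 2.0 * g_tp / (2.0 * g_tp + g_fp + g_fn)
        samples.append((precision, recall, f1))
    return samples


def credible_interval(values, level=0.95):
    """Equal-tailed credible interval from a list of posterior draws."""
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[int((1.0 - level) / 2.0 * n)]
    upper = ordered[min(n - 1, int((1.0 + level) / 2.0 * n))]
    return lower, upper


def prob_greater(draws_a, draws_b):
    """Estimate P(A > B) from independent posterior draws of equal length."""
    wins = sum(1 for a, b in zip(draws_a, draws_b) if a > b)
    return wins / min(len(draws_a), len(draws_b))


if __name__ == "__main__":
    # Two methods evaluated on different datasets (independent draws,
    # hence different seeds). Counts are made up for illustration.
    method_a = sample_prf_posterior(tp=80, fp=20, fn=10, seed=1)
    method_b = sample_prf_posterior(tp=70, fp=25, fn=20, seed=2)
    f1_a = [s[2] for s in method_a]
    f1_b = [s[2] for s in method_b]
    print("95% interval for F1 of A:", credible_interval(f1_a))
    print("P(F1_A > F1_B):", prob_greater(f1_a, f1_b))
```

Pairing draws by index in `prob_greater` is only appropriate when the two sets of draws are independent, as in the different-datasets case; results obtained on the same data would need a joint model, which this sketch does not attempt.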
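TL;DR: Averaging accuracies over cross-validation folds yields no meaningful confidence intervals and gives an optimistic estimate when a biased classifier is tested on an imbalanced dataset; it is shown that both problems can be overcome by replacing the conventional point estimate of accuracy by an estimate of the posterior distribution of the balanced accuracy.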
Abstract: Evaluating the performance of a classification algorithm critically requires a measure of the degree to which unseen examples have been identified with their correct class labels. In practice, generalizability is frequently estimated by averaging the accuracies obtained on individual cross-validation folds. This procedure, however, is problematic in two ways. First, it does not allow for the derivation of meaningful confidence intervals. Second, it leads to an optimistic estimate when a biased classifier is tested on an imbalanced dataset. We show that both problems can be overcome by replacing the conventional point estimate of accuracy by an estimate of the posterior distribution of the balanced accuracy.
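The contrast between plain accuracy and a posterior over balanced accuracy can be sketched as follows. The abstract does not spell out the model, so this example makes its own assumptions: balanced accuracy is taken as the mean of sensitivity and specificity, each rate gets an independent flat Beta(1, 1) prior, and the posterior is approximated by Monte Carlo sampling. The names `balanced_accuracy_posterior` and `summarize` are hypothetical.

```python
import random


def balanced_accuracy_posterior(tp, fn, tn, fp, n_samples=20000, seed=0):
    """Monte Carlo draws from the posterior of balanced accuracy.

    Assumption: independent Beta(1, 1) priors on sensitivity and
    specificity, so the posteriors are Beta(tp+1, fn+1) and Beta(tn+1, fp+1).
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_samples):
        sensitivity = rng.betavariate(tp + 1, fn + 1)
        specificity = rng.betavariate(tn + 1, fp + 1)
        draws.append(0.5 * (sensitivity + specificity))
    return draws


def summarize(draws, level=0.95):
    """Return (posterior mean, lower bound, upper bound) of the draws."""
    ordered = sorted(draws)
    n = len(ordered)
    lower = ordered[int((1.0 - level) / 2.0 * n)]
    upper = ordered[min(n - 1, int((1.0 + level) / 2.0 * n))]
    return sum(ordered) / n, lower, upper


if __name__ == "__main__":
    # A biased classifier that always predicts "positive", tested on an
    # imbalanced set of 90 positives and 10 negatives (made-up counts).
    tp, fn, tn, fp = 90, 0, 0, 10
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # 0.9, looks good
    mean, lower, upper = summarize(balanced_accuracy_posterior(tp, fn, tn, fp))
    print("plain accuracy:", accuracy)
    print("balanced accuracy posterior mean:", round(mean, 3))
    print("95% interval:", (round(lower, 3), round(upper, 3)))
```

In this made-up case the plain accuracy is 0.9, while the posterior of the balanced accuracy concentrates not far above 0.5, which is the kind of optimism the abstract warns about.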
TL;DR: There is little empirical support for the use of .05 or any other value as a universal RMSEA cutoff to determine adequate model fit in structural equation models, regardless of whether the point estimate is used alone or jointly with the confidence interval.
Abstract: This article is an empirical evaluation of the choice of fixed cutoff points in assessing the root mean square error of approximation (RMSEA) test statistic as a measure of goodness-of-fit in Structural Equation Models. Using simulation data, the authors first examine whether there is any empirical evidence for the use of a universal cutoff, and then compare the practice of using the point estimate of the RMSEA alone versus that of using it jointly with its related confidence interval. The results of the study demonstrate that there is little empirical support for the use of .05 or any other value as universal cutoff values to determine adequate model fit, regardless of whether the point estimate is used alone or jointly with the confidence interval. The authors' analyses suggest that to achieve a certain level of power or Type I error rate, the choice of cutoff values depends on model specifications, degrees of freedom, and sample size.
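For readers unfamiliar with the statistic, the RMSEA point estimate can be sketched in a few lines of Python. The abstract does not give a formula; this sketch assumes the commonly cited definition sqrt(max(chi-square - df, 0) / (df * (N - 1))), noting that some software divides by N instead of N - 1. The function name `rmsea` and the example numbers are illustrative only and are not taken from the study.

```python
import math


def rmsea(chi_square, df, n):
    """RMSEA point estimate under the assumed (N - 1) convention."""
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


if __name__ == "__main__":
    # Same chi-square/df ratio of 2, different sample sizes: the point
    # estimate falls on different sides of a fixed .05 cutoff.
    for n in (101, 401, 1601):
        value = rmsea(chi_square=100.0, df=50, n=n)
        print(n, round(value, 4), "meets .05 cutoff:", value <= 0.05)
```

With a chi-square to df ratio of 2, the estimate is 0.1 at N = 101, 0.05 at N = 401 and 0.025 at N = 1601. This only shows that the statistic depends on sample size and degrees of freedom by construction. It does not reproduce the simulation study described in the abstract.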