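TL;DR: This work compares the phi coefficient and the indicator value index (IndVal) within a broader framework of related measures, which includes several new indices, and presents permutation tests to assess the statistical significance of species-site group associations and bootstrap methods for obtaining confidence intervals.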
Abstract: Ecologists often face the task of studying the association between single species and one or several groups of sites representing habitat types, community types, or other categories. Besides characterizing the ecological preference of the species, the strength of the association usually presents a lot of interest for conservation biology, landscape mapping and management, and natural reserve design, among other applications. The indices most frequently employed to assess these relationships are the phi coefficient of association and the indicator value index (IndVal). We compare these two approaches by putting them into a broader framework of related measures, which includes several new indices. We present permutation tests to assess the statistical significance of species-site group associations and bootstrap methods for obtaining confidence intervals. Correlation measures, such as the phi coefficient, are more context-dependent than indicator values but allow focusing on the preference of the species. In contrast, the two components of an indicator value index directly assess the value of the species as a bioindicator because they can be interpreted as its positive predictive value and sensitivity. Ecologists should select the most appropriate index of association strength according to their objective and then compute confidence intervals to determine the precision of the estimate.
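As an illustration of the quantities named in this abstract, the following is a minimal sketch for presence/absence data, not the authors' implementation. The function names, the toy data, the one-sided permutation scheme (shuffling group labels) and the percentile bootstrap are all assumptions made for the example.

```python
import numpy as np

def association(present, in_group):
    """Return (phi, A, B) for one species and one site group.

    A: share of the species' occurrences that fall inside the group
       (interpretable as a positive predictive value).
    B: share of the group's sites where the species occurs (sensitivity).
    """
    present = np.asarray(present, dtype=bool)
    in_group = np.asarray(in_group, dtype=bool)
    N = np.float64(present.size)                   # all sites
    n = present.sum(dtype=float)                   # sites with the species
    Np = in_group.sum(dtype=float)                 # sites in the group
    n_p = (present & in_group).sum(dtype=float)    # occurrences in the group
    phi = (N * n_p - n * Np) / np.sqrt(n * Np * (N - n) * (N - Np))
    return phi, n_p / n, n_p / Np

def permutation_pvalue(present, in_group, n_perm=999, seed=0):
    """One-sided p-value for phi, obtained by shuffling group labels (assumed scheme)."""
    rng = np.random.default_rng(seed)
    in_group = np.asarray(in_group, dtype=bool)
    observed = association(present, in_group)[0]
    hits = sum(
        association(present, rng.permutation(in_group))[0] >= observed
        for _ in range(n_perm)
    )
    return (hits + 1) / (n_perm + 1)

def bootstrap_ci(present, in_group, n_boot=999, alpha=0.05, seed=0):
    """Percentile bootstrap interval for phi, resampling sites with replacement."""
    rng = np.random.default_rng(seed)
    present = np.asarray(present, dtype=bool)
    in_group = np.asarray(in_group, dtype=bool)
    stats = []
    with np.errstate(divide="ignore", invalid="ignore"):
        for _ in range(n_boot):
            idx = rng.integers(0, present.size, present.size)
            stats.append(association(present[idx], in_group[idx])[0])
    # Degenerate resamples (empty margins) give NaN and are ignored here.
    low, high = np.nanpercentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return low, high

# Toy data: 20 sites, 10 in the target group; the species occurs
# in 8 group sites and in 2 sites outside the group.
in_group = np.array([True] * 10 + [False] * 10)
present = np.array([True] * 8 + [False] * 2 + [True] * 2 + [False] * 8)

phi, A, B = association(present, in_group)
p_value = permutation_pvalue(present, in_group)
ci_low, ci_high = bootstrap_ci(present, in_group)
print(phi, A, B, p_value, (ci_low, ci_high))
```

For this toy data the counts give phi = 0.6 and A = B = 0.8; the p-value and the interval depend on the random seed.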
TL;DR: This article explores the utility and interpretation of the standardized difference for comparing the prevalence of dichotomous variables between two groups, and finds that a standardized difference of 10% (or 0.1) is equivalent to a phi coefficient of 0.05 for the correlation between treatment group and the binary variable.
Abstract: Researchers are increasingly using the standardized difference to compare the distribution of baseline covariates between treatment groups in observational studies. Standardized differences were initially developed in the context of comparing the mean of continuous variables between two groups. However, in medical research, many baseline covariates are dichotomous. In this article, we explore the utility and interpretation of the standardized difference for comparing the prevalence of dichotomous variables between two groups. We examined the relationship between the standardized difference, and the maximal difference in the prevalence of the binary variable between two groups, the relative risk relating the prevalence of the binary variable in one group compared to the prevalence in the other group, and the phi coefficient for measuring correlation between the treatment group and the binary variable. We found that a standardized difference of 10% (or 0.1) is equivalent to having a phi coefficient of 0.05 ...
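The stated correspondence between a standardized difference of 0.1 and a phi coefficient of 0.05 can be checked numerically. The sketch below uses the usual standardized difference for two proportions and assumes two treatment groups of equal size, in which case phi reduces to the closed form in the second function; the equal-size assumption, the function names and the example prevalences are choices made for this illustration, not details taken from the article.

```python
import math

def standardized_difference(p1, p2):
    """Standardized difference between two prevalences p1 and p2."""
    pooled_var = (p1 * (1 - p1) + p2 * (1 - p2)) / 2
    return (p1 - p2) / math.sqrt(pooled_var)

def phi_equal_groups(p1, p2):
    """Phi between group membership and a binary variable,
    assuming both groups have the same number of subjects."""
    p_bar = (p1 + p2) / 2
    return (p1 - p2) / (2 * math.sqrt(p_bar * (1 - p_bar)))

# Example prevalences chosen so that the standardized difference is about 0.1.
d = standardized_difference(0.55, 0.50)
phi = phi_equal_groups(0.55, 0.50)
print(round(d, 4), round(phi, 4))
```

When the two prevalences are close, the pooled variance is nearly equal to p_bar * (1 - p_bar), so under the equal-size assumption phi is approximately half the standardized difference.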
TL;DR: This paper explores the origin of the paradoxes of kappa, introduces an alternative and more stable agreement coefficient referred to as the AC1 coefficient, and proposes new variance estimators for the multiple-rater generalized pi and AC1 statistics, whose validity does not depend upon the hypothesis of independence between raters.
Abstract: Pi (π) and kappa (κ) statistics are widely used in the areas of psychiatry and psychological testing to compute the extent of agreement between raters on nominally scaled data. It is a fact that these coefficients occasionally yield unexpected results in situations known as the paradoxes of kappa. This paper explores the origin of these limitations, and introduces an alternative and more stable agreement coefficient referred to as the AC1 coefficient. Also proposed are new variance estimators for the multiple-rater generalized pi and AC1 statistics, whose validity does not depend upon the hypothesis of independence between raters. This is an improvement over existing alternative variances, which depend on the independence assumption. A Monte-Carlo simulation study demonstrates the validity of these variance estimators for confidence interval construction, and confirms the value of AC1 as an improved alternative to existing inter-rater reliability statistics.
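A small two-rater example shows the kind of paradox the abstract refers to. The sketch below computes Cohen's kappa and the two-rater form of AC1 from a square contingency table; the function name and the table are hypothetical, and the variance estimators and the multiple-rater generalization discussed in the paper are not covered.

```python
import numpy as np

def kappa_and_ac1(table):
    """Cohen's kappa and two-rater AC1 from a q x q table of rating counts
    (rows: rater 1, columns: rater 2)."""
    t = np.asarray(table, dtype=float)
    t = t / t.sum()                        # joint proportions
    q = t.shape[0]
    p_agree = np.trace(t)                  # observed agreement
    row, col = t.sum(axis=1), t.sum(axis=0)
    pe_kappa = float(row @ col)            # chance agreement used by kappa
    pi = (row + col) / 2                   # mean marginal share per category
    pe_ac1 = float((pi * (1 - pi)).sum() / (q - 1))  # chance agreement used by AC1
    kappa = (p_agree - pe_kappa) / (1 - pe_kappa)
    ac1 = (p_agree - pe_ac1) / (1 - pe_ac1)
    return kappa, ac1

# Hypothetical ratings: 90% raw agreement, but one category is very rare.
kappa, ac1 = kappa_and_ac1([[90, 5], [5, 0]])
print(round(kappa, 3), round(ac1, 3))
```

In this table the raters agree on 90 of 100 subjects, yet kappa is slightly negative because its chance-agreement term is inflated by the skewed marginals, while AC1 stays close to the raw agreement.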
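TL;DR: This article reviews statistical measures of fidelity, shows that the original binomial form of Bruelheide's u value is an asymmetric measure of the fidelity of a species to a vegetation unit which tends to assign comparatively high fidelity values to rare species, and introduces a hypergeometric form of u that is a symmetric measure of joint fidelity.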
Abstract: Statistical measures of fidelity, i.e. the concentration of species occurrences in vegetation units, are reviewed and compared. The focus is on measures suitable for categorical data which are based on observed species frequencies within a vegetation unit compared with the frequencies expected under random distribution. Particular attention is paid to Bruelheide's u value. It is shown that its original form, based on binomial distribution, is an asymmetric measure of fidelity of a species to a vegetation unit which tends to assign comparatively high fidelity values to rare species. Here, a hypergeometric form of u is introduced which is a symmetric measure of the joint fidelity of species to a vegetation unit and vice versa. It is also shown that another form of the binomial u value may be defined which measures the asymmetric fidelity of a vegetation unit to a species. These u values are compared with phi coefficient, chi‐square, G statistic and Fisher's exact test. Contrary to the other measure...
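The contrast between a binomial and a hypergeometric u value can be sketched as a standardized deviation of the observed number of occurrences from its expectation, using two different variances. The binomial variant below treats the species' occurrences as independent trials with success probability Np/N; that this matches the paper's "original form" exactly is an assumption, as are the function and parameter names.

```python
import math

def fidelity_u(N, Np, n, n_p):
    """Standardized deviation of observed from expected occurrences.

    N: sites in the data set, Np: sites in the vegetation unit,
    n: occurrences of the species, n_p: occurrences inside the unit.
    Returns (u_bin, u_hyp, phi).
    """
    expected = n * Np / N
    # Binomial variance: n trials, success probability Np / N (assumed variant).
    var_bin = n * (Np / N) * (1 - Np / N)
    # Hypergeometric variance: symmetric in the species and the unit.
    var_hyp = n * Np * (N - n) * (N - Np) / (N ** 2 * (N - 1))
    u_bin = (n_p - expected) / math.sqrt(var_bin)
    u_hyp = (n_p - expected) / math.sqrt(var_hyp)
    phi = u_hyp / math.sqrt(N - 1)   # phi is u_hyp rescaled by sqrt(N - 1)
    return u_bin, u_hyp, phi

u_bin, u_hyp, phi = fidelity_u(N=20, Np=10, n=10, n_p=8)
print(round(u_bin, 3), round(u_hyp, 3), round(phi, 3))
```

Swapping the roles of n and Np leaves u_hyp and phi unchanged but generally changes u_bin, which is the asymmetry the abstract describes.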
TL;DR: This paper proposes a new method of measuring fidelity with presence/absence data after equalization of the size of the site groups, in which the size of the site groups in the data set is equalized while relative frequencies of species occurrence within and outside of these groups are kept constant.
Abstract: Aim: Concentration of species occurrences in groups of classified sites can be quantified with statistical measures of fidelity, which can be used for the determination of diagnostic species. However, for most available measures fidelity depends on the number of sites within individual groups. As the classified data sets typically contain site groups of unequal size, such measures do not enable a comparison of numerical fidelity values of species between different site groups. We therefore propose a new method of measuring fidelity with presence/absence data after equalization of the size of the site groups. We compare the properties of this new method with other measures of statistical fidelity, in particular with the Dufrene-Legendre Indicator Value (IndVal) index. Methods: The size of site groups in the data set is equalized, while relative frequencies of species occurrence within and outside of these groups are kept constant. Then fidelity is calculated using the phi coefficient of associatio...
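The equalization step described under Methods can be sketched as follows: keep the species' relative frequency inside and outside the target group fixed, rescale the target group to a chosen share of the data set, and recompute phi from the rescaled counts. The parameter `s` (target share, set to 0.5 here), the function names and the example counts are assumptions for illustration; the excerpt does not state which equalized size the authors use.

```python
import math

def phi(N, Np, n, n_p):
    """Phi coefficient from site counts (N total, Np in group,
    n occurrences, n_p occurrences in group)."""
    return (N * n_p - n * Np) / math.sqrt(n * Np * (N - n) * (N - Np))

def phi_equalized(N, Np, n, n_p, s=0.5):
    """Phi after rescaling the target group to a share s of all sites,
    keeping relative frequencies inside and outside the group constant."""
    p_in = n_p / Np                  # relative frequency inside the group
    p_out = (n - n_p) / (N - Np)     # relative frequency outside the group
    Np_eq = s * N                    # equalized group size (hypothetical s)
    n_p_eq = p_in * Np_eq
    n_eq = n_p_eq + p_out * (N - Np_eq)
    return phi(N, Np_eq, n_eq, n_p_eq)

# Same relative frequencies (0.8 inside, 0.2 outside), different group sizes.
small_raw = phi(100, 10, 26, 8)
small_eq = phi_equalized(100, 10, 26, 8)
large_eq = phi_equalized(100, 50, 50, 40)
print(round(small_raw, 3), round(small_eq, 3), round(large_eq, 3))
```

The raw phi for the small group is lower than for the large group even though the relative frequencies are identical; after equalization both groups give the same value, which is what makes fidelity values comparable across groups of unequal size.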