TL;DR: In this paper, the authors extend a method of Hozo et al. to estimate mean and standard deviation from median, minimum, and maximum to the case where quartiles are also available.
Abstract: Background: We sometimes want to include in a meta-analysis data from studies where results are presented as medians and ranges or interquartile ranges rather than as means and standard deviations. In this paper I extend a method of Hozo et al. to estimate mean and standard deviation from median, minimum, and maximum to the case where quartiles are also available. Methods: Inequalities are developed for each observation using upper and lower limits derived from the minimum, the three quartiles, and the maximum. These are summed to give bounds for the sum and hence the mean of the observations; the average of these bounds is the estimate. A similar estimate is found for the sum of the observations squared and hence for the variance and standard deviation. Results: For data from a Normal distribution, the extended method using quartiles gives good estimates of sample means, but sample standard deviations are overestimated. For data from a Lognormal distribution, both sample mean and standard deviation are overestimated. Overestimation is worse for larger samples and for highly skewed parent distributions. The extended estimates using quartiles are always superior in both bias and precision to those without. Conclusions: The estimates have the advantage of being extremely simple to carry out. I argue that as, in practice, such methods will be applied to small samples, the overestimation may not be a serious problem.
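The bounding argument described in the Methods can be illustrated with a minimal sketch. It assumes a large sample in which roughly a quarter of the observations fall in each interval between the five summary values, and non-negative data so that squaring preserves the bounds. The function name `estimate_mean_sd` is hypothetical, and the paper's exact finite-sample formulas may differ from this simplification.

```python
import math


def estimate_mean_sd(a, q1, m, q3, b):
    """Rough mean/SD estimate from min (a), quartiles (q1, m, q3), and max (b).

    Assumption (not taken from the paper): a large sample with about a
    quarter of the observations in each of [a, q1], [q1, m], [m, q3], [q3, b].
    """
    if not (0 <= a <= q1 <= m <= q3 <= b):
        raise ValueError("expected 0 <= a <= q1 <= m <= q3 <= b")

    # Each observation is bounded below and above by the ends of its interval,
    # so the mean lies between the averages of the lower and upper ends.
    mean_lo = (a + q1 + m + q3) / 4
    mean_hi = (q1 + m + q3 + b) / 4
    mean = (mean_lo + mean_hi) / 2

    # Same argument for the squared observations (valid for non-negative data).
    sq_lo = (a**2 + q1**2 + m**2 + q3**2) / 4
    sq_hi = (q1**2 + m**2 + q3**2 + b**2) / 4
    mean_sq = (sq_lo + sq_hi) / 2

    variance = max(mean_sq - mean**2, 0.0)
    return mean, math.sqrt(variance)


if __name__ == "__main__":
    print(estimate_mean_sd(0, 1, 2, 3, 4))
```

For evenly spaced summaries such as 0, 1, 2, 3, 4, the estimated variance from this sketch is larger than that of a uniform distribution on the same range, which is in line with the overestimation of standard deviations reported in the Results.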
TL;DR: The above simulations suggest that increased tracking in extreme quartiles, as they define tracking, is an expected property of correlated variables and may have no bearing on the biology underlying BMI.
Abstract: kg/m 2 ) of children in China. They concluded that the BMIs of thin and fat children (defined as those in the lowest and highest quartiles of BMI, respectively) were more likely to track (ie, remain in the same quartile of BMI at follow-up). We agree with Wang et al that this is an important area of study but are concerned that their analysis of tracking may be misleading. They stratified the children by quartile of BMI at the first examination (see Table 2 of their article) and then calculated the percentage of those who tracked or alternatively moved up or down the quartile ranking. Where variables are correlated, we suspected that the likelihood of the quartile changing might not be the same for all quartiles, being less in the extreme quartiles. We suggest that this statistical property, rather than any underlying biologic factor, may have been the reason for the increase in tracking that Wang et al reported in the extreme quartiles of initial BMI. We demonstrated this effect by simulation. We generated 1000 observations (x) to represent a first measurement of BMI that was generated randomly from a normal distribution by using the RANNOR function of SAS (SAS Institute Inc, Cary, NC). A second correlated (r = 0.7) observation (y), to simulate a later measurement of BMI, was generated by the formula y = x + z, where z is a further random number generated by RANNOR. Observations x and y were then ranked and examined for evidence of tracking in a manner similar to that used by Wang et al. An increase in tracking was apparent in the extreme quartiles. Thus, in this first simulation, 62% of those in the lowest and highest quartiles of x tracked (ie, were also in the lowest and highest quartile of y, respectively) and only 34% in quartiles 2 and 3 tracked. 
A repeat of this simulation (1000 times) showed that with this degree of correlation, tracking occurred in 60.7% of observations on average (95% CI: 56.8, 64.4) in the extreme quartiles (1 and 4) and in 36.0% of observations (31.6, 39.6) in the central quartiles (2 and 3). These values changed depending on the degree of correlation between the 2 variables (data not shown), approaching 25% for all quartiles as the degree of correlation decreased to 0. Simulation also showed that the extent of tracking in individual quartiles was potentially affected by the underlying distribution of the data (data not shown). Use of correlation coefficients within quartiles (rather than the percentage of tracking) is subject to similar problems. In the simulated set described above, which had a normal distribution and a Pearson’s correlation coefficient of 0.7, correlation coefficients of x and y were highly dependent on the quartile; the Spearman correlation coefficient was 0.40 (95% CI: 0.30, 0.49) within quartiles 1 and 4 and was 0.19 (0.09, 0.3) within quartiles 2 and 3. Given these simulations, we believe that the conclusion of Wang et al regarding tracking of BMI in the extreme quartiles may be misleading. In their Table 4, Wang et al reported that tracking of overweight increases if at least one parent is overweight and that, conversely, tracking of underweight increases if at least one parent is underweight. This appears to be an important observation; however, we are concerned that their data may be confounded by the same problem. If children of overweight and underweight parents lie in extreme quartiles of the first observation of BMI, they may artifactually appear to track more strongly. Wang et al asked an important biological question: do the biological variables of individuals identified by place in the population distribution or by other factors such as family history differ in stability over time? 
The above simulations suggest that increased tracking in extreme quartiles, as they define tracking, is an expected property of correlated variables and may have no bearing on the biology underlying BMI.
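The simulation described in this letter can be approximated with a short sketch. The original used the RANNOR function of SAS; the version below is an assumption-laden stand-in that uses Python's standard `random` module, with hypothetical function names `quartile_labels` and `tracking_rates`. It generates x from a standard normal distribution, sets y = x + z, ranks both into quartiles, and reports the fraction that stay in the same quartile for the extreme and the central quartiles.

```python
import random


def quartile_labels(values):
    """Return a quartile index (0..3) for each value, based on its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = min(4 * rank // n, 3)
    return labels


def tracking_rates(n=1000, seed=0):
    """Simulate correlated x and y = x + z and measure quartile tracking.

    Returns (rate in extreme quartiles 1 and 4, rate in central quartiles 2 and 3).
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # y = x + z with z an independent standard normal, as in the letter.
    y = [xi + rng.gauss(0.0, 1.0) for xi in x]

    qx, qy = quartile_labels(x), quartile_labels(y)
    same = [a == b for a, b in zip(qx, qy)]

    extreme = [s for s, q in zip(same, qx) if q in (0, 3)]
    central = [s for s, q in zip(same, qx) if q in (1, 2)]
    return sum(extreme) / len(extreme), sum(central) / len(central)


if __name__ == "__main__":
    ext, cen = tracking_rates(n=1000, seed=0)
    print(f"extreme quartiles: {ext:.3f}, central quartiles: {cen:.3f}")
```

No biological mechanism is built into this simulation, so any difference between the two rates arises purely from the correlation between x and y and the way quartiles are defined.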
TL;DR: In this paper, modified ratio estimators are proposed for estimation of the population mean using the quartiles of the auxiliary variable and their functions; the proposed estimators are reported to perform better than existing modified ratio estimators.
Abstract: The present paper deals with some new modified ratio estimators for estimation of the population mean using the quartiles of the auxiliary variable and their functions. The bias and the mean squared error of the proposed estimators are obtained and compared with those of some existing modified ratio estimators. We observe that the proposed modified ratio estimators perform better than the existing ones. These results are illustrated with numerical examples. value of the population quartiles and their functions of the auxiliary variable to improve the ratio estimators. Further, we know that the quartiles and their functions are robust to extreme values and outliers in the population, unlike other parameters such as the mean, coefficient of variation, coefficient of skewness, and coefficient of kurtosis. These points have motivated us to introduce a modified ratio estimator using the known value of the population quartiles of the auxiliary variable and their functions. There are three quartiles: the first, second, and third. The second quartile is equal to the median. The first quartile is also called the lower quartile, and the third quartile is also called the upper quartile. The lower quartile is a point with 25% of observations below it and 75% above it. The upper quartile is a point with 75% of observations below it and 25% above it. The difference between the upper and lower quartiles, called the inter-quartile range, is another measure of spread and indicates the dispersion of a data set. The inter-quartile range spans the middle 50% of a data set and eliminates the influence of outliers because, in effect, the highest and lowest quarters are removed. The formula for inter-quartile range is:
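As the text states, the inter-quartile range is the difference between the upper and lower quartiles. Writing the quartiles as Q1 and Q3 (notation assumed here, since the paper's own symbols did not survive in this text), that is IQR = Q3 - Q1. The sketch below is a minimal illustration using Python's standard `statistics.quantiles`; the helper name `quartiles_and_iqr` and the sample data are hypothetical, and other quartile conventions can give slightly different values.

```python
import statistics


def quartiles_and_iqr(data):
    """Return (Q1, Q2, Q3, IQR) for a sample, with IQR = Q3 - Q1."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, q2, q3 = statistics.quantiles(data, n=4)
    return q1, q2, q3, q3 - q1


if __name__ == "__main__":
    sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
    inflated = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1000]  # same data, larger outlier

    # The mean shifts with the outlier, while the quartiles and IQR do not.
    print(statistics.mean(sample), quartiles_and_iqr(sample))
    print(statistics.mean(inflated), quartiles_and_iqr(inflated))
```

Comparing the two samples shows the robustness property the abstract relies on: enlarging the single extreme value changes the mean but leaves the quartiles and the inter-quartile range unchanged.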
TL;DR: In this paper, the authors proposed some ratio-type estimators of finite population variance using known values of parameters related to an auxiliary variable such as quartiles with their properties in simple random sampling.
Abstract: In this paper we propose some ratio-type estimators of finite population variance, together with their properties, using known values of parameters related to an auxiliary variable, such as quartiles, in simple random sampling. The suggested estimators are compared with the usual unbiased and ratio estimators and the estimators due to [2], [12, 13, 14] and [3]. An empirical study is also carried out to judge the merits of the proposed estimators over other existing estimators of population variance using a natural data set. 2000 AMS Classification: 62D05
TL;DR: In this article, it was shown that in the symmetric case trimmed means are better estimates of location than sample medians, unless the error distributions are sharply peaked at the centre of the distribution.