TL;DR: In this article, the authors proposed a new test for the comparison of two regression curves, which is based on a difference of two marked empirical processes based on residuals, and the large sample behaviour of the corresponding statistic is studied to provide a full nonparametric comparison of regression curves.
Abstract: We propose a new test for the comparison of two regression curves, which is based on a difference of two marked empirical processes based on residuals. The large sample behaviour of the corresponding statistic is studied to provide a full nonparametric comparison of regression curves. In contrast to most procedures suggested in the literature, the new procedure is applicable in the case of different design points and heteroscedasticity. Moreover, it is demonstrated that the proposed test detects continuous alternatives converging to the null at a rate N^{-1/2}. In the case of equal design points, the fundamental statistic reduces to a test statistic proposed by Delgado (1993) and therefore resembles in spirit classical goodness-of-fit tests. As a byproduct we explain the problems of a related test proposed by Kulasekera (1995) and Kulasekera and Wang (1997) with respect to accuracy in the approximation of the level. These difficulties mainly originate from the comparison with the quantiles of an inappropriate limit distribution. A simulation study is conducted to investigate the finite sample properties of a wild bootstrap version of the new tests.
TL;DR: In this paper, an integrated approach involving both intelligent data analysis and knowledge acquisition from experts is presented to support the development of operational protocols, and the aim is to ensure high quality standards for the protocol through empirical validation during the development, as well as lower development cost through the use of machine learning and statistical techniques.
Abstract: Operational protocols are a valuable means for quality control. However, developing operational protocols is a highly complex and costly task. We present an integrated approach involving both intelligent data analysis and knowledge acquisition from experts that supports the development of operational protocols. The aim is to ensure high quality standards for the protocol through empirical validation during the development, as well as lower development cost through the use of machine learning and statistical techniques. We demonstrate our approach of integrating expert knowledge with data driven techniques based on our effort to develop an operational protocol for the hemodynamic system.
TL;DR: In this article, three methods using nonparametric estimation techniques of the regression function are discussed for testing the equality of k regression curves from independent samples, and the authors prove asymptotic normality of all considered statistics under the null hypothesis, local and fixed alternatives, with different rates corresponding to the various cases.
Abstract: In the problem of testing the equality of k regression curves from independent samples we discuss three methods using nonparametric estimation techniques of the regression function. The first test is based on a linear combination of estimators for the integrated variance function in the individual samples and in the combined sample. The second approach transfers the classical one-way analysis of variance to the situation of comparing nonparametric curves, while the third test compares the differences between the estimates of the individual regression functions by means of an L^2-distance. We prove asymptotic normality of all considered statistics under the null hypothesis, local and fixed alternatives, with different rates corresponding to the various cases. Additionally, consistency of a wild bootstrap version of the tests is established. In contrast to most of the procedures proposed in the literature, the methods introduced in this paper are also applicable in the case of different design points in each sample and heteroscedastic errors. A simulation study is conducted to investigate the finite sample properties of the new tests and a comparison with recently proposed and related procedures is performed.
TL;DR: In this article, the authors apply graphical correlation models for analysing the partial associations between the components of multivariate time series and apply this technique to the hemodynamic system of critically ill patients monitored in intensive care.
Abstract: In critical care extremely high dimensional time series are generated by clinical information systems. This yields new perspectives of data recording and also causes a new challenge for statistical methodology. Recently graphical correlation models have been developed for analysing the partial associations between the components of multivariate time series. We apply this technique to the hemodynamic system of critically ill patients monitored in intensive care. We appraise the practical value of the procedure by reidentifying known associations between the variables. From separate analyses for different pathophysiological states we conclude that distinct clinical states can be characterised by distinct partial correlation structures.
TL;DR: In this article, the authors investigated the performance of several simultaneous multivariate outlier identification rules based on robust estimators of location and scale, and showed that the use of estimators with high finite sample breakdown point in such procedures yields a good behavior with respect to the prevention of breakdown by masking effect.
Abstract: The aim of detecting outliers in a multivariate sample can be pursued in different ways. We investigate here the performance of several simultaneous multivariate outlier identification rules based on robust estimators of location and scale. It has been shown that the use of estimators with high finite sample breakdown point in such procedures yields a good behaviour with respect to the prevention of breakdown by the masking effect (Becker, Gather 1999, J. Amer. Statist. Assoc. 94, 947-955). In this article, we investigate by simulation, at which distance from the center of an underlying model distribution outliers can be placed until certain simultaneous identification rules will detect them as outliers. We consider identification procedures based on the minimum volume ellipsoid, the minimum covariance determinant, and S-estimators.
TL;DR: It is shown that both CUSUM tests behave fundamentally differently in a long-memory environment, as compared to short memory, and that long memory is easily mistaken for structural change when standard critical values are employed.
Abstract: We derive the limiting null distributions of the standard and OLS-based CUSUM tests for structural change of the coefficients of a linear regression model in the context of long-memory disturbances. We show that both tests behave fundamentally differently in a long-memory environment, as compared to short memory, and that long memory is easily mistaken for structural change when standard critical values are employed.
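As a rough illustration of the kind of statistic discussed above, the following sketch computes an OLS-based CUSUM statistic for a regression with an intercept and one regressor. The function name `ols_cusum_statistic`, the residual variance estimator and the simulated data are assumptions made for this example and are not taken from the paper. The value 1.358 is the usual 5% critical value of the supremum of the absolute Brownian bridge, which is justified under short memory only.

```python
import math
import random

def ols_cusum_statistic(y, x):
    """sup_k |sum_{t<=k} e_t| / (sigma_hat * sqrt(T)), e_t = OLS residuals of y on (1, x)."""
    T = len(y)
    x_bar = sum(x) / T
    y_bar = sum(y) / T
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Residual standard deviation with a degrees-of-freedom correction (two parameters).
    sigma_hat = math.sqrt(sum(e * e for e in resid) / (T - 2))
    partial, sup_abs = 0.0, 0.0
    for e in resid:
        partial += e                      # cumulated sum of residuals up to time k
        sup_abs = max(sup_abs, abs(partial))
    return sup_abs / (sigma_hat * math.sqrt(T))

# Hypothetical data: i.i.d. (short-memory) errors, with and without a level shift.
rng = random.Random(0)
T = 200
x = [rng.gauss(0, 1) for _ in range(T)]
stable = [1.0 + 0.5 * xi + rng.gauss(0, 1) for xi in x]
# Same model, but the intercept jumps by 5 in the second half of the sample.
shifted = [yi + (5.0 if t >= T // 2 else 0.0) for t, yi in enumerate(stable)]

CRITICAL_5_PERCENT = 1.358  # short-memory asymptotics only
print(ols_cusum_statistic(stable, x), ols_cusum_statistic(shifted, x))
```

The sketch does not simulate long-memory errors; it only shows the quantity that is compared with the standard critical value, which is the comparison the abstract warns about.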
TL;DR: In this paper, a confidence interval for the between group variance is proposed which is deduced from Wald's exact confidence intervals for the ratio of the two variance components in the one-way random effects model.
Abstract: A confidence interval for the between group variance is proposed which is deduced from Wald’s exact confidence interval for the ratio of the two variance components in the one-way random effects model and the exact confidence interval for the error variance or, respectively, an unbiased estimator of the error variance. In a simulation study the confidence coefficients for these two intervals are compared with the confidence coefficients of two other commonly used confidence intervals. In this comparison, the confidence interval derived here yields confidence coefficients which are always greater than the prescribed level.
TL;DR: In this article, it was shown that the optimal weight function in the test of Gonzalez Manteiga and Vilar Fernandez (1995) is given by the Lebesgue measure independently of the design density.
Abstract: In a recent paper Gonzalez Manteiga and Vilar Fernandez (1995) considered the problem of testing linearity of a regression under MA structure of the errors using a weighted L1-distance between a parametric and a nonparametric fit. They established asymptotic normality of the corresponding test statistic under the hypothesis and under local alternatives. In the present paper we extend these results and establish asymptotic normality of the statistic under fixed alternatives. This result is then used to prove that the optimal (with respect to uniform maximization of power) weight function in the test of Gonzalez Manteiga and Vilar Fernandez (1995) is given by the Lebesgue measure independently of the design density. The paper also discusses several extensions of tests proposed by Azzalini and Bowman (1993), Zheng (1996) and Dette (1999) to the case of non-independent errors and compares these methods with the method of Gonzalez Manteiga and Vilar Fernandez (1995). It is demonstrated that among the kernel-based methods the approach of the latter authors is the most efficient from an asymptotic point of view.
TL;DR: For testing both one-sided and two-sided hypotheses concerning several treatment arms in group sequentially performed clinical trials with arbitrary outcome variables, a general method is considered that allows one to completely self-design a study.
Abstract: For testing one-sided but also two-sided hypotheses concerning several treatment arms in group sequentially performed clinical trials with arbitrary outcome variables, a general learning method is considered that allows for a complete self-designing of the study. All information available prior to a stage is used for estimating the sample size and the weight for the next step. In ‘using up’ the variance, the test statistic is built in a bounded finite but random number of stages to test just once the null-hypothesis on rejecting.
TL;DR: The authors examined the behaviour of the Canadian dollar from 1997 to 1999 to see if there is any evidence of excess volatility or significant overshooting, and used a small econometric model of the exchange rate, based on market fundamentals, to make tentative judgments about the extent to which the currency might have been systematically over- or undervalued.
Abstract: This paper examines the behaviour of the Canadian dollar from 1997 to 1999 to see if there is any evidence of excess volatility or significant overshooting. A small econometric model of the exchange rate, based on market fundamentals, is presented and used to make tentative judgments about the extent to which the currency might have been systematically over- or undervalued.
TL;DR: In this article, universal kriging and median polish kriging are combined into a robust spatial prediction method called modified median polish kriging, which is illustrated with piezometric-head data from the Wolfcamp Aquifer.
Abstract: Geostatistics analyses spatial data that often come from irregularly distributed sampling locations. Interest lies in modelling the data, i.e. estimating distributional parameters, and then predicting the phenomenon under study at unobserved sites within the corresponding sampling domain. The method of universal kriging for spatial prediction was introduced to cover the problem of spatial trend effects. This is done by incorporating linear trend models, e.g. polynomial functions of the spatial coordinates. However, universal kriging is sensitive to additive outliers. An outlier-resistant method for spatial prediction is median polish kriging. Both methods have certain advantages but also some drawbacks. Here, universal kriging and median polish kriging are combined into the robust spatial prediction method called modified median polish kriging. An example illustrates the method of modified median polish kriging using piezometric-head data from the Wolfcamp Aquifer.
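Median polish kriging builds on Tukey's median polish, which splits a gridded data table into an overall effect, row effects, column effects and residuals by repeatedly sweeping out medians. The sketch below shows only that decomposition step and assumes the observations have already been arranged on a regular grid. The function name `median_polish`, the fixed number of sweeps and the toy table are illustrative assumptions; the kriging of the residuals and the modification proposed in the paper are not shown.

```python
from statistics import median

def median_polish(table, n_iter=10):
    """Decompose table[i][j] = overall + row[i] + col[j] + resid[i][j] using medians."""
    n_rows, n_cols = len(table), len(table[0])
    resid = [list(r) for r in table]
    overall = 0.0
    row = [0.0] * n_rows
    col = [0.0] * n_cols
    for _ in range(n_iter):
        # Sweep row medians out of the residuals into the row effects.
        for i in range(n_rows):
            m = median(resid[i])
            row[i] += m
            resid[i] = [v - m for v in resid[i]]
        # Move the median of the column effects into the overall effect.
        m = median(col)
        overall += m
        col = [c - m for c in col]
        # Sweep column medians out of the residuals into the column effects.
        for j in range(n_cols):
            m = median([resid[k][j] for k in range(n_rows)])
            col[j] += m
            for k in range(n_rows):
                resid[k][j] -= m
        # Move the median of the row effects into the overall effect.
        m = median(row)
        overall += m
        row = [r - m for r in row]
    return overall, row, col, resid

# Hypothetical additive table (value = row index + column index + 1) with one outlier.
table = [[1, 2, 3],
         [2, 3, 4],
         [3, 4, 100]]
overall, row, col, resid = median_polish(table)
print(overall, row, col, resid[2][2])
```

In this toy table the outlying value does not distort the fitted effects: the decomposition recovers the additive structure and leaves a residual of 95 in the contaminated cell, which is the robustness property that motivates median-based trend removal.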
TL;DR: In this article, the authors provide a proof of Granger's error correction model for fractionally cointegrated variables and point out a necessary assumption that has not been noted before, and propose a simpler, alternative error correction model which can be employed to estimate fractionally cointegrated systems in three steps.
Abstract: This note provides a proof of Granger's (1986) error correction model for fractionally cointegrated variables and points out a necessary assumption that has not been noted before. Moreover, a simpler, alternative error correction model is proposed which can be employed to estimate fractionally cointegrated systems in three steps.
TL;DR: The authors distinguish between three types of outliers in a one-way random effects model and propose simple rules for identifying such outliers and give an example which involves median-based statistics.
Abstract: We distinguish between three types of outliers in a one-way random effects model. These are formally described in terms of their position relative to the main part of the observations. We propose simple rules for identifying such outliers and give an example which involves median-based statistics.
TL;DR: In this article, it was shown that the problem of deciding K x K-satisfiability of formulas of modal depth two is already hard for nondeterministic exponential time, and provided a matching upper bound.
Abstract: The aim of this paper is to exemplify the complexity of the satisfiability problem of products of modal logics. Our main goal is to arouse interest for the main open problem in this area: a tight complexity bound for the satisfiability problem of the product K x K. At present, only non-elementary decision procedures for this problem are known. Our modest contribution is two-fold. We show that the problem of deciding K x K-satisfiability of formulas of modal depth two is already hard for nondeterministic exponential time, and provide a matching upper bound. For the full language, a new proof for decidability is given which combines filtration and selective generation techniques from modal logic. We put products of modal logics into an historic perspective and review the most important results.
TL;DR: In this paper business cycles are considered as a multivariate phenomenon and not as a univariate one determined e.g. by the GNP.
Abstract: In this paper business cycles are considered as a multivariate phenomenon and not as a univariate one determined e.g. by the GNP. The subject is to look for the number of phases of a business cycle, which can be motivated by the number of clusters in a given dataset of macroeconomic variables. Different approaches to distances in the data are tried in a fuzzy cluster analysis to pursue this goal.
TL;DR: This paper derives the limiting null distributions of the robust CUSUM-M test and the recursive CUSUM-M test for structural change of the coefficients of a linear regression model with long-memory disturbances.
Abstract: We derive the limiting null distribution of the robust CUSUM-M test and the recursive CUSUM-M test for structural change of the coefficients of a linear regression model with long-memory disturbances. It turns out that the asymptotic null distribution of the CUSUM-M statistic is a fractional Brownian bridge and the asymptotic null distribution of the recursive CUSUM-M statistic is fractional Brownian motion.
TL;DR: In this paper, positive estimators of the between-group variance are proposed and approximate confidence intervals for the variance are constructed. By Monte Carlo simulation, the bias and standard deviation of the proposed estimators are compared with the truncated versions of the maximum likelihood estimator, the restricted maximum likelihood (REML) estimator and a (lately) standard estimator in meta-analysis.
Abstract: Positive estimators of the between-group (between-study) variance are proposed. Explicit variance formulae for the estimators are given and approximate confidence intervals for the between-group variance are constructed, as our proposal to a long outstanding problem. By Monte Carlo simulation, the bias and standard deviation of the proposed estimators are compared with the truncated versions of the maximum likelihood (ML) estimator, restricted maximum likelihood (REML) estimator and a (lately) standard estimator in meta-analysis. Attained confidence coefficients of the constructed confidence intervals are also presented.
TL;DR: This paper compares three approaches to approximating the minimum number of misclassifications achievable with affine hyperplanes: the regression depth method of Rousseeuw and Hubert in linear regression models, a support vector machine approach proposed by Vapnik (1998), and a heuristic search algorithm.
Abstract: The minimum number of misclassifications achievable with affine hyperplanes on a given set of labeled points is a key quantity in both statistics and computational learning theory. However, determining this quantity exactly is essentially NP-hard, cf. Hofgen, Simon and van Horn (1995). Hence, there is a need to find reasonable approximation procedures. This paper compares three approaches to approximating the minimum number of misclassifications achievable with affine hyperplanes. The first approach is based on the regression depth method of Rousseeuw and Hubert (1999) in linear regression models. We compare the results of the regression depth method with the support vector machine approach proposed by Vapnik (1998) and a heuristic search algorithm.
TL;DR: Two nonlinear four-stage hierarchical models for a repeated measurement design and for repeated exposures to different doses are presented and the estimation of the individual and population parameters as well as of the covariance matrices is performed by an EM algorithm.
Abstract: A basic part in the risk assessment of potential carcinogens is the determination of toxicokinetic parameters. The partition of the xenobiotic in the body of experimental animals is a first step of the biochemical pathway of the formation of DNA adducts which might lead to the development of cancer. Fundamental in the extrapolation from one species to another is the characterisation of processes by means of population parameters. Nevertheless, the consideration of individual parameters varying between repeated experiments and different doses is of great importance to obtain a more precise insight into the variability structure of the process so that a valid basis for further research is the final result. Two nonlinear four-stage hierarchical models for a repeated measurement design and for repeated exposures to different doses are presented. The estimation of the individual and population parameters as well as of the covariance matrices is performed by an EM algorithm.
TL;DR: In this article, the authors show that the current use of Desirability Indices for optimisation in experimental design gives biased results in general, because analytical expressions for the distributions of Desirability Indices are unavailable, and they propose Monte Carlo estimators in place of analytical solutions to improve the estimation of Desirabilities.
Abstract: As will be shown, the current use of Desirability Indices for optimisation purposes in experimental design gives biased results in general. Researchers were satisfied with approximate solutions, as unbiased results would have required analytical expressions for the distributions of Desirability Indices. These expressions are unavailable. Today’s computing power allows the use of Monte Carlo estimators for estimating exact solutions instead of analytical solutions, and therefore improves the estimation process for Desirabilities.
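One way to see the bias described above is to compare a plug-in value of the index, evaluated at the mean responses, with a Monte Carlo estimate of its expectation. The sketch below is an illustration under assumptions that are not taken from the paper: a one-sided "larger is better" desirability function of Derringer-Suich type, an index formed as the geometric mean of the individual desirabilities, and independent normally distributed responses. All function names and numbers are hypothetical.

```python
import math
import random

def desirability(y, low, high, r=1.0):
    """One-sided desirability: 0 below low, 1 above high, power curve in between."""
    if y <= low:
        return 0.0
    if y >= high:
        return 1.0
    return ((y - low) / (high - low)) ** r

def desirability_index(ys, specs):
    """Geometric mean of the individual desirabilities; specs holds (low, high) pairs."""
    ds = [desirability(y, *spec) for y, spec in zip(ys, specs)]
    return math.prod(ds) ** (1.0 / len(ds))

def monte_carlo_expected_index(means, sds, specs, n_draws=20000, seed=1):
    """Monte Carlo estimate of E[index] for independent normal responses (assumption)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        draw = [rng.gauss(m, s) for m, s in zip(means, sds)]
        total += desirability_index(draw, specs)
    return total / n_draws

# Hypothetical setting: two responses, each scaled so that 0 is useless and 1 is ideal.
specs = [(0.0, 1.0), (0.0, 1.0)]
means = [0.8, 0.8]
sds = [0.2, 0.2]
plug_in = desirability_index(means, specs)            # index evaluated at the means
mc = monte_carlo_expected_index(means, sds, specs)    # estimate of the expected index
print(plug_in, mc)
```

Because the index is a nonlinear, clipped function of the responses, the value at the means and the mean of the values differ; in this setting the plug-in value overstates the expected index.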
TL;DR: In this article, the authors present the application of unsupervised neural networks (self-organizing maps) to different domains, such as sleep apnea discovery, protein sequence analysis and tumor classification.
Abstract: This paper presents the application of special unsupervised neural networks (self-organizing maps) to different domains, such as sleep apnea discovery, protein sequence analysis and tumor classification. An enhancement of the original algorithm, as well as the introduction of several hierarchical levels, enables the discovery of complex structures as present in this type of application. Furthermore, an integration of unsupervised neural networks with hidden Markov models is proposed.
TL;DR: In this paper, the authors derive the probability limit of the standard Dickey-Fuller test in the context of an exponential random walk, which is useful in interpreting tests for unit roots when the test is inadvertently applied to the levels of the data when the true random walk is in the logs.
Abstract: We derive the probability limit of the standard Dickey-Fuller test in the context of an exponential random walk. This result might be useful in interpreting tests for unit roots when the test is inadvertently applied to the levels of the data when the true random walk is in the logs.
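The situation described above can be explored numerically with a small sketch: simulate a random walk in the logs, exponentiate it, and compute a Dickey-Fuller t-statistic for both series. The version of the test used here (regression of the first difference on the lagged level without constant or trend), the function name `dickey_fuller_t` and the simulation settings are assumptions for illustration; the abstract does not say which variant the paper analyses.

```python
import math
import random

def dickey_fuller_t(y):
    """t-statistic of rho in the regression  dy_t = rho * y_{t-1} + e_t  (no constant)."""
    lagged = y[:-1]
    diffs = [b - a for a, b in zip(y[:-1], y[1:])]
    s_ll = sum(v * v for v in lagged)
    rho = sum(l * d for l, d in zip(lagged, diffs)) / s_ll
    ssr = sum((d - rho * l) ** 2 for l, d in zip(lagged, diffs))
    s2 = ssr / (len(diffs) - 1)          # residual variance, one estimated parameter
    return rho / math.sqrt(s2 / s_ll)

# Hypothetical simulation: the true random walk is in the logs.
rng = random.Random(42)
log_series = [0.0]
for _ in range(500):
    log_series.append(log_series[-1] + rng.gauss(0, 0.05))
levels = [math.exp(v) for v in log_series]

t_logs = dickey_fuller_t(log_series[1:])   # drop the zero start value
t_levels = dickey_fuller_t(levels)         # test inadvertently applied to the levels
print(t_logs, t_levels)
```

Repeating the simulation over many seeds and sample sizes would show how the statistic computed from the levels behaves relative to the one computed from the logs.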
TL;DR: In this article, an approach is presented to determine the processes of uptake, elimination, and metabolism of the gas ethylene, an important industrial chemical, which is classified in category 3 of carcinogenic substances in the German list of MAK- and BAT-values.
Abstract: A basic step in the risk assessment of potential carcinogens is the determination of toxicokinetic parameters. The present approach is part of a strategy to determine the processes of uptake, elimination, and metabolism of the gas ethylene, an important industrial chemical, which is classified in category 3 of carcinogenic substances in the German list of MAK- and BAT-values. This paper deals with the calibration, which is indispensable to determine the decline of atmospheric concentrations of ethylene within a broad range of initial concentrations applied in an inhalation experiment.
TL;DR: In this paper, it is shown empirically that asymmetric GARCH models, which capture the relationship between first and second moments (asymmetry), increase explanatory power mainly for stock returns and less so for exchange rate changes, in line with the leverage hypothesis.
Abstract: Modern variants of ARCH and GARCH models for capital market data (e.g. EGARCH, GJR-GARCH, A-PARCH) take into account not only the second moments but also the relationship between first and second moments (asymmetry). One explanation of this asymmetry is the leverage hypothesis. The leverage effect, however, is relevant only for certain kinds of capital market data and can therefore serve as an explanation of the asymmetry only for those. This paper shows empirically that asymmetric GARCH models do indeed increase explanatory power mainly for stock returns and less so for rates of change of exchange rates.
TL;DR: In this article, a class of self-designing clinical trials is considered which according to an effective but simple, finite learning algorithm consists of automatically adaptively planned weighted group sequential trials with a decision about rejection of the null-hypothesis at each step, but the full level-α-test at the end of the study preserved.
Abstract: A class of self-designing clinical trials is considered which according to an effective but simple, finite learning algorithm consists of automatically adaptively planned weighted group sequential trials with a decision about rejection of the null-hypothesis at each step, but the full level-α-test at the end of the study preserved.
TL;DR: In this paper, the authors describe the principles of polymerase chain reaction (PCR) and its expanding use in molecular genetic research and molecular medicine, and give a statistical model for the PCR and discuss estimation methods to quantify the lack of PCR accuracy.
Abstract: In this paper we describe the principles of polymerase chain reaction (PCR) and its expanding use in molecular genetic research and molecular medicine. A short introduction of exemplary applications of the PCR is connected with a discussion of the lack of PCR accuracy. We give a statistical model for the PCR and discuss estimation methods in order to quantify the lack of PCR accuracy.