TL;DR: In this paper, two nonparametric imputation schemes are considered: risk set imputation and Kaplan-Meier imputation, where the censored time is replaced by a random draw of the observed times amongst those at risk after the censoring time.
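The risk set scheme described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function name `risk_set_impute`, the choice to copy the donor's event status along with its time, and the rule of leaving a value censored when the risk set is empty are all assumptions made for the example.

```python
import random

def risk_set_impute(times, events, rng):
    """Replace each censored time with a draw from its risk set.

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if censored
    Returns new (times, events) lists; the inputs are left untouched.
    """
    imputed_times, imputed_events = list(times), list(events)
    for i, (t, d) in enumerate(zip(times, events)):
        if d == 1:
            continue  # observed event, nothing to impute
        # Risk set: subjects still under observation after censoring time t.
        risk_set = [j for j, u in enumerate(times) if u > t]
        if not risk_set:
            continue  # assumption: nobody at risk later, keep it censored
        j = rng.choice(risk_set)
        # Assumption: copy the donor's time and status, so a draw may
        # itself still be censored.
        imputed_times[i], imputed_events[i] = times[j], events[j]
    return imputed_times, imputed_events

times = [2.0, 3.5, 4.0, 6.0, 7.5]
events = [1, 0, 1, 0, 1]
new_times, new_events = risk_set_impute(times, events, random.Random(0))
```

Repeating the draw with different seeds would give the multiple imputed data sets that a multiple imputation analysis needs.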
TL;DR: A multiple imputation method is presented for sensitivity analyses of time-to-event data with possibly informative censoring; a variety of specifications regarding the post-discontinuation tendency of having events can be incorporated in the imputation.
Abstract: This article presents a multiple imputation method for sensitivity analyses of time-to-event data with possibly informative censoring. The imputed time for censored values is drawn from the failure time distribution conditional on the time of follow-up discontinuation. A variety of specifications regarding the post-discontinuation tendency of having events can be incorporated in the imputation through a hazard ratio parameter for discontinuation versus continuation of follow-up. Multiple-imputed data sets are analyzed with the primary analysis method, and the results are then combined using the methods of Rubin. An illustrative example is provided.
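The combining step refers to Rubin's rules, which pool one point estimate and one variance per imputed data set. A minimal sketch follows; the function name `pool_rubin` and the example numbers (say, log hazard ratios and their squared standard errors) are hypothetical and not taken from the article.

```python
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates with Rubin's rules.

    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    q_bar = mean(estimates)   # pooled point estimate
    w = mean(variances)       # within-imputation variance
    b = variance(estimates)   # between-imputation variance (m - 1 denominator)
    t = w + (1 + 1 / m) * b   # total variance
    return q_bar, t

# Hypothetical results from m = 3 imputed data sets.
estimates = [-0.30, -0.25, -0.35]
variances = [0.010, 0.012, 0.011]
pooled, total_var = pool_rubin(estimates, variances)
```

The sketch omits the degrees-of-freedom calculation used for confidence intervals and needs at least two imputations, since the between-imputation variance is undefined for a single estimate.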
TL;DR: In this article, a machine learning method for data imputation of GPS time series, missForest, was used to restore the information of missing values in the GPS data and improve the results of related statistical processes, such as PCA.
Abstract: The global positioning system (GPS) can provide daily coordinate time series to support geodetic and geophysical studies. However, due to logistics and malfunctions, missing values are common in GPS time series, especially in polar regions. Acquiring a consistent and complete time series is a prerequisite for accurate and reliable statistical analysis. Previous imputation studies focused on the temporal relationship of time series, and only a few studies used spatial relationships and/or were based on machine learning methods. In this study, we impute 20 Greenland GPS time series using missForest, which is a new machine learning method for data imputation. The imputation performance of missForest and that of four traditional methods are assessed, and the methods' impacts on principal component analysis (PCA) are investigated. Results show that missForest can impute gaps longer than 30 days, and its imputed time series has the least influence on PCA. When the gap size is 30 days, the mean absolute difference between the imputed and true values for missForest is 2.71 mm, the normalized root mean squared error is 0.065, and the distance of the first principal component is 0.013. missForest outperforms the other compared methods. It can effectively restore the information of GPS time series and improve the results of related statistical processes, such as PCA.
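The two error measures quoted above can be computed as in the sketch below. The abstract does not state how the root mean squared error is normalized; dividing the mean squared error by the variance of the true values, as in the original missForest paper, is an assumption here, and the function names and sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, pvariance

def mae(true, imputed):
    """Mean absolute difference between true and imputed values."""
    return mean(abs(t - x) for t, x in zip(true, imputed))

def nrmse(true, imputed):
    """RMSE normalized by the variance of the true values (assumed convention)."""
    mse = mean((t - x) ** 2 for t, x in zip(true, imputed))
    return sqrt(mse / pvariance(true))

# Hypothetical withheld values (mm) and their imputations.
true_vals = [1.0, 2.0, 3.0, 4.0]
imputed_vals = [1.5, 2.0, 2.5, 4.0]
mae_value = mae(true_vals, imputed_vals)
nrmse_value = nrmse(true_vals, imputed_vals)
```

Both functions are meant to be evaluated only on positions that were artificially removed, so that the true values are known.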
TL;DR: In this article, a method of generating imputed location fixes of a mobile communication device is proposed. The method comprises accessing time-stamped event data created by the device, determining a plurality of location fixes, analyzing the location fixes to determine a plurality of location clusters and their centroids, and determining a travel route of the device based on the analysis of the location centroids and the time stamps of the event data.
Abstract: A method of generating imputed location fixes of a mobile communication device. The method comprises accessing event data created by a mobile communication device that comprises a time stamp, determining a plurality of location fixes of the mobile communication device, analyzing the location fixes to determine a plurality of location clusters associated with the mobile communication device, determining a location centroid of each location cluster, analyzing the location centroids and the time stamps of the event data, determining a travel route of the mobile communication device based on the analysis of location centroids and time stamps of event data, and determining a plurality of imputed location fixes of the mobile communication device at positions along the travel route of the mobile communication device, where the imputed location fixes comprise an imputed location and an imputed time stamp.
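The final step, placing imputed fixes along the travel route, can be illustrated with a short sketch. The abstract does not say how positions and time stamps along the route are chosen; linear interpolation between consecutive time-stamped centroids is an assumption made for this example, and `impute_fixes`, the waypoint tuple layout, and the coordinates are hypothetical.

```python
def impute_fixes(route, step):
    """Generate imputed fixes between consecutive route waypoints.

    route : list of (timestamp_seconds, lat, lon) tuples sorted by time
    step  : spacing in seconds between imputed fixes
    Returns a list of (timestamp, lat, lon) tuples strictly between waypoints.
    """
    fixes = []
    for (t0, lat0, lon0), (t1, lat1, lon1) in zip(route, route[1:]):
        t = t0 + step
        while t < t1:
            frac = (t - t0) / (t1 - t0)  # fraction of the leg already travelled
            fixes.append((t,
                          lat0 + frac * (lat1 - lat0),
                          lon0 + frac * (lon1 - lon0)))
            t += step
    return fixes

# Two hypothetical centroids, ten minutes apart.
route = [(0, 40.0, -75.0), (600, 40.1, -75.2)]
fixes = impute_fixes(route, step=200)
```

Interpolating latitude and longitude directly is only a rough approximation over short legs; a real route would follow roads or great-circle paths.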
TL;DR: In this article, a detailed analysis is presented of the time series of a GRID quality parameter that specifies the average execution time of one job waiting in a cluster queue, and the seasonal variation of the parameter value is shown graphically.
Abstract: The paper focuses on a GRID quality parameter that specifies the average execution time of one job waiting in a cluster queue. A detailed analysis is presented of the time series of this parameter, obtained from the monitoring systems of the academic BalticGrid network. Seasonal variation of the parameter value is shown graphically. Problematic forecasting cases for this parameter are reviewed. A forecasting methodology that involves a data imputation model is proposed. An experiment is carried out using forecasting methods that prevail in practice (AR, nAR, ARMA, SARIMA, CF technique). Ill. 10, bibl. 10, tabl. 1 (in English; abstracts in English and Lithuanian). http://dx.doi.org/10.5755/j01.eee.114.8.706