TL;DR: An overview of statistical decision theory, which emphasizes the use and application of the philosophical ideas and mathematical structure of decision theory.
Abstract: 1. Basic concepts 2. Utility and loss 3. Prior information and subjective probability 4. Bayesian analysis 5. Minimax analysis 6. Invariance 7. Preposterior and sequential analysis 8. Complete and essentially complete classes Appendices.
TL;DR: Bayes factors have been advocated as superior to p-values for assessing statistical evidence in data, and they have been widely used in the literature for assessing power law and skill acquisition.
TL;DR: The libFM tool is a software implementation for factorization machines that features stochastic gradient descent (SGD) and alternating least-squares (ALS) optimization, as well as Bayesian inference using Markov Chain Monte Carlo (MCMC).
Abstract: Factorization approaches provide high accuracy in several important prediction problems, for example, recommender systems. However, applying factorization approaches to a new prediction problem is a nontrivial task and requires a lot of expert knowledge. Typically, a new model is developed, a learning algorithm is derived, and the approach has to be implemented. Factorization machines (FM) are a generic approach since they can mimic most factorization models just by feature engineering. This way, factorization machines combine the generality of feature engineering with the superiority of factorization models in estimating interactions between categorical variables of large domain. libFM is a software implementation for factorization machines that features stochastic gradient descent (SGD) and alternating least-squares (ALS) optimization, as well as Bayesian inference using Markov Chain Monte Carlo (MCMC). This article summarizes the recent research on factorization machines both in terms of modeling and learning, provides extensions for the ALS and MCMC algorithms, and describes the software tool libFM.
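As an illustration of how a factorization machine scores one feature vector, here is a minimal sketch of the standard second-order FM model, in which the interaction weight of features i and j is the inner product of their factor vectors. This is not libFM's code or API; the function names and the toy one-hot input are assumptions made for the example.

```python
import random

def fm_predict(x, w0, w, V):
    """Second-order FM prediction using the O(n*k) reformulation.

    x: feature values, w0: bias, w: linear weights,
    V: one k-dimensional factor vector per feature.
    """
    n, k = len(x), len(V[0])
    linear = w0 + sum(w[i] * x[i] for i in range(n))
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))          # sum_i v_if x_i
        s2 = sum((V[i][f] * x[i]) ** 2 for i in range(n))  # sum_i v_if^2 x_i^2
        pairwise += 0.5 * (s * s - s2)
    return linear + pairwise

def fm_predict_naive(x, w0, w, V):
    """Same model with the explicit double loop over feature pairs (for checking)."""
    n = len(x)
    total = w0 + sum(w[i] * x[i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            total += dot * x[i] * x[j]
    return total

# Hypothetical toy input: 3 users and 3 items, one-hot encoded (user 1, item 1).
rng = random.Random(0)
n_features, k = 6, 3
x = [0.0, 1.0, 0.0, 0.0, 1.0, 0.0]
w0 = 0.1
w = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
V = [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(n_features)]

fast = fm_predict(x, w0, w, V)
slow = fm_predict_naive(x, w0, w, V)
```

With a one-hot user and a one-hot item as the only nonzero features, the pairwise term reduces to a single inner product of a user factor and an item factor, which is how feature engineering lets an FM mimic a matrix factorization model.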
TL;DR: It is shown that PS and SS sampling substantially outperform these estimators and adjust the conclusions made concerning previous analyses for the three real-world data sets that were reanalyzed.
Abstract: Recent developments in marginal likelihood estimation for model selection in the field of Bayesian phylogenetics and molecular evolution have emphasized the poor performance of the harmonic mean estimator (HME). Although these studies have shown the merits of new approaches applied to standard normally distributed examples and small real-world data sets, not much is currently known concerning the performance and computational issues of these methods when fitting complex evolutionary and population genetic models to empirical real-world data sets. Further, these approaches have not yet seen widespread application in the field due to the lack of implementations of these computationally demanding techniques in commonly used phylogenetic packages. We here investigate the performance of some of these new marginal likelihood estimators, specifically, path sampling (PS) and stepping-stone (SS) sampling for comparing models of demographic change and relaxed molecular clocks, using synthetic data and real-world examples for which unexpected inferences were made using the HME. Given the drastically increased computational demands of PS and SS sampling, we also investigate a posterior simulation-based analogue of Akaike’s information criterion (AIC) through Markov chain Monte Carlo (MCMC), a model comparison approach that shares with the HME the appealing feature of having a low computational overhead over the original MCMC analysis. We confirm that the HME systematically overestimates the marginal likelihood and fails to yield reliable model classification and show that the AICM performs better and may be a useful initial evaluation of model choice but that it is also, to a lesser degree, unreliable. We show that PS and SS sampling substantially outperform these estimators and adjust the conclusions made concerning previous analyses for the three real-world data sets that we reanalyzed. 
The methods used in this article are now available in BEAST, a powerful user-friendly software package to perform Bayesian evolutionary analyses.
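For reference, the estimators compared in this abstract can be written in their standard textbook form. The notation below (data y, parameters θ, power posterior p_β) is ours and is not taken from the paper.

```latex
% Harmonic mean estimator, with draws \theta_i from the posterior p(\theta \mid y):
\hat{p}_{\mathrm{HME}}(y) = \left[ \frac{1}{N} \sum_{i=1}^{N} \frac{1}{p(y \mid \theta_i)} \right]^{-1}

% Power posterior used by both path sampling and stepping-stone sampling:
p_{\beta}(\theta \mid y) \propto p(y \mid \theta)^{\beta}\, p(\theta), \qquad 0 \le \beta \le 1

% Path sampling (thermodynamic integration):
\log p(y) = \int_{0}^{1} \mathbb{E}_{p_{\beta}}\!\left[ \log p(y \mid \theta) \right] d\beta

% Stepping-stone sampling over a grid 0 = \beta_0 < \beta_1 < \dots < \beta_K = 1:
p(y) = \prod_{k=1}^{K} r_k, \qquad
r_k = \mathbb{E}_{p_{\beta_{k-1}}}\!\left[ p(y \mid \theta)^{\beta_k - \beta_{k-1}} \right]
```

At β = 0 the power posterior is the prior itself, so both PS and SS need a prior that can be sampled from, that is, a proper prior.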
TL;DR: Introduction: Probability and Parameters Probability Probability distributions Calculating properties of probability distributions Monte Carlo integration Monte Carlo Simulations Using BUGS Using BUGS to simulate from distributions Transformations of random variables Complex calculations using Monte Carlo Multivariate Monte Carlo analysis Predictions with unknown parameters
Abstract: Introduction: Probability and Parameters Probability Probability distributions Calculating properties of probability distributions Monte Carlo integration Monte Carlo Simulations Using BUGS Introduction to BUGS DoodleBUGS Using BUGS to simulate from distributions Transformations of random variables Complex calculations using Monte Carlo Multivariate Monte Carlo analysis Predictions with unknown parameters Introduction to Bayesian Inference Bayesian learning Posterior predictive distributions Conjugate Bayesian inference Inference about a discrete parameter Combinations of conjugate analyses Bayesian and classical methods Introduction to Markov Chain Monte Carlo Methods Bayesian computation Initial values Convergence Efficiency and accuracy Beyond MCMC Prior Distributions Different purposes of priors Vague, 'objective' and 'reference' priors Representation of informative priors Mixture of prior distributions Sensitivity analysis Regression Models Linear regression with normal errors Linear regression with non-normal errors Nonlinear regression with normal errors Multivariate responses Generalised linear regression models Inference on functions of parameters Further reading Categorical Data 2 x 2 tables Multinomial models Ordinal regression Further reading Model Checking and Comparison Introduction Deviance Residuals Predictive checks and Bayesian p-values Model assessment by embedding in larger models Model comparison using deviances Bayes factors Model uncertainty Discussion on model comparison Prior-data conflict Issues in Modelling Missing data Prediction Measurement error Cutting feedback New distributions Censored, truncated and grouped observations Constrained parameters Bootstrapping Ranking Hierarchical Models Exchangeability Priors Hierarchical regression models Hierarchical models for variances Redundant parameterisations More general formulations Checking of hierarchical models Comparison of hierarchical models Further resources Specialised 
Models Time-to-event data Time series models Spatial models Evidence synthesis Differential equation and pharmacokinetic models Finite mixture and latent class models Piecewise parametric models Bayesian nonparametric models Different Implementations of BUGS Introduction BUGS engines and interfaces Expert systems and MCMC methods Classic BUGS WinBUGS OpenBUGS JAGS A Appendix: BUGS Language Syntax Introduction Distributions Deterministic functions Repetition Multivariate quantities Indexing Data transformations Commenting B Appendix: Functions in BUGS Standard functions Trigonometric functions Matrix algebra Distribution utilities and model checking Functionals and differential equations Miscellaneous C Appendix: Distributions in BUGS Continuous univariate, unrestricted range Continuous univariate, restricted to be positive Continuous univariate, restricted to a finite interval Continuous multivariate distributions Discrete univariate distributions Discrete multivariate distributions Bibliography Index
TL;DR: Technical aspects are not the focus of Principles of Applied Statistics, so this also explains why it does not dwell intently on nonparametric models.
Abstract: Paperback: 276 pages Publisher: Cambridge University Press and Institute of Mathematical Statistics Year: 2010 Language: English ISBN-13: 978-0-5211-9249-1 Large-Scale Inference: Empirical Bayes Me...
TL;DR: This work shows how to construct appropriate summary statistics for ABC in a semi‐automatic manner, and shows that optimal summary statistics are the posterior means of the parameters.
Abstract: Summary. Many modern statistical applications involve inference for complex stochastic models, where it is easy to simulate from the models, but impossible to calculate likelihoods. Approximate Bayesian computation (ABC) is a method of inference for such models. It replaces calculation of the likelihood by a step which involves simulating artificial data for different parameter values, and comparing summary statistics of the simulated data with summary statistics of the observed data. Here we show how to construct appropriate summary statistics for ABC in a semi-automatic manner. We aim for summary statistics which will enable inference about certain parameters of interest to be as accurate as possible. Theoretical results show that optimal summary statistics are the posterior means of the parameters. Although these cannot be calculated analytically, we use an extra stage of simulation to estimate how the posterior means vary as a function of the data; and we then use these estimates of our summary statistics within ABC. Empirical results show that our approach is a robust method for choosing summary statistics that can result in substantially more accurate ABC analyses than the ad hoc choices of summary statistics that have been proposed in the literature. We also demonstrate advantages over two alternative methods of simulation-based inference.
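The basic ABC step described above (simulate data for candidate parameter values, then compare summary statistics) can be sketched with plain rejection sampling. This is not the paper's semi-automatic construction: the Gaussian toy model, the uniform prior, the tolerance, and the hand-picked summary (the sample mean) are all assumptions for illustration.

```python
import random
import statistics

def abc_rejection(observed, prior_sample, simulate, summary, tol, n_draws, rng):
    """Keep prior draws whose simulated summary lies within tol of the observed one."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)                 # draw a candidate parameter
        s_sim = summary(simulate(theta, rng))     # simulate data, no likelihood needed
        if abs(s_sim - s_obs) <= tol:
            accepted.append(theta)
    return accepted

rng = random.Random(1)

# Toy problem: unknown mean of a unit-variance Gaussian, 50 observations.
observed = [rng.gauss(2.0, 1.0) for _ in range(50)]

posterior_draws = abc_rejection(
    observed,
    prior_sample=lambda r: r.uniform(-5.0, 5.0),
    simulate=lambda mu, r: [r.gauss(mu, 1.0) for _ in range(50)],
    summary=statistics.fmean,
    tol=0.1,
    n_draws=5000,
    rng=rng,
)
abc_mean = statistics.fmean(posterior_draws)
```

In this toy model the sample mean is also approximately the posterior mean of the parameter under the flat prior, which matches the abstract's statement about which summaries are optimal; in realistic models that quantity has to be estimated, which is what the extra simulation stage is for.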
TL;DR: A comparison with recent implementations of path sampling and stepping-stone sampling shows reassuringly that MAP identification and its Bayes factor provide similar performance to PS and SS and that these approaches considerably outperform HME, sHME, and AICM in selecting the correct underlying clock model.
Abstract: Recent implementations of path sampling (PS) and stepping-stone sampling (SS) have been shown to outperform the harmonic mean estimator (HME) and a posterior simulation-based analog of Akaike’s information criterion through Markov chain Monte Carlo (AICM), in Bayesian model selection of demographic and molecular clock models. Almost simultaneously, a Bayesian model averaging approach was developed that avoids conditioning on a single model but averages over a set of relaxed clock models. This approach returns estimates of the posterior probability of each clock model through which one can estimate the Bayes factor in favor of the maximum a posteriori (MAP) clock model; however, this Bayes factor estimate may suffer when the posterior probability of the MAP model approaches 1. Here, we compare these two recent developments with the HME, stabilized/smoothed HME (sHME), and AICM, using both synthetic and empirical data. Our comparison shows reassuringly that MAP identification and its Bayes factor provide similar performance to PS and SS and that these approaches considerably outperform HME, sHME, and AICM in selecting the correct underlying clock model. We also illustrate the importance of using proper priors on a large set of empirical data sets.
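The caveat about the MAP model's posterior probability approaching 1 can be seen in a small sketch. It assumes the posterior probability is estimated as the fraction of MCMC samples spent in the MAP model and that the Bayes factor compares the MAP model against the remaining models taken together; the function name and arguments are hypothetical.

```python
import math

def bayes_factor_from_visits(visits_map, n_samples, prior_prob):
    """Bayes factor = posterior odds / prior odds for the MAP model.

    visits_map: number of MCMC samples in which the chain sat in the MAP model.
    prior_prob: prior probability of the MAP model (strictly between 0 and 1).
    """
    post = visits_map / n_samples
    prior_odds = prior_prob / (1.0 - prior_prob)
    if post >= 1.0:
        # The chain never left the MAP model: the estimate is unbounded.
        return math.inf
    return (post / (1.0 - post)) / prior_odds

bf_moderate = bayes_factor_from_visits(9000, 10000, prior_prob=0.5)
bf_extreme = bayes_factor_from_visits(9999, 10000, prior_prob=0.5)
bf_degenerate = bayes_factor_from_visits(10000, 10000, prior_prob=0.5)
```

Near the boundary the estimate rests on a handful of samples outside the MAP model, so a change of one visit moves the Bayes factor substantially, and with no such samples it cannot be computed at all.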
TL;DR: This book gives the reader a thorough appreciation of asymptotics through the use of lots of practical examples and down-to-earth explanations and shows the application to statistical inference.
Abstract: and shows the application to statistical inference. Saddle point approximations such as the method of Darboux and Hayman’s approximation and application of these methods cause the reader to be an active participant rather than a passive learner. The final chapter is devoted to the summation of series and addresses methods for accelerating the speed of convergence for these methods. Probably the major goal of this book is to introduce the hows and whys of asymptotic theory which are seldom taught in the traditional asymptotic courses at the doctoral level. This book gives the reader a thorough appreciation of asymptotics through the use of lots of practical examples and down-to-earth explanations. While it may not be able to serve as an essential text, students may find it very useful as a reference book.
TL;DR: This work introduces MT-DREAM(ZS), which combines the strengths of multiple-try sampling, snooker updating, and sampling from an archive of past states; it is designed for high-dimensional search problems and shows a particularly large performance improvement over other adaptive MCMC approaches when using distributed computing.
Abstract: Spatially distributed hydrologic models are increasingly being used to study and predict soil moisture flow, groundwater recharge, surface runoff, and river discharge. The usefulness and applicability of such complex models is increasingly held back by the potentially many hundreds (thousands) of parameters that require calibration against some historical record of data. The current generation of search and optimization algorithms is typically not powerful enough to deal with a very large number of variables and summarize parameter and predictive uncertainty. We have previously presented a general-purpose Markov chain Monte Carlo (MCMC) algorithm for Bayesian inference of the posterior probability density function of hydrologic model parameters. This method, entitled differential evolution adaptive Metropolis (DREAM), runs multiple different Markov chains in parallel and uses a discrete proposal distribution to evolve the sampler to the posterior distribution. The DREAM approach maintains detailed balance and shows excellent performance on complex, multimodal search problems. Here we present our latest algorithmic developments and introduce MT-DREAM(ZS), which combines the strengths of multiple-try sampling, snooker updating, and sampling from an archive of past states. This new code is especially designed to solve high-dimensional search problems and achieves a particularly spectacular performance improvement over other adaptive MCMC approaches when using distributed computing. Four different case studies with increasing dimensionality up to 241 parameters are used to illustrate the advantages of MT-DREAM(ZS).
TL;DR: This work addresses the solution of large-scale statistical inverse problems in the framework of Bayesian inference with a so-called Stochastic Monte Carlo method.
Abstract: We address the solution of large-scale statistical inverse problems in the framework of Bayesian inference. The Markov chain Monte Carlo (MCMC) method is the most popular approach for sampling the posterior probability distribution that describes the solution of the statistical inverse problem. MCMC methods face two central difficulties when applied to large-scale inverse problems: first, the forward models (typically in the form of partial differential equations) that map uncertain parameters to observable quantities make the evaluation of the probability density at any point in parameter space very expensive; and second, the high-dimensional parameter spaces that arise upon discretization of infinite-dimensional parameter fields make the exploration of the probability density function prohibitive. The challenge for MCMC methods is to construct proposal functions that simultaneously provide a good approximation of the target density while being inexpensive to manipulate. Here we present a so-called Stoch...
TL;DR: This work explores the idea that saccadic eye movements are optimal experiments, in which data are gathered to test hypotheses or beliefs about how those data are caused, and provides a plausible model of visual search that can be motivated from the basic principles of self-organized behavior.
Abstract: If perception corresponds to hypothesis testing (Gregory, 1980); then visual searches might be construed as experiments that generate sensory data. In this work, we explore the idea that saccadic eye movements are optimal experiments, in which data are gathered to test hypotheses or beliefs about how those data are caused. This provides a plausible model of visual search that can be motivated from the basic principles of self-organized behavior: namely, the imperative to minimize the entropy of hidden states of the world and their sensory consequences. This imperative is met if agents sample hidden states of the world efficiently. This efficient sampling of salient information can be derived in a fairly straightforward way, using approximate Bayesian inference and variational free-energy minimization. Simulations of the resulting active inference scheme reproduce sequential eye movements that are reminiscent of empirically observed saccades and provide some counterintuitive insights into the way that sensory evidence is accumulated or assimilated into beliefs about the world.
TL;DR: This work presents a novel method for joint inversion of receiver functions and surface wave dispersion data, using a transdimensional Bayesian formulation and shows that the Hierarchical Bayes procedure is a powerful tool in this situation, able to evaluate the level of information brought by different data types in the misfit, thus removing the arbitrary choice of weighting factors.
Abstract: We present a novel method for joint inversion of receiver functions and surface wave dispersion data, using a transdimensional Bayesian formulation. This class of algorithm treats the number of model parameters (e.g. number of layers) as an unknown in the problem. The dimension of the model space is variable and a Markov chain Monte Carlo (McMC) scheme is used to provide a parsimonious solution that fully quantifies the degree of knowledge one has about seismic structure (i.e. constraints on the model, resolution, and trade-offs). The level of data noise (i.e. the covariance matrix of data errors) effectively controls the information recoverable from the data and here it naturally determines the complexity of the model (i.e. the number of model parameters). However, it is often difficult to quantify the data noise appropriately, particularly in the case of seismic waveform inversion where data errors are correlated. Here we address the issue of noise estimation using an extended Hierarchical Bayesian formulation, which allows both the variance and covariance of data noise to be treated as unknowns in the inversion. In this way it is possible to let the data infer the appropriate level of data fit. In the context of joint inversions, assessment of uncertainty for different data types becomes crucial in the evaluation of the misfit function. We show that the Hierarchical Bayes procedure is a powerful tool in this situation, because it is able to evaluate the level of information brought by different data types in the misfit, thus removing the arbitrary choice of weighting factors. After illustrating the method with synthetic tests, a real data application is shown where teleseismic receiver functions and ambient noise surface wave dispersion measurements from the WOMBAT array (South-East Australia) are jointly inverted to provide a probabilistic 1D model of shear-wave velocity beneath a given station.
TL;DR: This article provides a simple and intuitive derivation of the Kalman filter, with the aim of teaching this useful tool to students from disciplines that do not require a strong mathematical background.
Abstract: This article provides a simple and intuitive derivation of the Kalman filter, with the aim of teaching this useful tool to students from disciplines that do not require a strong mathematical background. The most complicated level of mathematics required to understand this derivation is the ability to multiply two Gaussian functions together and reduce the result to a compact form. The Kalman filter is over 50 years old but is still one of the most important and common data fusion algorithms in use today. Named after Rudolf E. Kalman, the great success of the Kalman filter is due to its small computational requirement, elegant recursive properties, and its status as the optimal estimator for one-dimensional linear systems with Gaussian error statistics [1]. Typical uses of the Kalman filter include smoothing noisy data and providing estimates of parameters of interest. Applications include global positioning system receivers, phase-locked loops in radio equipment, smoothing the output from laptop trackpads, and many more. From a theoretical standpoint, the Kalman filter is an algorithm permitting exact inference in a linear dynamical system, which is a Bayesian model similar to a hidden Markov model but where the state space of the latent variables is continuous and where all latent and observed variables have a Gaussian distribution (often a multivariate Gaussian distribution). The aim of this lecture note is to permit people who find this description confusing or terrifying to understand the basis of the Kalman filter via a simple and intuitive derivation.
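The "multiply two Gaussians" step can be made concrete with a one-dimensional sketch. The measurement update below is the product of the predicted-state Gaussian and the measurement Gaussian, written in gain form; the constant-velocity prediction step, the noise variances, and the measurement values are assumptions chosen for illustration and are not taken from the article.

```python
def fuse(mu_pred, var_pred, z, var_meas):
    """Measurement update: product of N(mu_pred, var_pred) and N(z, var_meas)."""
    gain = var_pred / (var_pred + var_meas)   # Kalman gain in one dimension
    mu = mu_pred + gain * (z - mu_pred)       # mean of the product Gaussian
    var = (1.0 - gain) * var_pred             # variance of the product Gaussian
    return mu, var

def predict(mu, var, velocity, dt, var_process):
    """Prediction step for an assumed constant-velocity motion model."""
    return mu + velocity * dt, var + var_process

# Track a position from three noisy measurements (hypothetical values).
mu, var = 0.0, 4.0
history = []
for z in [1.2, 1.9, 3.1]:
    mu, var = predict(mu, var, velocity=1.0, dt=1.0, var_process=0.25)
    mu, var = fuse(mu, var, z, var_meas=1.0)
    history.append((mu, var))
```

The fused variance is always smaller than both the predicted variance and the measurement variance, which is the sense in which combining the two sources of information helps.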
TL;DR: This tutorial describes the mean-field variational Bayesian approximation to inference in graphical models, using modern machine learning terminology rather than statistical physics concepts, and derives local node updates and reviews the recent Variational Message Passing framework.
Abstract: This tutorial describes the mean-field variational Bayesian approximation to inference in graphical models, using modern machine learning terminology rather than statistical physics concepts. It begins by seeking to find an approximate mean-field distribution close to the target joint in the KL-divergence sense. It then derives local node updates and reviews the recent Variational Message Passing framework.
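The mean-field construction summarized above can be stated in a few lines. The notation (observations x, latent variables z, factors q_j) is ours rather than the tutorial's, and the update shown is the standard coordinate-wise result.

```latex
% Mean-field family: the approximation factorizes over groups of latent variables.
q(z) = \prod_{j} q_j(z_j)

% Objective: minimize the KL divergence from q to the posterior, which is
% equivalent to maximizing the lower bound \mathcal{L}(q) on the log evidence.
\log p(x) = \mathcal{L}(q) + \mathrm{KL}\!\left( q(z) \,\|\, p(z \mid x) \right),
\qquad
\mathcal{L}(q) = \mathbb{E}_{q}\!\left[ \log p(x, z) \right] - \mathbb{E}_{q}\!\left[ \log q(z) \right]

% Local update for one factor, holding the others fixed:
\log q_j^{*}(z_j) = \mathbb{E}_{q_{-j}}\!\left[ \log p(x, z) \right] + \mathrm{const}
```

Because the joint of a graphical model factorizes over nodes, the expectation in the last line only involves the terms that contain z_j, which is what makes the update local to a node and its neighbours.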
TL;DR: A new approach to Bayesian inference is presented that entirely avoids Markov chain simulation, by constructing a map that pushes forward the prior measure to the posterior measure, and demonstrates the accuracy and efficiency of the approach on nonlinear inverse problems of varying dimension.
TL;DR: In this paper, the authors formalize the most general and compelling of the various criteria that have been suggested, together with a new criterion, and illustrate the potential of these criteria in determining objective model selection priors by considering their application to the problem of variable selection.
Abstract: In objective Bayesian model selection, no single criterion has emerged as dominant in defining objective prior distributions. Indeed, many criteria have been separately proposed and utilized to propose differing prior choices. We first formalize the most general and compelling of the various criteria that have been suggested, together with a new criterion. We then illustrate the potential of these criteria in determining objective model selection priors by considering their application to the problem of variable selection in normal linear models. This results in a new model selection objective prior with a number of compelling properties.
TL;DR: In this paper, a matrix factorization formulation and enforcing the low-rank constraint in the estimates as a sparsity constraint are used to determine the correct rank while providing high recovery performance.
Abstract: Recovery of low-rank matrices has recently seen significant activity in many areas of science and engineering, motivated by recent theoretical results for exact reconstruction guarantees and interesting practical applications. In this paper, we present novel recovery algorithms for estimating low-rank matrices in matrix completion and robust principal component analysis based on sparse Bayesian learning (SBL) principles. Starting from a matrix factorization formulation and enforcing the low-rank constraint in the estimates as a sparsity constraint, we develop an approach that is very effective in determining the correct rank while providing high recovery performance. We provide connections with existing methods in other similar problems and empirical results and comparisons with current state-of-the-art methods that illustrate the effectiveness of this approach.
TL;DR: Modifications of Bayesian model selection methods by imposing nonlocal prior densities on model parameters are proposed and it is demonstrated that these model selection procedures perform as well or better than commonly used penalized likelihood methods in a range of simulation settings.
Abstract: Standard assumptions incorporated into Bayesian model selection procedures result in procedures that are not competitive with commonly used penalized likelihood methods. We propose modifications of these methods by imposing nonlocal prior densities on model parameters. We show that the resulting model selection procedures are consistent in linear model settings when the number of possible covariates p is bounded by the number of observations n, a property that has not been extended to other model selection procedures. In addition to consistently identifying the true model, the proposed procedures provide accurate estimates of the posterior probability that each identified model is correct. Through simulation studies, we demonstrate that these model selection procedures perform as well or better than commonly used penalized likelihood methods in a range of simulation settings. Proofs of the primary theorems are provided in the Supplementary Material that is available online.
TL;DR: This tutorial explains the foundation of approximate Bayesian computation (ABC), an approach to Bayesian inference that does not require the specification of a likelihood function, and hence that can be used to estimate posterior distributions of parameters for simulation-based models.
TL;DR: In this article, the authors consider the multivariate normal mean model with sparse observations and find various combinations of priors on the number of nonzero coefficients and on these coefficients that give desirable performance.
Abstract: We consider full Bayesian inference in the multivariate normal mean model in the situation that the mean vector is sparse. The prior distribution on the vector of means is constructed hierarchically by first choosing a collection of nonzero means and next a prior on the nonzero values. We consider the posterior distribution in the frequentist set-up that the observations are generated according to a fixed mean vector, and are interested in the posterior distribution of the number of nonzero components and the contraction of the posterior distribution to the true mean vector. We find various combinations of priors on the number of nonzero coefficients and on these coefficients that give desirable performance. We also find priors that give suboptimal convergence, for instance, Gaussian priors on the nonzero coefficients. We illustrate the results by simulations.
TL;DR: The technique can handle noisy data, potentially from multiple sources, and fuse it into a robust common probabilistic representation of the robot’s surroundings, and provides inferences with associated variances into occluded regions and between sensor beams, even with relatively few observations.
Abstract: We introduce a new statistical modelling technique for building occupancy maps. The problem of mapping is addressed as a classification task where the robot's environment is classified into regions of occupancy and free space. This is obtained by employing a modified Gaussian process as a non-parametric Bayesian learning technique to exploit the fact that real-world environments inherently possess structure. This structure introduces dependencies between points on the map which are not accounted for by many common mapping techniques such as occupancy grids. Our approach is an 'anytime' algorithm that is capable of generating accurate representations of large environments at arbitrary resolutions to suit many applications. It also provides inferences with associated variances into occluded regions and between sensor beams, even with relatively few observations. Crucially, the technique can handle noisy data, potentially from multiple sources, and fuse it into a robust common probabilistic representation of the robot's surroundings. We demonstrate the benefits of our approach on simulated datasets with known ground truth and in outdoor urban environments.
TL;DR: This article describes the mechanics and rationale of four different approaches to the statistical testing of electrophysiological data: (1) the Neyman-Pearson approach, (2) the permutation-based approach, (3) the bootstrap-based approach, and (4) the Bayesian approach.
Abstract: This article describes the mechanics and rationale of four different approaches to the statistical testing of electrophysiological data: (1) the Neyman-Pearson approach, (2) the permutation-based approach, (3) the bootstrap-based approach, and (4) the Bayesian approach. These approaches are evaluated from the perspective of electrophysiological studies, which involve multivariate (i.e., spatiotemporal) observations in which source-level signals are picked up to a certain extent by all sensors. Besides formal statistical techniques, there are also techniques that do not involve probability calculations but are very useful in dealing with multivariate data (i.e., verification of data-based predictions, cross-validation, and localizers). Moreover, data-based decision making can also be informed by mechanistic evidence that is provided by the structure in the data.
TL;DR: An efficient ML DOA estimator based on a spatially overcomplete array output formulation that surpasses state-of-the-art methods largely in performance, especially in demanding scenarios such as low signal-to-noise ratio (SNR), limited snapshots and spatially adjacent signals.
Abstract: The computationally prohibitive multi-dimensional searching procedure greatly restricts the application of the maximum likelihood (ML) direction-of-arrival (DOA) estimation method in practical systems. In this paper, we propose an efficient ML DOA estimator based on a spatially overcomplete array output formulation. The new method first reconstructs the array output on a predefined spatial discrete grid under the sparsity constraint via sparse Bayesian learning (SBL), thus obtaining a spatial power spectrum estimate that also indicates the coarse locations of the sources. Then a refined 1-D searching procedure is introduced to estimate the signal directions one by one based on the reconstruction result. The new method is able to estimate the incident signal number simultaneously. Numerical results show that the proposed method surpasses state-of-the-art methods largely in performance, especially in demanding scenarios such as low signal-to-noise ratio (SNR), limited snapshots and spatially adjacent signals.
TL;DR: Bayesian model averaging is the coherent Bayesian way of combining multiple models only under certain restrictive assumptions; this work explores a general framework for Bayesian model combination (which differs from model averaging) in the context of classification.
Abstract: Bayesian model averaging linearly mixes the probabilistic predictions of multiple models, each weighted by its posterior probability. This is the coherent Bayesian way of combining multiple models only under certain restrictive assumptions, which we outline. We explore a general framework for Bayesian model combination (which differs from model averaging) in the context of classification. This framework explicitly models the relationship between each model’s output and the unknown true label. The framework does not require that the models be probabilistic (they can even be human assessors), that they share prior information or receive the same training data, or that they be independent in their errors. Finally, the Bayesian combiner does not need to believe any of the models is in fact correct. We test several variants of this classifier combination procedure starting from a classic statistical model proposed by Dawid and Skene (1979) and using MCMC to add more complex but important features to the model. Comparisons on several data sets to simpler methods like majority voting show that the Bayesian methods not only perform well but result in interpretable diagnostics on the data points and the models.
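The first sentence of this abstract describes plain Bayesian model averaging, which can be sketched as follows. This shows only the averaging baseline, not the paper's model combination framework; the log marginal likelihoods and per-model class probabilities are made-up numbers.

```python
import math

def posterior_model_probs(log_marginal_liks, priors):
    """p(M_k | D) proportional to p(D | M_k) p(M_k), computed stably in log space."""
    log_post = [l + math.log(p) for l, p in zip(log_marginal_liks, priors)]
    m = max(log_post)
    unnorm = [math.exp(lp - m) for lp in log_post]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def bma_predict(model_probs, class_probs_per_model):
    """Linear mixture of each model's predictive class probabilities."""
    n_classes = len(class_probs_per_model[0])
    return [
        sum(w * probs[c] for w, probs in zip(model_probs, class_probs_per_model))
        for c in range(n_classes)
    ]

# Three hypothetical models with equal prior probability.
weights = posterior_model_probs([-10.0, -11.0, -14.0], [1 / 3, 1 / 3, 1 / 3])
mixed = bma_predict(weights, [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]])
```

The weights depend exponentially on the log marginal likelihoods, so as data accumulate they tend to concentrate on a single model; this is one reason averaging behaves more like model selection than like a genuine combination of models.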
TL;DR: The experimental results show that the proposed algorithm outperforms many state-of-the-art algorithms and solves the inverse problem automatically, with the number of clusters and the size of each cluster unknown a priori.
TL;DR: It was found that SCEM-UA and AMALGAM produce results quicker than GLUE in terms of required number of simulations, but modellers should select the method which is most suitable for the system they are modelling, as GLUE requires the lowest modelling skills and is easy to implement.
TL;DR: This study failed to replicate previous findings in that subjects' accuracy was remarkably lower and visualizations exhibited no measurable benefit, but suggests that visualizations are more effective when the text is given without numerical values.
Abstract: People have difficulty understanding statistical information and are unaware of their wrong judgments, particularly in Bayesian reasoning. Psychology studies suggest that the way Bayesian problems are represented can impact comprehension, but few visual designs have been evaluated and only populations with a specific background have been involved. In this study, a textual and six visual representations for three classic problems were compared using a diverse subject pool through crowdsourcing. Visualizations included area-proportional Euler diagrams, glyph representations, and hybrid diagrams combining both. Our study failed to replicate previous findings in that subjects' accuracy was remarkably lower and visualizations exhibited no measurable benefit. A second experiment confirmed that simply adding a visualization to a textual Bayesian problem is of little help, even when the text refers to the visualization, but suggests that visualizations are more effective when the text is given without numerical values. We discuss our findings and the need for more such experiments to be carried out on heterogeneous populations of non-experts.
TL;DR: In this article, the explicit-duration Hierarchical Dirichlet Process Hidden Semi-Markov Model (HDP-HSMM) is proposed to learn non-geometric state durations.
Abstract: There is much interest in the Hierarchical Dirichlet Process Hidden Markov Model (HDP-HMM) as a natural Bayesian nonparametric extension of the ubiquitous Hidden Markov Model for learning from sequential and time-series data. However, in many settings the HDP-HMM's strict Markovian constraints are undesirable, particularly if we wish to learn or encode non-geometric state durations. We can extend the HDP-HMM to capture such structure by drawing upon explicit-duration semi-Markovianity, which has been developed mainly in the parametric frequentist setting, to allow construction of highly interpretable models that admit natural prior information on state durations.
In this paper we introduce the explicit-duration Hierarchical Dirichlet Process Hidden semi-Markov Model (HDP-HSMM) and develop sampling algorithms for efficient posterior inference. The methods we introduce also provide new methods for sampling inference in the finite Bayesian HSMM. Our modular Gibbs sampling methods can be embedded in samplers for larger hierarchical Bayesian models, adding semi-Markov chain modeling as another tool in the Bayesian inference toolbox. We demonstrate the utility of the HDP-HSMM and our inference methods on both synthetic and real experiments.
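To make the idea of explicit (non-geometric) state durations concrete, the following is a minimal generative sketch of a semi-Markov state sequence. It is not the HDP-HSMM or the authors' Gibbs sampler. The function name `sample_hsmm_states`, the transition matrix, and the uniform-integer duration distributions are hypothetical choices for illustration.

```python
import random

def sample_hsmm_states(n_steps, trans, duration_samplers, init_state=0, seed=0):
    """Sample a state sequence from an explicit-duration semi-Markov chain.

    trans:             row-stochastic matrix with zero diagonal, so that each
                       segment is followed by a different state
    duration_samplers: one callable per state, taking a random.Random and
                       returning a positive integer segment length
    """
    rng = random.Random(seed)
    states = []
    state = init_state
    while len(states) < n_steps:
        # Draw how long the chain stays in the current state. In a plain HMM
        # this length is implicitly geometric; here it can be any distribution.
        duration = duration_samplers[state](rng)
        states.extend([state] * duration)
        # Move to a new state according to the transition row.
        row = trans[state]
        state = rng.choices(range(len(row)), weights=row)[0]
    return states[:n_steps]


# Hypothetical example: three states with tightly concentrated durations.
trans = [
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [0.5, 0.5, 0.0],
]
duration_samplers = [
    lambda rng: rng.randint(4, 6),
    lambda rng: rng.randint(4, 6),
    lambda rng: rng.randint(4, 6),
]
sequence = sample_hsmm_states(60, trans, duration_samplers)
```

Every complete run in `sequence` lasts between 4 and 6 steps. A standard HMM cannot express this, because its self-transition probability forces run lengths to be geometrically distributed.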
TL;DR: In this article, an extension of the Lindley distribution is proposed that offers a more flexible model for lifetime data; its statistical properties, including the failure rate and mean residual lifetime, are explored.
Abstract: In this paper we introduce an extension of the Lindley distribution which offers a more flexible model for lifetime data. Several statistical properties of the distribution are explored, such as the density, (reversed) failure rate, (reversed) mean residual lifetime, moments, order statistics, Bonferroni and Lorenz curves. Maximum likelihood estimation and inference for a random sample from the distribution are investigated. A real data application illustrates the performance of the distribution.
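The abstract does not give the form of the extended distribution, so the sketch below covers only the standard one-parameter Lindley distribution that it builds on, as a reference point for the quantities named above (density, failure rate, mean residual lifetime). The function names are illustrative, and `theta` denotes the usual positive Lindley parameter.

```python
import math

def lindley_pdf(x, theta):
    """Density f(x) = theta^2 / (1 + theta) * (1 + x) * exp(-theta * x), x >= 0."""
    return theta ** 2 / (1 + theta) * (1 + x) * math.exp(-theta * x)

def lindley_survival(x, theta):
    """Survival function S(x) = (1 + theta + theta*x) / (1 + theta) * exp(-theta * x)."""
    return (1 + theta + theta * x) / (1 + theta) * math.exp(-theta * x)

def lindley_failure_rate(x, theta):
    """Failure (hazard) rate h(x) = f(x) / S(x), written in closed form."""
    return theta ** 2 * (1 + x) / (1 + theta + theta * x)

def lindley_mean_residual_life(x, theta):
    """Mean residual lifetime m(x) = E[X - x | X > x], in closed form."""
    return (2 + theta + theta * x) / (theta * (1 + theta + theta * x))
```

For the standard Lindley distribution the failure rate is increasing in `x` and the mean residual lifetime is decreasing, which is one reason a more flexible extension is of interest for lifetime data.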