Top 145 papers published on the topic of Bayesian inference in 1997

Showing papers on "Bayesian inference" published in 1997

Markov Chain Monte Carlo in Practice

Walter R. Gilks¹, Sylvia Richardson, David Spiegelhalter•Institutions (1)

01 Aug 1997-Technometrics

TL;DR: An edited volume on Markov chain Monte Carlo covering theory, implementation, convergence monitoring, and model determination, with applications including medical monitoring, nonlinear hierarchical models, disease mapping, image analysis, genetics, and radiocarbon dating.

Abstract:
INTRODUCING MARKOV CHAIN MONTE CARLO. Introduction; The Problem; Markov Chain Monte Carlo; Implementation; Discussion
HEPATITIS B: A CASE STUDY IN MCMC METHODS. Introduction; Hepatitis B Immunization; Modelling; Fitting a Model Using Gibbs Sampling; Model Elaboration; Conclusion
MARKOV CHAIN CONCEPTS RELATED TO SAMPLING ALGORITHMS. Markov Chains; Rates of Convergence; Estimation; The Gibbs Sampler and Metropolis-Hastings Algorithm
INTRODUCTION TO GENERAL STATE-SPACE MARKOV CHAIN THEORY. Introduction; Notation and Definitions; Irreducibility, Recurrence, and Convergence; Harris Recurrence; Mixing Rates and Central Limit Theorems; Regeneration; Discussion
FULL CONDITIONAL DISTRIBUTIONS. Introduction; Deriving Full Conditional Distributions; Sampling from Full Conditional Distributions; Discussion
STRATEGIES FOR IMPROVING MCMC. Introduction; Reparameterization; Random and Adaptive Direction Sampling; Modifying the Stationary Distribution; Methods Based on Continuous-Time Processes; Discussion
IMPLEMENTING MCMC. Introduction; Determining the Number of Iterations; Software and Implementation; Output Analysis; Generic Metropolis Algorithms; Discussion
INFERENCE AND MONITORING CONVERGENCE. Difficulties in Inference from Markov Chain Simulation; The Risk of Undiagnosed Slow Convergence; Multiple Sequences and Overdispersed Starting Points; Monitoring Convergence Using Simulation Output; Output Analysis for Inference; Output Analysis for Improving Efficiency
MODEL DETERMINATION USING SAMPLING-BASED METHODS. Introduction; Classical Approaches; The Bayesian Perspective and the Bayes Factor; Alternative Predictive Distributions; How to Use Predictive Distributions; Computational Issues; An Example; Discussion
HYPOTHESIS TESTING AND MODEL SELECTION. Introduction; Uses of Bayes Factors; Marginal Likelihood Estimation by Importance Sampling; Marginal Likelihood Estimation Using Maximum Likelihood; Application: How Many Components in a Mixture?; Discussion; Appendix: S-PLUS Code for the Laplace-Metropolis Estimator
MODEL CHECKING AND MODEL IMPROVEMENT. Introduction; Model Checking Using Posterior Predictive Simulation; Model Improvement via Expansion; Example: Hierarchical Mixture Modelling of Reaction Times
STOCHASTIC SEARCH VARIABLE SELECTION. Introduction; A Hierarchical Bayesian Model for Variable Selection; Searching the Posterior by Gibbs Sampling; Extensions; Constructing Stock Portfolios With SSVS; Discussion
BAYESIAN MODEL COMPARISON VIA JUMP DIFFUSIONS. Introduction; Model Choice; Jump-Diffusion Sampling; Mixture Deconvolution; Object Recognition; Variable Selection; Change-Point Identification; Conclusions
ESTIMATION AND OPTIMIZATION OF FUNCTIONS. Non-Bayesian Applications of MCMC; Monte Carlo Optimization; Monte Carlo Likelihood Analysis; Normalizing-Constant Families; Missing Data; Decision Theory; Which Sampling Distribution?; Importance Sampling; Discussion
STOCHASTIC EM: METHOD AND APPLICATION. Introduction; The EM Algorithm; The Stochastic EM Algorithm; Examples
GENERALIZED LINEAR MIXED MODELS. Introduction; Generalized Linear Models (GLMs); Bayesian Estimation of GLMs; Gibbs Sampling for GLMs; Generalized Linear Mixed Models (GLMMs); Specification of Random-Effect Distributions; Hyperpriors and the Estimation of Hyperparameters; Some Examples; Discussion
HIERARCHICAL LONGITUDINAL MODELLING. Introduction; Clinical Background; Model Detail and MCMC Implementation; Results; Summary and Discussion
MEDICAL MONITORING. Introduction; Modelling Medical Monitoring; Computing Posterior Distributions; Forecasting; Model Criticism; Illustrative Application; Discussion
MCMC FOR NONLINEAR HIERARCHICAL MODELS. Introduction; Implementing MCMC; Comparison of Strategies; A Case Study from Pharmacokinetics-Pharmacodynamics; Extensions and Discussion
BAYESIAN MAPPING OF DISEASE. Introduction; Hypotheses and Notation; Maximum Likelihood Estimation of Relative Risks; Hierarchical Bayesian Model of Relative Risks; Empirical Bayes Estimation of Relative Risks; Fully Bayesian Estimation of Relative Risks; Discussion
MCMC IN IMAGE ANALYSIS. Introduction; The Relevance of MCMC to Image Analysis; Image Models at Different Levels; Methodological Innovations in MCMC Stimulated by Imaging; Discussion
MEASUREMENT ERROR. Introduction; Conditional-Independence Modelling; Illustrative examples; Discussion
GIBBS SAMPLING METHODS IN GENETICS. Introduction; Standard Methods in Genetics; Gibbs Sampling Approaches; MCMC Maximum Likelihood; Application to a Family Study of Breast Cancer; Conclusions
MIXTURES OF DISTRIBUTIONS: INFERENCE AND ESTIMATION. Introduction; The Missing Data Structure; Gibbs Sampling Implementation; Convergence of the Algorithm; Testing for Mixtures; Infinite Mixtures and Other Extensions
AN ARCHAEOLOGICAL EXAMPLE: RADIOCARBON DATING. Introduction; Background to Radiocarbon Dating; Archaeological Problems and Questions; Illustrative Examples; Discussion
Index

8,444 citations

Journal Article•10.1080/01621459.1997.10473615•

Bayesian Model Averaging for Linear Regression Models

Adrian E. Raftery¹, David Madigan¹, Jennifer A. Hoeting²•Institutions (2)

University of Washington¹, Colorado State University²

01 Mar 1997-Journal of the American Statistical Association

TL;DR: In this paper, the authors consider the problem of accounting for model uncertainty in linear regression models and propose two alternative approaches: the Occam's window approach and the Markov chain Monte Carlo approach.

Abstract: We consider the problem of accounting for model uncertainty in linear regression models. Conditioning on a single selected model ignores model uncertainty, and thus leads to the underestimation of uncertainty when making inferences about quantities of interest. A Bayesian solution to this problem involves averaging over all possible models (i.e., combinations of predictors) when making inferences about quantities of interest. This approach is often not practical. In this article we offer two alternative approaches. First, we describe an ad hoc procedure, “Occam's window,” which indicates a small set of models over which a model average can be computed. Second, we describe a Markov chain Monte Carlo approach that directly approximates the exact solution. In the presence of model uncertainty, both of these model averaging procedures provide better predictive performance than any single model that might reasonably have been selected. In the extreme case where there are many candidate predictors but ...

2,036 citations
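
As a rough illustration of the model-averaging idea in the abstract above, the following Python sketch averages predictions over all subsets of predictors. It is not the authors' procedure: posterior model probabilities are approximated here with BIC under equal prior model probabilities (an assumption made for this sketch), and the `window` filter applies only the simplest Occam's-window-style rule, discarding models far less probable than the best one. The names `bic_for_subset`, `predict_one`, `model_average`, and `window` are hypothetical.

```python
import itertools
import numpy as np


def bic_for_subset(X, y, subset):
    """Fit ordinary least squares on the chosen columns (plus intercept); return (BIC, coefficients)."""
    n = len(y)
    design = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ coef) ** 2))
    k = design.shape[1]
    # Assumes rss > 0, i.e. the data are not fitted exactly.
    return n * np.log(rss / n) + k * np.log(n), coef


def predict_one(coef, subset, x_new):
    """Prediction of one fitted model at a single new point."""
    idx = np.array(subset, dtype=int)
    return coef[0] + float(np.dot(coef[1:], x_new[idx]))


def model_average(X, y, x_new, window=20.0):
    """Average predictions over predictor subsets, weighted by approximate posterior probability."""
    p = X.shape[1]
    models = []
    for r in range(p + 1):
        for subset in itertools.combinations(range(p), r):
            bic, coef = bic_for_subset(X, y, subset)
            models.append((subset, bic, coef))
    bics = np.array([m[1] for m in models])
    # exp(-BIC/2) is used as a stand-in for the marginal likelihood (assumption of this sketch).
    rel = np.exp(-0.5 * (bics - bics.min()))
    # Simplified window rule: drop models more than `window` times less probable than the best.
    weights = np.where(rel >= 1.0 / window, rel, 0.0)
    weights = weights / weights.sum()
    preds = np.array([predict_one(coef, subset, x_new) for subset, _, coef in models])
    return models, weights, float(weights @ preds)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X_demo = rng.normal(size=(100, 3))
    y_demo = 0.5 + 1.5 * X_demo[:, 1] + rng.normal(scale=0.7, size=100)
    _, w_demo, pred_demo = model_average(X_demo, y_demo, np.array([0.0, 1.0, 0.0]))
    print("number of models kept:", int(np.count_nonzero(w_demo)))
    print("averaged prediction:", pred_demo)
```

Enumerating all subsets is only feasible for a handful of predictors, which is the practical difficulty the abstract points to.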

Proceedings Article•10.1109/ICNN.1997.614194•

Gauss-Newton approximation to Bayesian learning

F. Dan Foresee¹, Martin T. Hagan²•Institutions (2)

Alcatel-Lucent¹, Oklahoma State University–Stillwater²

9 Jun 1997

TL;DR: The application of Bayesian regularization to the training of feedforward neural networks is described, using a Gauss-Newton approximation to the Hessian matrix to reduce the computational overhead.

Abstract: This paper describes the application of Bayesian regularization to the training of feedforward neural networks. A Gauss-Newton approximation to the Hessian matrix, which can be conveniently implemented within the framework of the Levenberg-Marquardt algorithm, is used to reduce the computational overhead. The resulting algorithm is demonstrated on a simple test problem and is then applied to three practical problems. The results demonstrate that the algorithm produces networks which have excellent generalization capabilities.

1,580 citations

Journal Article•10.1214/AOS/1034276631•

Bayesian inference for causal effects in randomized experiments with noncompliance

Guido W. Imbens, Donald B. Rubin

01 Feb 1997-Annals of Statistics

TL;DR: In this article, Bayesian inferential methods for causal estimands in the presence of noncompliance are presented, where the binary treatment assignment is random and hence ignorable, but the treatment received is not ignorable.

Abstract: For most of this century, randomization has been a cornerstone of scientific experimentation, especially when dealing with humans as experimental units. In practice, however, noncompliance is relatively common with human subjects, complicating traditional theories of inference that require adherence to the random treatment assignment. In this paper we present Bayesian inferential methods for causal estimands in the presence of noncompliance, when the binary treatment assignment is random and hence ignorable, but the binary treatment received is not ignorable. We assume that both the treatment assigned and the treatment received are observed. We describe posterior estimation using EM and data augmentation algorithms. Also, we investigate the role of two assumptions often made in econometric instrumental variables analyses, the exclusion restriction and the monotonicity assumption, without which the likelihood functions generally have substantial regions of maxima. We apply our procedures to real and artificial data, thereby demonstrating the technology and showing that our new methods can yield valid inferences that differ in practically important ways from those based on previous methods for analysis in the presence of noncompliance, including intention-to-treat analyses and analyses based on econometric instrumental variables techniques. Finally, we perform a simulation to investigate the operating characteristics of the competing procedures in a simple setting, which indicates relatively dramatic improvements in frequency operating characteristics attainable using our Bayesian procedures.

605 citations

Journal Article•10.1080/01621459.1997.10474044•

Practical Bayesian Density Estimation Using Mixtures of Normals

Kathryn Roeder¹, Larry Wasserman¹•Institutions (1)

Carnegie Mellon University¹

01 Sep 1997-Journal of the American Statistical Association

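TL;DR: This paper presents practical methods for coping with three difficulties in Bayesian density estimation with mixtures of normals: standard reference priors yield improper posteriors, the posterior for the number of components is not well defined under the reference prior, and posterior simulation does not provide a direct estimate of the posterior for the number of components.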

Abstract: Mixtures of normals provide a flexible model for estimating densities in a Bayesian framework. There are some difficulties with this model, however. First, standard reference priors yield improper posteriors. Second, the posterior for the number of components in the mixture is not well defined (if the reference prior is used). Third, posterior simulation does not provide a direct estimate of the posterior for the number of components. We present some practical methods for coping with these problems. Finally, we give some results on the consistency of the method when the maximum number of components is allowed to grow with the sample size.

590 citations

Journal Article•10.1023/A:1007327622663•

A Bayesian/Information Theoretic Model of Learning to Learn via Multiple Task Sampling

Jonathan Baxter¹•Institutions (1)

London School of Economics and Political Science¹

01 Jul 1997-Machine Learning

TL;DR: It is argued that for many common machine learning problems, although in general the authors do not know the true (objective) prior for the problem, they do have some idea of a set of possible priors to which the true prior belongs.

Abstract: A Bayesian model of learning to learn by sampling from multiple tasks is presented. The multiple tasks are themselves generated by sampling from a distribution over an environment of related tasks. Such an environment is shown to be naturally modelled within a Bayesian context by the concept of an objective prior distribution. It is argued that for many common machine learning problems, although in general we do not know the true (objective) prior for the problem, we do have some idea of a set of possible priors to which the true prior belongs. It is shown that under these circumstances a learner can use Bayesian inference to learn the true prior by learning sufficiently many tasks from the environment. In addition, bounds are given on the amount of information required to learn a task when it is simultaneously learnt with several other tasks. The bounds show that if the learner has little knowledge of the true prior, but the dimensionality of the true prior is small, then sampling multiple tasks is highly advantageous. The theory is applied to the problem of learning a common feature set or equivalently a low-dimensional-representation (LDR) for an environment of related tasks.

568 citations

Journal Article•10.1111/1467-9876.00082•

Bayesian Model Averaging in Proportional Hazard Models: Assessing the Risk of a Stroke

Chris Volinsky¹, David Madigan¹, Adrian E. Raftery¹, Richard A. Kronmal¹•Institutions (1)

University of Washington¹

01 Jan 1997-Journal of The Royal Statistical Society Series C-applied Statistics

TL;DR: For the Cardiovascular Health Study, Bayesian model averaging predictively outperforms standard model selection and does a better job of assessing who is at high risk for a stroke.

Abstract: In the context of the Cardiovascular Health Study, a comprehensive investigation into the risk factors for strokes, we apply Bayesian model averaging to the selection of variables in Cox proportional hazard models. We use an extension of the leaps-and-bounds algorithm for locating the models that are to be averaged over and make available S-PLUS software to implement the methods. Bayesian model averaging provides a posterior probability that each variable belongs in the model, a more directly interpretable measure of variable importance than a P-value. P-values from models preferred by stepwise methods tend to overstate the evidence for the predictive value of a variable and do not account for model uncertainty. We introduce the partial predictive score to evaluate predictive performance. For the Cardiovascular Health Study, Bayesian model averaging predictively outperforms standard model selection and does a better job of assessing who is at high risk for a stroke.

274 citations

Journal Article•10.1016/S0304-4076(97)88050-5•

On the use of panel data in stochastic frontier models with improper priors

Carmen Fernandez¹, Jacek Osiewalski, Mark F. J. Steel¹•Institutions (1)

Tilburg University¹

01 Jul 1997-Journal of Econometrics

TL;DR: In this article, a Bayesian analysis of the stochastic frontier model with composed error is presented, and the existence of the posterior distribution and posterior moments is examined under a commonly used class of (partly) noninformative prior distributions.

173 citations

Journal Article•10.1016/S1364-8152(97)00008-X•

Bayesian decision analysis for environmental and resource management

Olli Varis¹•Institutions (1)

Helsinki University of Technology¹

01 Jan 1997-Environmental Modelling and Software

TL;DR: This paper documents and discusses experience on the use of two recent network model approaches, influence diagrams and belief networks, and relates those approaches to decision trees.

Abstract: During the last two decades, many of the theoretical and practical advances in Bayesian decision analysis have been closely linked to the adaptation of emerging computational techniques, usually from Artificial Intelligence, and to progress in computer software, respectively. This paper documents and discusses experience on the use of two recent network model approaches, influence diagrams and belief networks, and relates those approaches to decision trees. They both allow probabilistic, Bayesian studies with classical decision analytic concepts such as risk attitude analysis, value of information and control, multi-attribute analysis, and various structural analyses. The theory of influence diagrams dates back to the early 1980s, and a variety of commercial software is on the market. The belief network is a more recent concept that is in the process of finding its way into applications. The approaches are illustrated with examples from freshwater and fisheries studies in environmental and resource management.

129 citations

Journal Article•10.1109/89.554778•

On-line adaptive learning of the continuous density hidden Markov model based on approximate recursive Bayes estimate

Qiang Huo, Chin-Hui Lee

01 Mar 1997-IEEE Transactions on Speech and Audio Processing

TL;DR: A framework of quasi-Bayes (QB) learning of the parameters of the continuous density hidden Markov model (CDHMM) with Gaussian mixture state observation densities is presented, with a simple forgetting mechanism that adjusts the contribution of previously observed sample utterances.

Abstract: We present a framework of quasi-Bayes (QB) learning of the parameters of the continuous density hidden Markov model (CDHMM) with Gaussian mixture state observation densities. The QB formulation is based on the theory of recursive Bayesian inference. The QB algorithm is designed to incrementally update the hyperparameters of the approximate posterior distribution and the CDHMM parameters simultaneously. By further introducing a simple forgetting mechanism to adjust the contribution of previously observed sample utterances, the algorithm is adaptive in nature and capable of performing an online adaptive learning using only the current sample utterance. It can, thus, be used to cope with the time-varying nature of some acoustic and environmental variabilities, including mismatches caused by changing speakers, channels, and transducers. As an example, the QB learning framework is applied to on-line speaker adaptation and its viability is confirmed in a series of comparative experiments using a 26-letter English alphabet vocabulary.

125 citations

Proceedings Article•

Why does bagging work? a Bayesian account and its implications

Pedro Domingos¹•Institutions (1)

University of California, Irvine¹

14 Aug 1997

TL;DR: This paper empirically tests two alternative explanations for why bagging works: it is an approximation to the optimal procedure of Bayesian model averaging, with an appropriate implicit prior, or it effectively shifts the prior to a more appropriate region of model space.

Abstract: The error rate of decision-tree and other classification learners can often be much reduced by bagging: learning multiple models from bootstrap samples of the database, and combining them by uniform voting. In this paper we empirically test two alternative explanations for this, both based on Bayesian learning theory: (1) bagging works because it is an approximation to the optimal procedure of Bayesian model averaging, with an appropriate implicit prior; (2) bagging works because it effectively shifts the prior to a more appropriate region of model space. All the experimental evidence contradicts the first hypothesis, and confirms the second.
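
The bagging procedure described in the abstract above (learn multiple models from bootstrap samples, then combine them by uniform voting) can be sketched in a few lines of Python. This is a minimal sketch, not the paper's experimental setup: the base learner is a one-split decision stump chosen here for brevity, labels are assumed to be 0/1, and all function names are hypothetical.

```python
import numpy as np


def fit_stump(X, y):
    """Pick the (feature, threshold, left_label) with the lowest training error; labels are 0/1."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            left = X[:, j] <= t
            for left_label in (0, 1):
                pred = np.where(left, left_label, 1 - left_label)
                err = np.mean(pred != y)
                if best is None or err < best[0]:
                    best = (err, j, t, left_label)
    _, j, t, left_label = best
    return j, t, left_label


def predict_stump(stump, X):
    j, t, left_label = stump
    return np.where(X[:, j] <= t, left_label, 1 - left_label)


def bagging_fit(X, y, n_models=25, seed=0):
    """Train one stump per bootstrap sample (drawn with replacement, same size as the data)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    stumps = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)
        stumps.append(fit_stump(X[idx], y[idx]))
    return stumps


def bagging_predict(stumps, X):
    """Uniform voting: every model gets the same weight; the majority label wins."""
    votes = np.mean([predict_stump(s, X) for s in stumps], axis=0)
    return (votes >= 0.5).astype(int)
```

An odd `n_models` avoids tied votes in the two-class case.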

Journal Article•10.1111/1467-9884.00084•

Ranking and selecting motor vehicle accident sites by using a hierarchical Bayesian model

Philip J. Schluter¹, J J Deely¹, Alan Nicholson¹•Institutions (1)

University of Canterbury¹

01 Sep 1997-The Statistician

TL;DR: A hierarchical Bayesian approach is used that incorporates expert knowledge about accident sites as a group believed a priori to be exchangeable, the Poisson assumption and a conjugate gamma prior; three natural strategies are proposed for ranking and selecting the most hazardous subgroup of accident locations.

Abstract: Identification, ranking and selecting hazardous traffic accident locations from a group under consideration is a fundamental goal for traffic safety researchers. Few methods exist that can quantitatively, accurately and easily discriminate between sites that commonly have small and variable observation count periods. One method that embodies all these advantages is the hierarchical Bayesian model, the method proposed in this paper. The particular hierarchical Bayesian approach that we use incorporates expert knowledge about accident sites as a group believed a priori to be exchangeable, the Poisson assumption and a conjugate gamma prior. We then propose three natural strategies for ranking and selecting the most hazardous subgroup of accident locations. Also presented is an especially useful procedure that gives the probability that each particular site is worst and by how much it is worst. All proposed strategies are illustrated using previously published fatality accident data from 35 sites in Auckland, New Zealand.
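
To make the Poisson assumption with a conjugate gamma prior concrete, here is a minimal Python sketch of the conjugate update and of a Monte Carlo estimate of the probability that each site is worst. It is a simplification, not the paper's model: the gamma hyperparameters `a` (shape) and `b` (rate) are fixed by hand here rather than informed by expert knowledge or a further hierarchy level, and the function names and the example counts in the usage note are hypothetical.

```python
import numpy as np


def posterior_params(counts, exposure, a, b):
    """Gamma(a, b) prior on each site's rate with Poisson counts gives Gamma(a + count, b + exposure)."""
    counts = np.asarray(counts, dtype=float)
    exposure = np.asarray(exposure, dtype=float)
    return a + counts, b + exposure


def prob_worst(counts, exposure, a=1.0, b=1.0, n_draws=20000, seed=0):
    """Estimate, by posterior simulation, the probability that each site has the highest rate."""
    shape, rate = posterior_params(counts, exposure, a, b)
    rng = np.random.default_rng(seed)
    # NumPy parameterises the gamma by shape and scale, so scale = 1 / rate.
    draws = rng.gamma(shape, 1.0 / rate, size=(n_draws, len(shape)))
    worst = np.argmax(draws, axis=1)
    return np.bincount(worst, minlength=len(shape)) / n_draws
```

For example, `prob_worst([2, 10, 3], [5, 5, 5])` returns one probability per site, and sorting the sites by these values gives one possible ranking.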

Journal Article•10.1111/1467-9469.00062•

Inference and Missing Data: Asymptotic Results

Søren Feodor Nielsen¹•Institutions (1)

University of Copenhagen¹

01 Jun 1997-Scandinavian Journal of Statistics

TL;DR: In this paper, it is shown that (a slightly strengthened version of) the MAR condition is sufficient to yield ordinary large sample results for estimators and test statistics and thus may be used for (asymptotic) frequentist inference.

Abstract: In Rubin (1976) the missing at random (MAR) and missing completely at random (MCAR) conditions are discussed. It is concluded that the MAR condition allows one to ignore the missing data mechanism when doing likelihood or Bayesian inference but also that the stronger MCAR condition is in some sense the weakest generally sufficient condition allowing (conditional) frequentist inference while ignoring the missing data mechanism. In this paper it is shown that (a slightly strengthened version of) the MAR condition is sufficient to yield ordinary large sample results for estimators and test statistics and thus may be used for (asymptotic) frequentist inference.

Book Chapter•10.1007/978-3-7091-2670-7_37•

Towards a Bayesian Model for Keyhole Plan Recognition in Large Domains

David W. Albrecht¹, Ingrid Zukerman¹, Ann E. Nicholson¹, Ariel E. Bud¹•Institutions (1)

Monash University¹

1 Jan 1997

TL;DR: An approach to keyhole plan recognition is presented that uses a Dynamic Belief Network to represent the features of the domain needed to identify users’ plans and goals; experimental results indicate that this approach will work in other domains with similar features.

Abstract: We present an approach to keyhole plan recognition which uses a Dynamic Belief Network to represent features of the domain that are needed to identify users’ plans and goals. The structure of this network was determined from analysis of the domain. The conditional probability distributions are learned during a training phase, which dynamically builds these probabilities from observations of user behaviour. This approach allows the use of incomplete, sparse and noisy data during both training and testing. We present experimental results of the application of our system to a Multi-User Dungeon adventure game with thousands of possible actions and positions. These results show a high degree of predictive accuracy and indicate that this approach will work in other domains with similar features.

Journal Article•10.1006/GMIP.1997.0443•

On digital mammogram segmentation and microcalcification detection using multiresolution wavelet analysis

C. H. Chen¹, Gwo Giun Lee¹•Institutions (1)

University of Massachusetts Dartmouth¹

01 Nov 1997-Graphical Models and Image Processing

TL;DR: A multiresolution wavelet analysis (MWA) and nonstationary Gaussian Markov random field (GMRF) technique is introduced for the detection of microcalcifications with high accuracy and the approach has been tested with a number of mammographic images.

Proceedings Article•10.1109/ICSMC.1997.635349•

Automatic document classification based on probabilistic reasoning: model and performance analysis

Wai Lam¹, Kon-Fan Low•Institutions (1)

The Chinese University of Hong Kong¹

12 Oct 1997

TL;DR: A new approach to text classification based on automatic feature extraction and probabilistic reasoning is developed, in which a Bayesian network text classifier is automatically constructed from a set of training text documents.

Abstract: We develop a new approach to text classification based on automatic feature extraction and probabilistic reasoning. The knowledge representation used to perform this task is known as Bayesian inference networks. A Bayesian network text classifier is automatically constructed from a set of training text documents. We have conducted a series of experiments on two text document corpora, namely CACM and Reuters, to analyze the performance of our approach; these experiments are described in the paper.

Journal Article•

Using Bayesian statistics, Thematic Mapper satellite imagery, and breeding bird survey data to model bird species probability of occurrence in Maine

Jeffrey A. Hepinstall, Steven A. Sader

01 Jan 1997-Photogrammetric Engineering and Remote Sensing

TL;DR: In this paper, a Bayesian modeling technique was used to predict probability of occurrence for 14 species of Maine land birds using spectral data from the Landsat Thematic Mapper bands 4 and 5.

Abstract: A Bayesian modeling technique was used to predict probability of occurrence for 14 species of Maine land birds. The relationships between bird species survey data and the spectral values of Landsat Thematic Mapper bands 4 and 5 as well as a derived texture measure were used to build conditional probabilities for input into Bayes' Theorem. The conditional probabilities form decision rules for reclassifying the input spectral data into probability of occurrence estimates with associated estimates of error inherent in the model prediction. This methodology removed the costly and time-consuming step of creating a habitat map before modeling species occurrence. The output resolution of the species predictions is not degraded from the original 30-m TM pixel size to the coarse resolution of the wildlife survey data. Model results can be compared to results from other habitat modeling techniques and used by natural resource managers to predict the effects of land-use changes on available habitat.
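
The step of feeding conditional probabilities into Bayes' Theorem can be illustrated with a small Python function. This is only a sketch under assumptions not stated in the abstract: occurrence is treated as a binary present/absent event, each pixel falls into a single discrete spectral class, and the prior and likelihood values in the usage note are invented for illustration. The function name is hypothetical.

```python
def occurrence_posterior(prior_present, lik_present, lik_absent):
    """P(species present | spectral class) from Bayes' theorem.

    prior_present: P(present) before looking at the pixel
    lik_present:   P(spectral class | present)
    lik_absent:    P(spectral class | absent)
    """
    numerator = lik_present * prior_present
    denominator = numerator + lik_absent * (1.0 - prior_present)
    if denominator == 0.0:
        raise ValueError("spectral class has zero probability under both hypotheses")
    return numerator / denominator
```

With a prior of 0.3 and a spectral class three times as likely where the species is present (0.6 versus 0.2), the posterior probability of occurrence rises to 0.5625.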

Journal Article•10.1080/00031305.1997.10473967•

On the Efficacy of Bayesian Inference for Nonidentifiable Models

Andrew A. Neath¹, Francisco J. Samaniego²•Institutions (2)

Southern Illinois University Edwardsville¹, University of California, Davis²

01 Aug 1997-The American Statistician

TL;DR: In this paper, the question of whether, and when, the Bayesian approach produces worthwhile answers is investigated conditionally, given the information provided by the experiment, and an important initial insight on the matter is that posterior estimates of a non-identifiable parameter can actually be inferior to the prior (no-data) estimate of that parameter, even as the sample size grows to infinity.

Abstract: Although classical statistical methods are inapplicable in point estimation problems involving nonidentifiable parameters, a Bayesian analysis using proper priors can produce a closed form, interpretable point estimate in such problems. The question of whether, and when, the Bayesian approach produces worthwhile answers is investigated. In contrast to the preposterior analysis of this question offered by Kadane, we examine the question conditionally, given the information provided by the experiment. An important initial insight on the matter is that posterior estimates of a nonidentifiable parameter can actually be inferior to the prior (no-data) estimate of that parameter, even as the sample size grows to infinity. In general, our goal is to characterize, within the space of prior distributions, classes of priors that lead to posterior estimates that are superior, in some reasonable sense, to one's prior estimate. This goal is shown to be feasible through a detailed examination of a particular t...

Journal Article•10.1088/0266-5611/13/2/009•

Resolution of seismic waveform inversion: Bayes versus Occam

Wences Gouveia, John A. Scales

01 Apr 1997-Inverse Problems

TL;DR: In this article, Bayesian inference and Occam's inversion are both applied to the problem of inferring the Earth's subsurface elastic properties from reflection seismic data; the resulting subsurface models are quite similar, but the associated error estimates are rather different, reflecting the role of the a priori information in the inverse calculation.

Abstract: In Bayesian inference, probabilistic information about models is posited a priori. This information, which may very well include features in the null space of the forward problem, affects both the computed models and the resulting resolution estimates. In Occam's inversion, on the other hand, the goal is to construct the smoothest model consistent with the data. This is not to say that one believes a priori that models are really smooth, but rather that a more conservative interpretation of the data ought to be made by eliminating features of the model that are not required to fit the data. The length scale associated with the smoothing is an indirect measure of resolution. In some cases the mathematical machinery of Bayesian inference resembles that of Occam's inversion, but the goals and interpretations of the two methods are rather different. To understand better the similarities and differences of these two approaches, we show an application of both methods to the problem of inferring the Earth's subsurface elastic properties from reflection seismic data. On the one hand, by deriving a priori information about the Earth's layering from fine-scale borehole measurements, coupled with information about the noise in the data and the elastic forward modelling operator, we are able to compute the Bayesian a posteriori probability distribution on the space of models. Models pseudo-randomly simulated from this a posteriori probability will exhibit features that are implied by the a priori information as well as the data, even if the former are not well resolved by the data. Then we solve the Occam's inversion problem by determining the maximum model smoothness that allows for the data to be fit, without incorporating a priori information about the models. In this case we estimate the resolution in terms of the degree of model smoothness implied by the data. 
The main conclusions for the numerical experiments considered in this work are that the subsurface models derived from both techniques are quite similar but error estimates associated with such models are rather different, reflecting the role of the a priori information in the inverse calculation.

Journal Article•10.1016/S0167-8655(97)00117-7•

Multiple graph matching with Bayesian inference

Mark L. Williams¹, Richard C. Wilson², Edwin R. Hancock²•Institutions (2)

Defence Research Agency¹, University of York²

01 Nov 1997-Pattern Recognition Letters

TL;DR: The Bayesian framework is used to construct an inference matrix which can be used to gauge the mutual consistency of multiple graph-matches and is realised as an iterative discrete relaxation process which aims to maximise the elements of the inference matrix.

Book Chapter•10.1214/LNMS/1215454142•

Bayes factors for intrinsic and fractional priors in nested models. Bayesian robustness

Elías Moreno

1 Jan 1997

Journal Article•10.1080/00031305.1997.10473972•

Bayes for Beginners? Some Reasons to Hesitate

David S. Moore¹•Institutions (1)

Purdue University¹

01 Aug 1997-The American Statistician

TL;DR: It is argued that it is, at best, premature to teach the ideas and methods of Bayesian inference in a first statistics course for general students and might well impede the trend toward experience with real data and a better balance among data analysis, data production, and inference in first statistics courses.

Abstract: Is it reasonable to teach the ideas and methods of Bayesian inference in a first statistics course for general students? This paper argues that it is, at best, premature to do so. Surveys of the statistical methods actually in use suggest that Bayesian techniques are little used. Moreover, Bayesians have not yet agreed on standard approaches to standard problem settings. Bayesian reasoning requires a grasp of conditional probability, a concept confusing to beginners. Finally, an emphasis on Bayesian inference might well impede the trend toward experience with real data and a better balance among data analysis, data production, and inference in first statistics courses.

Journal Article•10.1049/IP-VIS:19971093•

Model Order Selection For The Singular Value Decomposition And The Discrete Karhunen-Loeve Transform Using A Bayesian Approach

J.J. Rajan¹, P.J.W. Rayner¹•Institutions (1)

University of Cambridge¹

1 Apr 1997

TL;DR: Bayesian model order selection is considered in relation to the singular value decomposition (SVD) and the discrete Karhunen-Loeve transform (DKLT) and results that illustrate the usefulness of the method are included.

Abstract: Bayesian model order selection is considered in relation to the singular value decomposition (SVD) and the discrete Karhunen–Loeve transform (DKLT). There are many applications of the SVD and DKLT where it is necessary to discard some of the small singular values that may represent corrupted signal information. Often this task is performed heuristically or in an ad hoc manner. The Bayesian approach to model order selection involves the determination of the evidence or the conditional posterior probability of the model structure given the data; this framework allows the relative probabilities of all possible candidate models to be compared explicitly. Applied to the SVD, the evidence formulation enables the number of nonzero singular values (and hence the effective rank) of a singular or ill-conditioned matrix to be determined analytically. For the DKLT, the evidence allows the determination of the optimal number of basis vectors to choose for the signal reconstruction. In addition, the Bayesian method allows prior information such as physical smoothness constraints to be incorporated directly into the problem specification. Derivations of the evidence formulae are included along with results that illustrate the usefulness of the method.

Automatic target recognition using high-resolution radar range-profiles

Joseph A. O'Sullivan, Steven P. Jacobs

1 Jan 1997

TL;DR: Simulations are presented in which each of the sensor models is combined with a constant orientation rate model for the target dynamics to produce an algorithm for joint tracking and recognition using HRR data, and the algorithm using the conditionally Gaussian model achieves superior performance while requiring significantly less memory.

Abstract: Recognition of aircraft and ground targets from high resolution radar (HRR) range-profiles is a notoriously difficult problem, due in large part to the extreme variability in the data for small changes in target orientation. To achieve recognition in the presence of this variability, the problem is posed in the context of joint tracking and recognition of a target using a sequence of observed HRR range-profiles. The likelihood function for the scene configuration combines a dynamics-based prior on the sequence of target orientations with a likelihood for range-profiles given the target orientation. The recognition system performs joint Bayesian inference on the target type parameter and the sequence of target orientations at the observation times. Successful recognition is critically dependent on an appropriate model for the HRR range-profiles. A deterministic model and a conditionally Gaussian model are introduced, and the likelihood functions under each model for varying orientations and target types are compared. The comparison is extended to include both aircraft and ground targets, different radar frequency bands, and full polarimetric range-profiles. The results of these comparisons show better performance for the conditionally Gaussian model in terms of the potential for correct recognition when the orientation estimate has significant error. Fundamental limits on the performance of estimators of target orientation are obtained for the two models in terms of a Hilbert-Schmidt lower bound on the expected errors. The bound is evaluated for each model using simulated data as a function of the intensity of the noise in the observations. This analysis provides a specific criterion for model selection for this problem. Simulations are presented in which each of the sensor models is combined with a constant orientation rate model for the target dynamics to produce an algorithm for joint tracking and recognition using HRR data. 
Results from the simulations show the performance of the algorithm in the presence of additive noise, including the expected angular estimation error and the probability of correct recognition. The algorithm using the conditionally Gaussian model achieves superior performance while requiring significantly less memory.

Proceedings Article•10.1145/267460.267507•

Distributed cooperative Bayesian learning strategies

Kenji Yamanishi¹•Institutions (1)

NEC¹

1 Jul 1997

TL;DR: The effectiveness of DCB is demonstrated in the sense that for some probability models, it performs approximately as well as the non-distributed optimal Bayesian strategy for polynomial agent size and sample size, achieving a significant speed-up of learning.

Abstract: This paper addresses the issue of designing an effective distributed learning system in which a number of agent learners estimate the parameter specifying the target probability density in parallel and the population learner (for short, the p-learner) combines their outputs to obtain a significantly better estimate. Such a system is important in speeding up learning. We propose as distributed learning systems two types of the distributed cooperative Bayesian learning strategies (DCB), in which each agent learner or the p-learner employs a probabilistic version of the Gibbs algorithm. We analyze DCBs by giving upper bounds on their average logarithmic losses for predicting probabilities of unseen data as functions of the sample size and the population size. We thereby demonstrate the effectiveness of DCBs by showing that for some probability models, they work approximately (or sometimes exactly) as well as the nondistributed optimal Bayesian strategy, achieving a significant speed-up of learning over it. We also consider the case where the hypothesis class of probability densities is hierarchically parameterized, and there is a feedback of information from the p-learner to agent learners. In this case we propose another type of DCB based on the Markov chain Monte Carlo method, which we abbreviate as HDCB, and characterize its average prediction loss in terms of the number of feedback iterations as well as the population size and the sample size. We thereby demonstrate that for the class of hierarchical Gaussian distributions HDCB works approximately as well as the nondistributed optimal Bayesian strategy, achieving a significant speed-up of learning over it.

Journal Article•10.2307/3315344•

Alternative Bayes factors for model selection

Fulvio De Santis¹, Fulvio Spezzaferri¹•Institutions (1)

Sapienza University of Rome¹

01 Dec 1997-Canadian Journal of Statistics-revue Canadienne De Statistique

TL;DR: In this article, the authors compare several alternative Bayes factors for the problem of testing the point null hypothesis in the univariate normal model, using usual classes of priors, and conclude that the fractional Bayes factor, originally introduced to cope with improper priors, is also useful in robust analysis.

Abstract: Several alternative Bayes factors have been recently proposed in order to solve the problem of the extreme sensitivity of the Bayes factor to the priors of models under comparison. Specifically, the impossibility of using the Bayes factor with standard noninformative priors for model comparison has led to the introduction of new automatic criteria, such as the posterior Bayes factor (Aitkin 1991), the intrinsic Bayes factors (Berger and Pericchi 1996b) and the fractional Bayes factor (O'Hagan 1995). We derive some interesting properties of the fractional Bayes factor that provide justifications for its use additional to the ones given by O'Hagan. We further argue that the use of the fractional Bayes factor, originally introduced to cope with improper priors, is also useful in a robust analysis. Finally, using usual classes of priors, we compare several alternative Bayes factors for the problem of testing the point null hypothesis in the univariate normal model.
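
For orientation, the fractional Bayes factor named in this abstract can be written out as follows. This is the standard definition from O'Hagan (1995) as commonly stated, not a formula taken from the abstract itself; the symbols $\pi_i$, $f_i$, $\theta_i$, $x$ and $b$ are notation assumed here for the prior, likelihood, parameter, data and training fraction of model $i$.

```latex
% Fractional Bayes factor of model 1 against model 2, with training fraction 0 < b < 1
B^{F}_{b}(x) = \frac{q_1(b, x)}{q_2(b, x)},
\qquad
q_i(b, x) =
\frac{\int \pi_i(\theta_i)\, f_i(x \mid \theta_i)\, d\theta_i}
     {\int \pi_i(\theta_i)\, f_i(x \mid \theta_i)^{b}\, d\theta_i},
\quad i = 1, 2.
```

If $\pi_i$ is improper and known only up to a constant $c_i$, that constant appears in both the numerator and the denominator of $q_i(b, x)$ and cancels, which is why the criterion stays well defined with standard noninformative priors.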

Journal Article•10.1002/(SICI)1099-095X(199703)8:2<107::AID-ENV241>3.0.CO;2-E•

Hierarchical models for mapping Ohio lung cancer rates

Hong Xia¹, Bradley P. Carlin¹, Lance A. Waller¹•Institutions (1)

University of Minnesota¹

01 Mar 1997-Environmetrics

TL;DR: This study applies a Bayesian hierarchical modelling approach, fitted by a Markov chain Monte Carlo computational method, to obtain the joint posterior distribution of the model parameters, and uncovers evidence of changing spatial structure in the rates over the 21-year period 1968-1988, suggesting a spatio-temporal hierarchical model as a new possibility.

Abstract: The mapping of geographical variation in disease occurrence plays an important role in assessing environmental justice (i.e. the equitable sharing of adverse effects of pollution across socio-demographic subpopulations). Bayes and empirical Bayes methods can be used to obtain stable small-area estimates while retaining geographic and demographic resolution. In this study, we focus on modelling spatial patterns of disease rates, incorporating demographic variables of interest such as gender and race. We employ a Bayesian hierarchical modelling approach, which uses a Markov chain Monte Carlo computational method to obtain the joint posterior distribution of the model parameters. We use this approach to construct smoothed maps of lung cancer mortality in Ohio counties in 1988. Our approach also facilitates a cross-validatory comparison between the normal and Poisson likelihoods often fit uncritically to data of this type. Finally, we uncover evidence of changing spatial structure in the rates over the 21-year period 1968-1988, suggesting a spatio-temporal hierarchical model as a new possibility.
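
The idea that Bayes methods give stable small-area estimates can be illustrated with a minimal sketch. It assumes a conjugate Poisson-gamma model, which is a deliberate simplification and not the hierarchical spatial model fitted by MCMC in the paper; the function name, the prior values and the counts are all hypothetical.

```python
# Minimal sketch of Bayesian shrinkage for small-area rates.
# Assumption: deaths_i ~ Poisson(expected_i * theta_i), theta_i ~ Gamma(shape, rate).
# This conjugate shortcut is NOT the paper's MCMC-fitted spatial model.

def smoothed_relative_risks(deaths, expected, prior_shape, prior_rate):
    """Posterior mean of each area's relative risk theta_i.

    With a Gamma(prior_shape, prior_rate) prior and a Poisson likelihood,
    the posterior is Gamma(prior_shape + y_i, prior_rate + E_i), whose
    mean is (prior_shape + y_i) / (prior_rate + E_i).
    """
    return [
        (prior_shape + y) / (prior_rate + e)
        for y, e in zip(deaths, expected)
    ]


# Hypothetical counts for three areas (illustrative only, not the Ohio data).
deaths = [0, 5, 40]
expected = [2.0, 4.0, 50.0]

# Raw ratios of observed to expected deaths; unstable when expected is small.
raw = [y / e for y, e in zip(deaths, expected)]

# Prior mean is prior_shape / prior_rate = 1.0 (an assumed "no excess risk" centre).
smooth = smoothed_relative_risks(deaths, expected, prior_shape=2.0, prior_rate=2.0)

for r, s in zip(raw, smooth):
    print(f"raw={r:.3f}  smoothed={s:.3f}")
```

Each smoothed value lies between the raw ratio and the prior mean, and areas with small expected counts are pulled furthest toward the prior, which is the stabilising behaviour the abstract refers to.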

Proceedings Article•10.1109/CVPR.1997.609376•

Empirical Bayesian EM-based motion segmentation

Nuno Vasconcelos¹, Andrew Lippman¹•Institutions (1)

Massachusetts Institute of Technology¹

17 Jun 1997

TL;DR: The authors exploit the fact that the EM framework is itself suited for empirical Bayesian data analysis to develop an algorithm that finds the estimates of the prior parameters which best explain the observed data.

Abstract: A recent trend in motion-based segmentation has been to rely on statistical procedures derived from expectation-maximization (EM) principles. EM-based approaches have various advantages for segmentation, such as proceeding by taking non-greedy soft decisions regarding the assignment of pixels to regions, or allowing the use of sophisticated priors capable of imposing spatial coherence on the segmentation. A practical difficulty with such priors is, however, the determination of appropriate values for their parameters. The authors exploit the fact that the EM framework is itself suited for empirical Bayesian data analysis to develop an algorithm that finds the estimates of the prior parameters which best explain the observed data. Such an approach maintains the Bayesian appeal of incorporating prior beliefs, but requires only a qualitative description of the prior, avoiding the requirement of a quantitative specification of its parameters. This eliminates the need for trial-and-error strategies for parameter determination and leads to better segmentation with fewer iterations.

Proceedings Article•

Robustness analysis of Bayesian networks with local convex sets of distributions

Fabio Gagliardi Cozman¹•Institutions (1)

Carnegie Mellon University¹

1 Aug 1997

TL;DR: This paper focuses on perturbations that can be expressed locally in Bayesian networks through convex sets of distributions, and discusses calculation of bounds for expected utilities and variances, and global perturbation models.

Abstract: Robust Bayesian inference is the calculation of posterior probability bounds given perturbations in a probabilistic model. This paper focuses on perturbations that can be expressed locally in Bayesian networks through convex sets of distributions. Two approaches for combination of local models are considered. The first approach takes the largest set of joint distributions that is compatible with the local sets of distributions; we show how to reduce this type of robust inference to a linear programming problem. The second approach takes the convex hull of joint distributions generated from the local sets of distributions; we demonstrate how to apply interior-point optimization methods to generate posterior bounds and how to generate approximations that are guaranteed to converge to correct posterior bounds. We also discuss calculation of bounds for expected utilities and variances, and global perturbation models.

Journal Article•10.1016/S0951-8320(96)00131-7•

An imprecise Dirichlet model for Bayesian analysis of failure data including right-censored observations

Frank P. A. Coolen¹•Institutions (1)

Durham University¹

01 Apr 1997-Reliability Engineering & System Safety

TL;DR: The model consists of a multinomial distribution with Dirichlet priors, making the approach basically nonparametric, and it fits into the robust Bayesian context which has the advantage that all inferences can be based on probabilities or expectancies, or bounds for probabilities or expectations.
