Bayesian News Feeds
The Performance of Covariance Selection Methods That Consider Decomposable Models Only
Source: Bayesian Analysis, Volume 9, Number 3, 659--684.
We consider the behavior of Bayesian procedures that perform model selection for decomposable Gaussian graphical models when the true model is in fact non-decomposable. We examine the asymptotic behavior of the posterior when models are misspecified in this way, and find that the posterior will converge to graphical structures that are minimal triangulations of the true structure. The marginal log likelihood ratio comparing different minimal triangulations is stochastically bounded, and appears to remain data dependent regardless of the sample size. The covariance matrices corresponding to the different minimal triangulations are essentially equivalent, so model averaging is of minimal benefit. Using simulated data sets and a particular high-performing Bayesian method for fitting decomposable models, feature inclusion stochastic search, we illustrate that these predictions are borne out in practice. Finally, a comparison is made to penalized likelihood methods for graphical models, which make no decomposability restriction. Despite its inability to fit the true model, feature inclusion stochastic search produces models that are competitive with or superior to the penalized likelihood methods, especially at higher dimensions.
Toward Rational Social Decisions: A Review and Some Results
Source: Bayesian Analysis, Volume 9, Number 3, 685--698.
Bayesian decision theory is profoundly personalistic. It prescribes the decision $d$ that minimizes the expectation of the decision-maker’s loss function $L(d,\theta)$ with respect to that person’s opinion $\pi(\theta)$. Attempts to extend this paradigm to more than one decision-maker have generally been unsuccessful, as shown in Part A of this paper. Part B of this paper explores a different decision set-up, in which Bayesians make choices knowing that later Bayesians will make decisions that matter to the earlier Bayesians. We explore conditions under which they together can be modeled as a single Bayesian. There are three reasons for doing so:
1. To understand the common structure of various examples, in some of which the reduction to a single Bayesian is possible, and in some of which it is not. In particular, it helps to deepen our understanding of the desirability of randomization to Bayesians.
2. As a possible computational simplification. When such reduction is possible, standard expected loss minimization software can be used to find optimal actions.
3. As a start toward a better understanding of social decision-making.
Spatial Bayesian Variable Selection Models on Functional Magnetic Resonance Imaging Time-Series Data
Source: Bayesian Analysis, Volume 9, Number 3, 699--732.
A common objective of fMRI (functional magnetic resonance imaging) studies is to determine subject-specific areas of increased blood oxygenation level dependent (BOLD) signal contrast in response to a stimulus or task, and hence to infer regional neuronal activity. We posit and investigate a Bayesian approach that incorporates spatial and temporal dependence and allows for the task-related change in the BOLD signal to change dynamically over the scanning session. In this way, our model accounts for potential learning effects in addition to other mechanisms of temporal drift in task-related signals. We study the properties of the model through its performance on simulated and real data sets.
Bayesian Multiscale Smoothing of Gaussian Noised Images
Source: Bayesian Analysis, Volume 9, Number 3, 733--758.
We propose a multiscale model for Gaussian noised images under a Bayesian framework for both 2-dimensional (2D) and 3-dimensional (3D) images. We use a Chinese restaurant process prior to randomly generate ties among intensity values at neighboring pixels in the image. The resulting Bayesian estimator enjoys some desirable asymptotic properties for identifying precise structures in the image. The proposed Bayesian denoising procedure is completely data-driven. A conditional conjugacy property allows analytical computation of the posterior distribution without involving Markov chain Monte Carlo (MCMC) methods, making the method computationally efficient. Simulations on Shepp-Logan phantom and Lena test images confirm that our smoothing method is comparable with the best available methods for light noise and outperforms them for heavier noise both visually and numerically. The proposed method is further extended for 3D images. A simulation study shows that the proposed method is numerically better than most existing denoising approaches for 3D images. A 3D Shepp-Logan phantom image is used to demonstrate the visual and numerical performance of the proposed method, along with the computational time. MATLAB toolboxes are made available online (both 2D and 3D) to implement the proposed method and reproduce the numerical results.
Michael Finegold, Mathias Drton. Robust Bayesian Graphical Modeling Using Dirichlet $t$-Distributions. 521--550.
François Caron, Luke Bornn. Comment on Article by Finegold and Drton. 551--556.
Babak Shahbaba. Comment on Article by Finegold and Drton. 557--560.
Various authors. Contributed Discussion on Article by Finegold and Drton. 561--590.
Michael Finegold, Mathias Drton. Rejoinder. 591--596.
Timothy E. Hanson, Adam J. Branscum, Wesley O. Johnson. Informative $g$-Priors for Logistic Regression. 597--612.
George Casella, Elías Moreno, F. Javier Girón. Cluster Analysis, Model Selection, and Prior Distributions on Models. 613--658.
A. Marie Fitch, M. Beatrix Jones, Hélène Massam. The Performance of Covariance Selection Methods That Consider Decomposable Models Only. 659--684.
Joseph B. Kadane, Steven N. MacEachern. Toward Rational Social Decisions: A Review and Some Results. 685--698.
Kuo-Jung Lee, Galin L. Jones, Brian S. Caffo, Susan S. Bassett. Spatial Bayesian Variable Selection Models on Functional Magnetic Resonance Imaging Time-Series Data. 699--732.
Meng Li, Subhashis Ghosal. Bayesian Multiscale Smoothing of Gaussian Noised Images. 733--758.
The September issue of [JRSS] Series B that I received a few days ago is of particular interest to me. (And not as an ex-co-editor, since I was never involved in any of those papers!) To wit: a paper by Hani Doss and Aixin Tan on evaluating normalising constants based on MCMC output, a preliminary version of which I had seen at a previous JSM meeting; a paper by Nick Polson, James Scott and Jesse Windle on the Bayesian bridge, connected with Nick’s talk in Boston earlier this month; and yet another paper by Ariel Kleiner, Ameet Talwalkar, Purnamrita Sarkar and Michael Jordan on the bag of little bootstraps, a presentation of which I heard Michael deliver a few times when he was in Paris. (Obviously, this does not imply any negative judgement on the other papers of this issue!)
For instance, Doss and Tan consider the multiple mixture estimator [my wording, the authors do not give the method a name, referring to Vardi (1985) but missing the connection with Owen and Zhou (2000)] of k ratios of normalising constants, namely
where the z’s are the normalising constants and with possibly different numbers of iterations of each Markov chain. An interesting starting point (that Hans Künsch had mentioned to me a while ago but that I had since forgotten) is that the problem was reformulated by Charlie Geyer (1994) as a quasi-likelihood estimation where the ratios of all z’s relative to one reference density are the unknowns. This is doubly interesting, actually, because it recasts the constant estimation problem in a statistical light and thus somewhat relates to the infamous “paradox” raised by Larry Wasserman a while ago. The novelty in the paper is (a) to derive an optimal estimator of the ratios of normalising constants in the Markov case, essentially accounting for possibly different lengths of the Markov chains, and (b) to estimate the variance matrix of the ratio estimate by regeneration arguments. A favourite tool of mine, at least theoretically, as practically useful minorising conditions are hard to come by, if at all available.
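To make the idea concrete, here is a minimal sketch of the classical self-consistent form of such a multiple-sample estimator, as I understand it from the Vardi (1985) and Geyer (1994) line of work: with unnormalised densities $\nu_1,\ldots,\nu_k$ and $n_l$ draws $X_i^{(l)}$ from the $l$-th one, the estimates solve $\hat z_r = \sum_{l=1}^k \sum_{i=1}^{n_l} \nu_r(X_i^{(l)}) \big/ \sum_{s=1}^k n_s \nu_s(X_i^{(l)})/\hat z_s$, which only determines the $\hat z_r$ up to a common factor, hence the ratios to a reference density. This is not necessarily the display used in the post, nor the optimally weighted version of Doss and Tan. The function names are hypothetical, the weights are simply the sample sizes, and i.i.d. Gaussian draws stand in for genuine MCMC output.

```python
import math
import random


def mixture_ratio_estimates(samples, log_nus, n_iter=100):
    """Estimate z_r / z_1 by fixed-point iteration on the self-consistency equations.

    samples : list of k lists of draws, the l-th list coming from density nu_l / z_l
    log_nus : list of k functions returning the log of the unnormalised density
    """
    k = len(samples)
    sizes = [len(s) for s in samples]
    pooled = [x for s in samples for x in s]
    # Evaluate every unnormalised density at every pooled draw once.
    nu = [[math.exp(log_nu(x)) for x in pooled] for log_nu in log_nus]
    z = [1.0] * k
    for _ in range(n_iter):
        # Mixture denominator sum_s n_s nu_s(x) / z_s at each pooled draw.
        denom = [sum(sizes[s] * nu[s][i] / z[s] for s in range(k))
                 for i in range(len(pooled))]
        new_z = [sum(nu[r][i] / denom[i] for i in range(len(pooled)))
                 for r in range(k)]
        # Only ratios are identified, so fix the first constant at 1.
        z = [v / new_z[0] for v in new_z]
    return z


def log_nu_1(x):
    # Kernel of N(0, 1), whose normalising constant is sqrt(2 pi).
    return -0.5 * x * x


def log_nu_2(x):
    # Kernel of N(0, 4), whose normalising constant is 2 sqrt(2 pi).
    return -0.125 * x * x


if __name__ == "__main__":
    rng = random.Random(0)
    draws_1 = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    draws_2 = [rng.gauss(0.0, 2.0) for _ in range(3000)]
    # The second entry estimates z_2 / z_1, whose true value is 2 in this toy case.
    print(mixture_ratio_estimates([draws_1, draws_2], [log_nu_1, log_nu_2]))
```

In this toy setting the two samples have different sizes, mimicking chains of different lengths; with correlated MCMC output the same equations can be used, but the weights and the variance estimate are exactly where the paper's contribution lies.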
Yet another workshop around! Still at Warwick, organised by Simon Barthelmé, Nicolas Chopin and Adam Johansen on the theme of statistical aspects of neuroscience. Being nearby, I attended a few lectures today, but most talks are more topical than my current interest in the matter, and workshop fatigue is starting to set in! Hence I will keep a low attendance for the rest of the week, taking advantage of my visit here to make some progress in my research and in the preparation of the teaching semester. (Maybe paradoxically, the talk I attended was a non-neuroscience one, namely Richard Wilkinson’s coverage of ABC methods, with an interesting stress on meta-models and the link with computer experiments. Given that we are currently re-revising our paper with Matt Moore and Kerrie Mengersen (and now Chris Drovandi), I find it interesting to see a sort of convergence in our community towards a re-re-interpretation of ABC as producing an approximation of the distribution of the summary statistic itself, rather than of the original data, using auxiliary or indirect or pseudo-models like Gaussian processes. This also makes the link with Mark Girolami’s talk this morning.)
Great poster session last night and at lunch today. Saw an ABC poster (by Dennis Prangle, following our random forest paper) and several MCMC posters (by Marco Banterle, who actually won one of the speed-meeting mini-project awards!, Michael Betancourt, Anne-Marie Lyne, Murray Pollock), and then a rather different poster on Mondrian forests, which generalise random forests to sequential data (by Balaji Lakshminarayanan). The talks all had interesting aspects or glimpses about big data and some of the unnecessary hype about it (them?!), along with exposing Amazon’s nefarious aim of becoming the Earth’s only seller!, but I particularly enjoyed the astronomy afternoon and even more particularly Steve Roberts’ sweep through astronomy machine-learning. Steve characterised variational Bayes as picking your choice of sufficient statistics, which made me wonder why there were no stronger connections between variational Bayes and ABC. He also quoted the book The Fourth Paradigm: Data-Intensive Scientific Discovery by Tony Hey as putting forward interesting notions. (A book review for the next vacations?!) He also mentioned zooniverse, a citizen science website I was not aware of, with a Bayesian analysis of the learning curve of those annotating citizens (in the case of supernovae classification). Big deal, indeed!!!