Bayesian News Feeds

repulsive mixtures

Xian's Og - Sun, 2017-04-09 18:17

Fangzheng Xie and Yanxun Xu arXived today a paper on Bayesian repulsive modelling for mixtures. Not that Bayesian modelling is repulsive in any psychological sense, but rather that the components of the mixture are repulsive one against another. The device towards this repulsiveness is to add a penalty term to the original prior such that close means are penalised. (In the spirit of the sugar loaf with water drops represented on the cover of Bayesian Choice that we used in our pinball sampler, repulsiveness being there on the particles of a simulated sample and not on components.) Which means a prior assumption that close covariance matrices are of lesser importance. One question I have is why empty components are not excluded as well, but this does not make too much sense in the Dirichlet process formulation of the current paper. And in the finite mixture version the Dirichlet prior on the weights has coefficients less than one.
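For concreteness, here is a minimal numpy sketch of the kind of pairwise repulsive factor described above, which shrinks the prior on configurations with close component means; the particular repulsion function (1 − exp(−d²/τ)) and the scale τ are my own illustrative choices, not necessarily those of the paper.

```python
import numpy as np

def log_repulsion(means, tau=1.0):
    """Log of a pairwise repulsive factor prod_{i<j} (1 - exp(-d_ij^2 / tau)):
    nearly coincident component means drive the factor towards zero, i.e. a
    large negative log-penalty, while well-separated means are barely
    penalised at all."""
    means = np.asarray(means, dtype=float)
    logp = 0.0
    for i in range(len(means)):
        for j in range(i + 1, len(means)):
            d2 = (means[i] - means[j]) ** 2
            logp += np.log1p(-np.exp(-d2 / tau))
    return logp

# Well-separated means incur a much smaller penalty than close ones:
print(log_repulsion([0.0, 3.0, 6.0]), log_repulsion([0.0, 0.1, 0.2]))
```

This log-penalty would simply be added to the log of the base prior on the means inside an MCMC sweep.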

The paper establishes consistency results for such repulsive priors, both for estimating the distribution itself and the number of components, K, under a collection of assumptions on the distribution, prior, and repulsiveness factors. While I have no mathematical issue with such results, I always wonder at their relevance for a given finite sample from a finite mixture in that they give an impression that the number of components is a perfectly estimable quantity, which it is not (in my opinion!) because of the fluid nature of mixture components and therefore the inevitable impact of prior modelling. (As Larry Wasserman would pound in, mixtures like tequila are evil and should likewise be avoided!)

The implementation of this modelling goes through a “block-collapsed” Gibbs sampler that exploits the latent variable representation (as in our early mixture paper with Jean Diebolt). Which includes the Old Faithful data as an illustration (for which a submission of ours was recently rejected for using too old datasets). And uses the logarithm of the conditional predictive ordinate as an assessment tool, which is a posterior predictive estimated by MCMC, using the data a second time for the fit.
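As a reminder of how the conditional predictive ordinate is typically estimated from MCMC output, CPO_i is the harmonic mean of the likelihood of y_i over the posterior draws, and summing the log CPOs gives the log pseudo-marginal likelihood. The Gaussian likelihood and the stand-in “posterior draws” in the sketch below are purely illustrative, not the mixture model of the paper.

```python
import numpy as np

def log_cpo(y, mu_draws, sd_draws):
    """Log conditional predictive ordinate of each observation:
    CPO_i = 1 / E_post[1 / p(y_i | theta)], estimated by the harmonic mean
    of a Gaussian likelihood over T posterior draws, computed stably on
    the log scale via log-sum-exp."""
    y = np.asarray(y, dtype=float)[:, None]          # shape (n, 1)
    mu = np.asarray(mu_draws, dtype=float)[None, :]  # shape (1, T)
    sd = np.asarray(sd_draws, dtype=float)[None, :]
    loglik = -0.5 * np.log(2 * np.pi) - np.log(sd) - 0.5 * ((y - mu) / sd) ** 2
    T = loglik.shape[1]
    m = (-loglik).max(axis=1, keepdims=True)         # stabiliser for the lse
    lse = m.ravel() + np.log(np.exp(-loglik - m).sum(axis=1))
    return np.log(T) - lse                           # (n,) vector of log CPO_i

rng = np.random.default_rng(0)
y = rng.normal(size=50)
mu_draws = rng.normal(0.0, 0.1, size=200)   # stand-in posterior draws of the mean
sd_draws = np.full(200, 1.0)                # fixed unit standard deviation
lpml = log_cpo(y, mu_draws, sd_draws).sum() # log pseudo-marginal likelihood
```

The harmonic-mean form makes the double use of the data explicit: the same y_i appears both in the posterior draws and in the predictive being assessed.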

Filed under: Books, Statistics Tagged: consistency, Dirichlet mixture priors, finite mixtures, Gibbs sampling, Larry Wasserman, repulsiveness, reversible jump MCMC, tequila, unknown number of components
Categories: Bayesian Bloggers

challenged books

Xian's Og - Sat, 2017-04-08 18:17

After reading that Margaret Atwood’s The Handmaid’s Tale was one of the most challenged books in the USA, where challenged means “documented requests to remove materials from school or libraries”, I went to check on the website of the American Library Association for other titles, and found that The Curious Incident of the Dog in the Night-Time and the Bible made it to the top 10 in 2015, with Of Mice and Men, Harry Potter, The Adventures of Huckleberry Finn, Brave New World, Hunger Games, Slaughterhouse Five, Cal, several of Roald Dahl’s and of Toni Morrison’s books, Persepolis, and Tintin in America [and numerous others] appearing in the list… (As read in several comments, it is quite a surprise Shakespeare is not part of it!)

What is most frightening about those challenges and calls for censorship is that a growing portion of the reasons given against the books is “diversity”, namely that they propose a different viewpoint, be it religious (or atheist), gender-related, ethnic, political, or disability-related.

Filed under: Books, Kids Tagged: American Library Association, banned books, Brave New World, censorship, Harry Potter, John Steinbeck, Persepolis, The Handmaid's Tale, Tintin, USA
Categories: Bayesian Bloggers

Shadows of Self [book review]

Xian's Og - Fri, 2017-04-07 18:17

“He’d always found it odd that so many died when they were old, as logic said that was the point in their lives when they’d the most practice not dying.”

Now this is steampunk fantasy, definitely! With little novelty in the setting of the universe. If mixed with a Wild West feeling, though, just like The Half-Made World.

“Mirabell had been a statistician and psychologist in the third century who had studied why some people worked harder than others.”

Actually, this is the same universe as The Mistborn trilogy, but 300 years later, which allows for some self-referential jokes and satire. Including the notion that the current ruling class could be exactly what the heroes of The Mistborn had fought against!

“Not guns,” Wayne said with a grin. “A different kind of weapon. Math.”

More precisely, this is the (a?) sequel to The Alloy of Law, which I had almost completely forgotten, unlike The Mistborn trilogy, which does not help with the reading as the book refers rather insistently to this Alloy of Law!

“Sir, you said you hired me in part because of my ability to read statistics.”

Nonetheless, it is an interesting plot, with a very nice ambiguity of the main characters, who (again) often feel they may be closer to the dictatorship that set off The Mistborn revolution than to the revolutionaries themselves! And one of the heroes is a statistician (as obvious from the many quotes around!).

“Wayne felt a disturbance stir within him, like his stomach discovering he’d just fed it a bunch of rotten apples. Religion worried him. It could ask men to do things they’d otherwise never do.”

In short, good story, nice style, entertaining dialogues: perfect [mind-candy] travel novel!

Filed under: Books, Kids, Statistics, Travel Tagged: alloy, Brandon Sanderson, candy, Mistborn, Shadows of Self, Statistics, steampunk, Wild West
Categories: Bayesian Bloggers

and it only gets worse…

Xian's Og - Fri, 2017-04-07 08:18

“The State Department said on Monday it was ending U.S. funding for the United Nations Population Fund, the international body’s agency focused on family planning as well as maternal and child health in more than 150 countries.” Reuters, April 3, 2017

“When it comes to science, there are few winners in US President Donald Trump’s first budget proposal. The plan, released on 16 March, calls for double-digit cuts for the Environmental Protection Agency (EPA) and the National Institutes of Health (NIH). It also lays the foundation for a broad shift in the United States’ research priorities, including a retreat from environmental and climate programmes.” Nature, March 16, 2017

“In light of the recent executive order on visas and immigration, we are compelled to speak out in support of our international members. Science benefits from the free expression and exchange of ideas. As the oldest scientific society in the United States, and the world’s largest professional society for statisticians, the ASA has an overarching responsibility to support rigorous and robust science. Our world relies on data and statistical thinking to drive discovery, which thrives from the contributions of a global community of scientists, researchers, and students. A flourishing scientific culture, in turn, benefits our nation’s economic prosperity and security.” ASA, March 2017

Filed under: Kids, pictures, Travel Tagged: American Statistical Association, climate change, Donald Trump, Environmental Protection Agency, global warming, Human Rights, National Institutes of Health, The New York Times, trumpism, United Nations Population Fund, US politics
Categories: Bayesian Bloggers

Bayesian program synthesis

Xian's Og - Thu, 2017-04-06 18:17

Last week, I—along with Jean-Michel Marin—got an email from a journalist working for Science & Vie, a French popular science magazine that published a few years ago a special issue on Bayes’ theorem. (With the insane title of “the formula that deciphers the World!”) The reason for this call was the preparation of a paper on Gamalon, a new AI company that relies on (Bayesian) probabilistic programming to devise predictive tools. And spent an hour skyping with him about Bayesian inference, probabilistic programming and machine learning, at the general level since we had not heard previously of this company or of its central tool.

“the Gamalon BPS system learns from only a few examples, not millions. It can learn using a tablet processor, not hundreds of servers. It learns right away while we play with it, not over weeks or months. And it learns from just one person, not from thousands.”

Gamalon claims to do much better than deep learning at those tasks. Not that I have reasons to doubt that claim, quite the opposite, an obvious reason being that incorporating rules and probabilistic models in the predictor is going to help if these rules and models are even moderately realistic, another major one being that handling uncertainty and learning by Bayesian tools is usually a good idea (!), and yet another significant one being that David Blei is a member of their advisory committee. But it is hard to get a feeling for such claims when the only element in the open is the use of probabilistic programming, which is an advanced and efficient manner of conducting model building and updating and handling (posterior) distributions as objects, but which does not enjoy higher predictive abilities by default. Unless I live with a restricted definition of what probabilistic programming stands for! In any case, the video provided by Gamalon and the presentation given by its CEO do not help in my understanding of the principles behind this massive gain in efficiency. Which makes sense given that the company would not want to give up their edge on the competition.
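As a generic illustration of what “handling (posterior) distributions as objects” means in practice (and no more than that: this is a toy grid example, with no connection whatsoever to Gamalon’s proprietary system), a probabilistic program can treat a posterior as just another value to construct, update, and query:

```python
import numpy as np

def posterior_update(prior, likelihood):
    """Multiply a discrete prior by a likelihood and renormalise:
    the posterior is just another array the program can pass around."""
    post = prior * likelihood
    return post / post.sum()

theta = np.linspace(0.01, 0.99, 99)        # grid over a coin's bias
belief = np.ones_like(theta) / theta.size  # uniform prior
for y in [1, 1, 0, 1]:                     # observe three heads, one tail
    belief = posterior_update(belief, theta**y * (1 - theta)**(1 - y))

posterior_mean = (theta * belief).sum()    # close to the Beta(4,2) mean, 2/3
```

The point is that the distribution itself is the program’s data structure; none of this by itself confers any predictive advantage over other learning methods, which was exactly the reservation voiced above.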

Incidentally, the video in this presentation comparing the predictive abilities of the four major astronomical explanations of the solar system is great. If not particularly connected with the difference between deep learning and Bayesian probabilistic programming.

Filed under: Books, pictures, Statistics, University life Tagged: David Blei, deep learning, Gamalon, machine learning, neural network, principles of uncertainty, probabilistic programming, Science & Vie, solar system
Categories: Bayesian Bloggers

Statlearn17, Lyon

Xian's Og - Thu, 2017-04-06 08:18

Today and tomorrow, I am attending the Statlearn17 conference in Lyon, France. Which is a workshop with one-hour talks on statistics and machine learning. And which makes for the second workshop on machine learning in two weeks! Yesterday there were two tutorials in R, but I only took the train to Lyon this morning: it will be a pleasant opportunity to run tomorrow through a city I have never truly visited, if X’ed so many times driving to the Alps. Interestingly, the trip started in Paris with me sitting in the train next to another speaker at the conference, despite having switched seat and carriage with another passenger! A speaker whom I did not know beforehand and could only identify by his running R code at 300 km/h.

Filed under: Kids, pictures, R, Statistics, Travel, University life Tagged: Berlin, conference, France, French Alps, Lyon, machine learning, R, SFDS, Statlearn 2017, train, Université Lumière Lyon 2
Categories: Bayesian Bloggers

the incomprehensible challenge of poker

Xian's Og - Wed, 2017-04-05 18:17

When reading in Nature about two deep learning algorithms winning at a version of poker within a few weeks of difference, I came back to my “usual” wonder about poker, as I cannot understand it as a game. (Although I can see the point, albeit dubious, in playing to win money.) And [definitely] correlatively do not understand the difficulty in building an AI that plays the game. [I know, I know nothing!]

Filed under: Statistics Tagged: artificial intelligence, bills, deep learning, Denmark, game theory, Nature, poker, statistics and sports
Categories: Bayesian Bloggers

objective and subjective RSS Read Paper next week

Xian's Og - Wed, 2017-04-05 08:18

Andrew Gelman and Christian Hennig will give a Read Paper presentation next Wednesday, April 12, 5pm, at the Royal Statistical Society, London, on their paper “Beyond subjective and objective in statistics”. Which I hope to attend, or else to write a discussion. Since the discussion (to be published in Series A) is open to everyone, I strongly encourage ‘Og’s readers to take a look at the paper and the “radical” views therein, and hopefully to contribute to this discussion. Either as a written discussion or as comments on this very post.

Filed under: Books, pictures, Statistics, Travel, University life, Wines Tagged: Andrew Gelman, Christian Hennig, discussion paper, England, frequentist inference, London, objective Bayes, objectivism, Philosophy of Science, Read paper, Royal Statistical Society, RSS, Series A, subjective versus objective Bayes, subjectivity
Categories: Bayesian Bloggers

Bayesian Estimation of Principal Components for Functional Data

Adam J. Suarez, Subhashis Ghosal.

Source: Bayesian Analysis, Volume 12, Number 2, 311--333.

The area of principal components analysis (PCA) has seen relatively few contributions from the Bayesian school of inference. In this paper, we propose a Bayesian method for PCA in the case of functional data observed with error. We suggest modeling the covariance function by use of an approximate spectral decomposition, leading to easily interpretable parameters. We perform model selection, both over the number of principal components and the number of basis functions used in the approximation. We study in depth the choice of using the implied distributions arising from the inverse Wishart prior and prove a convergence theorem for the case of an exact finite dimensional representation. We also discuss computational issues as well as the care needed in choosing hyperparameters. A simulation study is used to demonstrate competitive performance against a recent frequentist procedure, particularly in terms of the principal component estimation. Finally, we apply the method to a real dataset, where we also incorporate model selection on the dimension of the finite basis used for modeling.

Categories: Bayesian Analysis

Bayesian Functional Data Modeling for Heterogeneous Volatility

Bin Zhu, David B. Dunson.

Source: Bayesian Analysis, Volume 12, Number 2, 335--350.

Although there are many methods for functional data analysis, less emphasis is put on characterizing variability among volatilities of individual functions. In particular, certain individuals exhibit erratic swings in their trajectory while other individuals have more stable trajectories. There is evidence of such volatility heterogeneity in blood pressure trajectories during pregnancy, for example, and reason to suspect that volatility is a biologically important feature. Most functional data analysis models implicitly assume similar or identical smoothness of the individual functions, and hence can lead to misleading inferences on volatility and an inadequate representation of the functions. We propose a novel class of functional data analysis models characterized using hierarchical stochastic differential equations. We model the derivatives of a mean function and deviation functions using Gaussian processes, while also allowing covariate dependence including on the volatilities of the deviation functions. Following a Bayesian approach to inference, a Markov chain Monte Carlo algorithm is used for posterior computation. The methods are tested on simulated data and applied to blood pressure trajectories during pregnancy.

Categories: Bayesian Analysis

Latent Space Approaches to Community Detection in Dynamic Networks

Daniel K. Sewell, Yuguo Chen.

Source: Bayesian Analysis, Volume 12, Number 2, 351--377.

Embedding dyadic data into a latent space has long been a popular approach to modeling networks of all kinds. While clustering has been done using this approach for static networks, this paper gives two methods of community detection within dynamic network data, building upon the distance and projection models previously proposed in the literature. Our proposed approaches capture the time-varying aspect of the data, can model directed or undirected edges, inherently incorporate transitivity and account for each actor’s individual propensity to form edges. We provide Bayesian estimation algorithms, and apply these methods to a ranked dynamic friendship network and world export/import data.

Categories: Bayesian Analysis

Dependent Species Sampling Models for Spatial Density Estimation

Seongil Jo, Jaeyong Lee, Peter Müller, Fernando A. Quintana, Lorenzo Trippa.

Source: Bayesian Analysis, Volume 12, Number 2, 379--406.

We consider a novel Bayesian nonparametric model for density estimation with an underlying spatial structure. The model is built on a class of species sampling models, which are discrete random probability measures that can be represented as a mixture of random support points and random weights. Specifically, we construct a collection of spatially dependent species sampling models and propose a mixture model based on this collection. The key idea is the introduction of spatial dependence by modeling the weights through a conditional autoregressive model. We present an extensive simulation study to compare the performance of the proposed model with competitors. The proposed model compares favorably to these alternatives. We apply the method to the estimation of summer precipitation density functions using Climate Prediction Center Merged Analysis of Precipitation data over East Asia.

Categories: Bayesian Analysis

A Hierarchical Bayesian Setting for an Inverse Problem in Linear Parabolic PDEs with Noisy Boundary Conditions

Fabrizio Ruggeri, Zaid Sawlan, Marco Scavino, Raul Tempone.

Source: Bayesian Analysis, Volume 12, Number 2, 407--433.

In this work we develop a Bayesian setting to infer unknown parameters in initial-boundary value problems related to linear parabolic partial differential equations. We realistically assume that the boundary data are noisy, for a given prescribed initial condition. We show how to derive the joint likelihood function for the forward problem, given some measurements of the solution field subject to Gaussian noise. Given Gaussian priors for the time-dependent Dirichlet boundary values, we analytically marginalize the joint likelihood using the linearity of the equation. Our hierarchical Bayesian approach is fully implemented in an example that involves the heat equation. In this example, the thermal diffusivity is the unknown parameter. We assume that the thermal diffusivity parameter can be modeled a priori through a lognormal random variable or by means of a space-dependent stationary lognormal random field. Synthetic data are used to test the inference. We exploit the behavior of the non-normalized log posterior distribution of the thermal diffusivity. Then, we use the Laplace method to obtain an approximated Gaussian posterior and therefore avoid costly Markov Chain Monte Carlo computations. Expected information gains and predictive posterior densities for observable quantities are numerically estimated using Laplace approximation for different experimental setups.

Categories: Bayesian Analysis

Bayesian Inference for Diffusion-Driven Mixed-Effects Models

Gavin A. Whitaker, Andrew Golightly, Richard J. Boys, Chris Sherlock.

Source: Bayesian Analysis, Volume 12, Number 2, 435--463.

Stochastic differential equations (SDEs) provide a natural framework for modelling intrinsic stochasticity inherent in many continuous-time physical processes. When such processes are observed in multiple individuals or experimental units, SDE driven mixed-effects models allow the quantification of both between and within individual variation. Performing Bayesian inference for such models using discrete-time data that may be incomplete and subject to measurement error is a challenging problem and is the focus of this paper. We extend a recently proposed MCMC scheme to include the SDE driven mixed-effects framework. Fundamental to our approach is the development of a novel construct that allows for efficient sampling of conditioned SDEs that may exhibit nonlinear dynamics between observation times. We apply the resulting scheme to synthetic data generated from a simple SDE model of orange tree growth, and real data on aphid numbers recorded under a variety of different treatment regimes. In addition, we provide a systematic comparison of our approach with an inference scheme based on a tractable approximation of the SDE, that is, the linear noise approximation.

Categories: Bayesian Analysis

Automated Parameter Blocking for Efficient Markov Chain Monte Carlo Sampling

Daniel Turek, Perry de Valpine, Christopher J. Paciorek, Clifford Anderson-Bergman.

Source: Bayesian Analysis, Volume 12, Number 2, 465--490.

Markov chain Monte Carlo (MCMC) sampling is an important and commonly used tool for the analysis of hierarchical models. Nevertheless, practitioners generally have two options for MCMC: utilize existing software that generates a black-box “one size fits all” algorithm, or the challenging (and time consuming) task of implementing a problem-specific MCMC algorithm. Either choice may result in inefficient sampling, and hence researchers have become accustomed to MCMC runtimes on the order of days (or longer) for large models. We propose an automated procedure to determine an efficient MCMC block-sampling algorithm for a given model and computing platform. Our procedure dynamically determines blocks of parameters for joint sampling that result in efficient MCMC sampling of the entire model. We test this procedure using a diverse suite of example models, and observe non-trivial improvements in MCMC efficiency for many models. Our procedure is the first attempt at such, and may be generalized to a broader space of MCMC algorithms. Our results suggest that substantive improvements in MCMC efficiency may be practically realized using our automated blocking procedure, or variants thereof, which warrants additional study and application.
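To give a rough flavour of the blocking idea (and only that: the sketch below is a simple hypothetical correlation-threshold heuristic over a pilot run, not the authors’ automated procedure), parameters that move together in the posterior can be grouped into joint sampling blocks:

```python
import numpy as np

def correlation_blocks(pilot_samples, threshold=0.5):
    """Group parameters whose pilot-run |posterior correlation| exceeds
    `threshold` into joint sampling blocks, via union-find over the
    correlation graph. An illustrative heuristic, not the paper's method."""
    corr = np.abs(np.corrcoef(pilot_samples, rowvar=False))
    p = corr.shape[0]
    parent = list(range(p))

    def find(i):  # union-find root with path compression
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(p):
        for j in range(i + 1, p):
            if corr[i, j] > threshold:
                parent[find(i)] = find(j)   # merge correlated parameters

    blocks = {}
    for i in range(p):
        blocks.setdefault(find(i), []).append(i)
    return sorted(blocks.values())

# Parameters 0 and 1 strongly correlated, parameter 2 independent:
rng = np.random.default_rng(1)
x0 = rng.normal(size=500)
pilot = np.column_stack([x0, x0 + 0.1 * rng.normal(size=500),
                         rng.normal(size=500)])
```

Each resulting block would then be updated jointly (e.g. by a multivariate Metropolis step), which is where the efficiency gain for strongly correlated parameters comes from.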

Categories: Bayesian Analysis

Dynamic Chain Graph Models for Time Series Network Data

Osvaldo Anacleto, Catriona Queen.

Source: Bayesian Analysis, Volume 12, Number 2, 491--509.

This paper introduces a new class of Bayesian dynamic models for inference and forecasting in high-dimensional time series observed on networks. The new model, called the dynamic chain graph model, is suitable for multivariate time series which exhibit symmetries within subsets of series and a causal drive mechanism between these subsets. The model can accommodate high-dimensional, non-linear and non-normal time series and enables local and parallel computation by decomposing the multivariate problem into separate, simpler sub-problems of lower dimensions. The advantages of the new model are illustrated by forecasting traffic network flows and also modelling gene expression data from transcriptional networks.

Categories: Bayesian Analysis

Mixtures of $g$-priors for analysis of variance models with a diverging number of parameters

Min Wang.

Source: Bayesian Analysis, Volume 12, Number 2, 511--532.

We consider Bayesian approaches for the hypothesis testing problem in the analysis-of-variance (ANOVA) models. With the aid of the singular value decomposition of the centered design matrix, we reparameterize the ANOVA models with linear constraints for uniqueness into a standard linear regression model without any constraint. We derive the Bayes factors based on mixtures of $g$-priors and study their consistency properties with a growing number of parameters. It is shown that two commonly used hyper-priors on $g$ (the Zellner-Siow prior and the beta-prime prior) yield inconsistent Bayes factors due to the presence of an inconsistency region around the null model. We propose a new class of hyper-priors to avoid this inconsistency problem. Simulation studies on the two-way ANOVA models are conducted to compare the performance of the proposed procedures with that of some existing ones in the literature.

Categories: Bayesian Analysis

Data-Dependent Posterior Propriety of a Bayesian Beta-Binomial-Logit Model

Hyungsuk Tak, Carl N. Morris.

Source: Bayesian Analysis, Volume 12, Number 2, 533--555.

A Beta-Binomial-Logit model is a Beta-Binomial model with covariate information incorporated via a logistic regression. Posterior propriety of a Bayesian Beta-Binomial-Logit model can be data-dependent for improper hyper-prior distributions. Various researchers in the literature have unknowingly used improper posterior distributions or have given incorrect statements about posterior propriety because checking posterior propriety can be challenging due to the complicated functional form of a Beta-Binomial-Logit model. We derive data-dependent necessary and sufficient conditions for posterior propriety within a class of hyper-prior distributions that encompass those used in previous studies. When a posterior is improper due to improper hyper-prior distributions, we suggest using proper hyper-prior distributions that can mimic the behaviors of improper choices.

Categories: Bayesian Analysis