## Bayesian Bloggers

### this issue of Series B

**T**he September issue of [JRSS] Series B I received a few days ago is of particular interest to me. (And not as an ex-co-editor since I was never involved in any of those papers!) To wit: a paper by Hani Doss and Aixin Tan on evaluating normalising constants based on MCMC output, a preliminary version of which I had seen at a previous JSM meeting; a paper by Nick Polson, James Scott and Jesse Windle on the Bayesian bridge, connected with Nick’s talk in Boston earlier this month; and yet another paper by Ariel Kleiner, Ameet Talwalkar, Purnamrita Sarkar and Michael Jordan on the bag of little bootstraps, a presentation of which I heard Michael deliver a few times when he was in Paris. (Obviously, this does not imply any negative judgement on the other papers of this issue!)

For instance, Doss and Tan consider the multiple mixture estimator [my wording, the authors do not give the method a name, referring to Vardi (1985) but missing the connection with Owen and Zhou (2000)] of k ratios of normalising constants, namely

where the z’s are the normalising constants and with possibly different numbers of iterations of each Markov chain. An interesting starting point (that Hans Künsch had mentioned to me a while ago but that I had since then forgotten) is that the problem was reformulated by Charlie Geyer (1994) as a quasi-likelihood estimation where the ratios of all z’s relative to one reference density are the unknowns. This is doubly interesting, actually, because it recasts the constant estimation problem in a statistical light and thus somewhat relates to the infamous “paradox” raised by Larry Wasserman a while ago. The novelty in the paper is (a) to derive an optimal estimator of the ratios of normalising constants in the Markov case, essentially accounting for possibly different lengths of the Markov chains, and (b) to estimate the variance matrix of the ratio estimate by regeneration arguments. A favourite tool of mine, at least theoretically as practically useful minorising conditions are hard to come by, if at all available.
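As a rough illustration of estimating ratios of normalising constants from several samples at once, here is a minimal Python sketch of the self-consistent iteration commonly associated with this family of estimators, where every draw is weighted by the pooled mixture of all densities. It assumes independent draws rather than Markov chains and uses the raw sample sizes as weights, so it does not reproduce the optimal weighting of Doss and Tan. The function names and the two Gaussian kernels are hypothetical choices made for the example.

```python
import math
import random


def mixture_ratio_estimates(samples, log_nus, n_iter=50):
    """Estimate z_l / z_1 for unnormalised densities nu_l from pooled draws.

    samples[i] holds draws from density i and log_nus[l](x) returns
    log nu_l(x). The update is, for every l,
        z_l = sum over pooled x of nu_l(x) / [sum_s n_s * nu_s(x) / z_s],
    followed by a rescaling so that z_1 = 1.
    """
    k = len(samples)
    sizes = [len(s) for s in samples]
    pooled = [x for s in samples for x in s]
    # Evaluate every density at every pooled draw once, shifting each row
    # by its maximum for numerical stability (the shift cancels in the ratio).
    values = []
    for x in pooled:
        row = [f(x) for f in log_nus]
        top = max(row)
        values.append([math.exp(v - top) for v in row])
    z = [1.0] * k
    for _ in range(n_iter):
        new_z = [0.0] * k
        for row in values:
            denom = sum(sizes[s] * row[s] / z[s] for s in range(k))
            for l in range(k):
                new_z[l] += row[l] / denom
        z = [v / new_z[0] for v in new_z]
    return z


def toy_example(n=2000, seed=1):
    """Two Gaussian kernels with known constants, so that z_2 / z_1 = 2."""
    rng = random.Random(seed)
    log_nus = [
        lambda x: -0.5 * x * x,                    # N(0, 1) kernel
        lambda x: -0.5 * ((x - 1.0) / 2.0) ** 2,   # N(1, 4) kernel
    ]
    samples = [
        [rng.gauss(0.0, 1.0) for _ in range(n)],
        [rng.gauss(1.0, 2.0) for _ in range(n)],
    ]
    return mixture_ratio_estimates(samples, log_nus)
```

In this toy setting the true ratio is known, so `toy_example()` should return a second entry in the vicinity of 2, up to Monte Carlo error.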

Filed under: Books, Statistics, Travel, University life Tagged: bag of little bootstraps, Bayesian bridge, Bayesian lasso, JRSSB, marginal likelihood, Markov chain Monte Carlo, normalising constant, Series B, simulation, untractable normalizing constant, Wasserman's paradox

### statistical challenges in neuroscience

**Y**et another workshop around! Still at Warwick, organised by Simon Barthelmé, Nicolas Chopin and Adam Johansen on the theme of statistical aspects of neuroscience. Being nearby I attended a few lectures today but most talks are more topical than my current interest in the matter, plus workshop fatigue starts to appear!, and hence I will keep a low attendance for the rest of the week to take advantage of my visit here to make some progress in my research and in the preparation of the teaching semester. (Maybe paradoxically I attended a non-neuroscience talk by listening to Richard Wilkinson’s coverage of ABC methods, with an interesting stress on meta-models and the link with computer experiments.) Given that we are currently re-revising our paper with Matt Moore and Kerrie Mengersen (and now Chris Drovandi), I find it interesting to see a sort of convergence in our community towards a re-re-interpretation of ABC as producing an approximation of the distribution of the summary statistic itself, rather than of the original data, using auxiliary or indirect or pseudo-models like Gaussian processes. (Making the link with Mark Girolami’s talk this morning.)

Filed under: Books, pictures, Statistics, Travel Tagged: ABC, computer experiment model, Gaussian processes, indirect inference, neurosciences, University of Warwick, workshop

### Warwick campus

Filed under: pictures, Travel, University life Tagged: England, heron, mathematics, Statistics, summer, University of Warwick

### big data, big models, it is a big deal! [posters & talks]

**G**reat poster session yesterday night and at lunch today. Saw an ABC poster (by Dennis Prangle, following our random forest paper) and several MCMC posters (by Marco Banterle, who actually won one of the speed-meeting mini-project awards!, Michael Betancourt, Anne-Marie Lyne, Murray Pollock), and then a rather different poster on Mondrian forests, that generalise random forests to sequential data (by Balaji Lakshminarayanan). The talks all had interesting aspects or glimpses about big data and some of the unnecessary hype about it (them?!), along with exposing the nefarious ambition of Amazon to become the Earth’s only seller!, but I particularly enjoyed the astronomy afternoon and even more particularly Steve Roberts’ sweep through astronomy machine-learning. Steve characterised variational Bayes as picking your choice of sufficient statistics, which made me wonder why there were no stronger connections between variational Bayes and ABC. He also quoted the book The Fourth Paradigm: Data-Intensive Scientific Discovery by Tony Hey as putting forward interesting notions. (A book review for the next vacations?!) And also mentioned zooniverse, a citizen science website I was not aware of. With a Bayesian analysis of the learning curve of those annotating citizens (in the case of supernovae classification). Big deal, indeed!!!

Filed under: Books, Kids, pictures, Statistics, Travel, University life Tagged: ABC, Amazon, astronomy, astrostatistics, big data, conference, England, galaxies, pulsars, Statistics, supernovae, The Fourth Paradigm, The Large Synoptic Survey Telescope, University of Warwick, variational Bayes methods, workshop

### ISBA@NIPS

*[An announcement from ISBA about sponsoring young researchers at NIPS that links with my earlier post that our ABC in Montréal proposal for a workshop had been accepted and a more global feeling that we (as a society) should do more to reach towards machine-learning.]*

**T**he International Society for Bayesian Analysis (ISBA) is pleased to announce its new initiative *ISBA@NIPS*, an initiative aimed at highlighting the importance and impact of Bayesian methods in the new era of data science.

Among the first actions of this initiative, ISBA is endorsing a number of *Bayesian satellite workshops* at the Neural Information Processing Systems (NIPS) Conference, that will be held in Montréal, Québec, Canada, December 8-13, 2014.

Furthermore, a special ISBA@NIPS Travel Award will be granted to the best Bayesian invited and contributed paper(s) among all the ISBA endorsed workshops.

ISBA endorsed workshops at NIPS

- ABC in Montréal. This workshop will include topics on: Applications of ABC to machine learning, e.g., computer vision, other inverse problems (RL); ABC Reinforcement Learning (other inverse problems); Machine learning models of simulations, e.g., NN models of simulation responses, GPs etc.; Selection of sufficient statistics and massive dimension reduction methods; Online and post-hoc error; ABC with very expensive simulations and acceleration methods (surrogate modelling, choice of design/simulation points).
- Networks: From Graphs to Rich Data. This workshop aims to bring together a diverse and cross-disciplinary set of researchers to discuss recent advances and future directions for developing new network methods in statistics and machine learning.
- Advances in Variational Inference. This workshop aims at highlighting recent advancements in variational methods, including new methods for scalability using stochastic gradient methods, extensions to the streaming variational setting, improved local variational methods, inference in non-linear dynamical systems, principled regularisation in deep neural networks, and inference-based decision making in reinforcement learning, amongst others.
- Women in Machine Learning (WiML 2014). This is a day-long workshop that gives female faculty, research scientists, and graduate students in the machine learning community an opportunity to meet, exchange ideas and learn from each other. Under-represented minorities and undergraduates interested in machine learning research are encouraged to attend.

ISBA@NIPS Travel Award

The ISBA Program Council will grant two special ISBA Travel Awards to two selected young participants, one in the category of *Invited Paper* and one in the category of *Contributed Paper*. Each Travel Award will be of at most 1000 USD. Organisers of ISBA endorsed workshops at NIPS are all invited to propose candidates.

**Eligibility**

- Only participants of ISBA-endorsed Workshops at NIPS will be considered.
- The recipients should be graduate students or junior researchers (up to five years after graduation) presenting at the workshop.
- The recipients should be ISBA members at the moment of receiving the award.

**Application procedure**

The organizers of ISBA-endorsed Workshops at NIPS who wish to apply should select one or two candidates and propose them to the ISBA Program Council no later than:

- September the 5th, 2014 (for the category of Invited Paper)
- October the 29th, 2014 (for the category of Contributed Paper)

The ISBA Program Council selects the two winners among the candidates proposed by all ISBA-endorsed Workshops. The outcome of the above procedure will be communicated to the Workshop Organisers by no later than:

- September the 9th, 2014 (for the category of Invited Paper)
- November the 7th, 2014 (for the category of Contributed Paper)

The winners will present a special ISBA@NIPS Travel Award recipient’s seminar at the workshops at NIPS.

Filed under: Statistics, Travel, University life Tagged: ABC in Montréal, Canada, graphical models, ISBA, machine learning, Montréal, NIPS 2014, Québec, travel award, variational Bayes methods

### big data, big models, it is a big deal!

Filed under: pictures, Statistics, University life Tagged: Amazon, big data, conference, England, Statistics, University of Warwick, workshop

### a day of travel

**I** had quite a special day today as I travelled through Birmingham, made a twenty-minute stop in Coventry to drop my bag in my office, went down to London to collect a most kindly loaned city-bike and took the train back to Coventry with the said bike… On my way from Bristol to Warwick, I decided to spend the night in downtown Birmingham as it was both easier and cheaper than to find accommodation on Warwick campus. However, while the studio I rented was well-designed and brand-new, my next door neighbours were not so well-designed in that I could hear them and the TV through the wall, despite top-quality ear-plugs! After a request of mine, they turned the TV off but kept to the same decibel level for their uninteresting exchanges. In the morning I tried to go running in the centre of Birmingham but, as I could not find the canals, I quickly got bored and gave up. As Mark had proposed to lend me a city bike for my commuting in [and not to] Warwick, I then decided to take the opportunity of a free Sunday to travel down to London to pick up the bike, change the pedals in a nearby shop, add an anti-theft device, and head back to Coventry. Which gave me the opportunity to bike in London by Abbey Road, Regent’s Park, and Hampstead, before [easily] boarding a fast train back to Coventry and biking up to the University of Warwick campus. (Only to discover, sadly, that all convenience stores had closed by then… )

Filed under: pictures, Running, Travel Tagged: biking, Birmingham, Coventry, England, London, Regent Park, University of Warwick

### efficient exploration of multi-modal posterior distributions

**T**he title of this recent arXival had potential appeal; however, the proposal ends up being rather straightforward and hence anti-climactic! The paper by Hu, Hendry and Heng proposes to run a mixture of proposals centred at the various modes of the target for an efficient exploration. This is a correct MCMC algorithm, granted!, but the requirement to know beforehand *all* the modes to be explored is self-defeating, since the major issue with MCMC is about modes that are omitted from the exploration and remain undetected throughout the simulation… As provided, this is a standard MCMC algorithm with no adaptive feature and I would rather suggest our population Monte Carlo version, given the available information. Another connection with population Monte Carlo is that I think the performance would improve by Rao-Blackwellising the acceptance rate, i.e. removing the conditioning on the (ancillary) component of the index. For PMC we proved that using the mixture proposal in the ratio led to an ideally minimal variance estimate and I do not see why randomising the acceptance ratio in the current case would bring any improvement.
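To illustrate the Rao-Blackwellisation point, here is a minimal Python sketch of an independent Metropolis-Hastings sampler whose proposal is a mixture of normals centred at known modes, and whose acceptance ratio uses the full mixture density rather than the density of the component that happened to be drawn. The bimodal target, the mode locations, the proposal scale and all function names are assumptions made for the example; this is not the algorithm of the paper under discussion.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a N(mean, sd^2) distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def target(x):
    """Hypothetical unnormalised bimodal target with modes near -3 and 3."""
    return normal_pdf(x, -3.0, 1.0) + normal_pdf(x, 3.0, 1.0)


def mixture_proposal_pdf(x, modes, sd):
    """Density of the equal-weight mixture of normals centred at the modes."""
    return sum(normal_pdf(x, m, sd) for m in modes) / len(modes)


def mixture_mh(n_iter, modes=(-3.0, 3.0), sd=1.5, seed=0):
    """Independent MH with a mixture proposal and a marginalised ratio."""
    rng = random.Random(seed)
    x = modes[0]
    chain = []
    accepted = 0
    for _ in range(n_iter):
        # Draw a component, then a value from that component.
        centre = rng.choice(modes)
        y = rng.gauss(centre, sd)
        # The ratio involves the whole mixture density: the component index
        # is integrated out instead of being conditioned upon.
        weight_y = target(y) / mixture_proposal_pdf(y, modes, sd)
        weight_x = target(x) / mixture_proposal_pdf(x, modes, sd)
        if rng.random() < weight_y / weight_x:
            x = y
            accepted += 1
        chain.append(x)
    return chain, accepted / n_iter
```

With both modes supplied, a call like `mixture_mh(20000)` should visit the two modes in roughly equal proportions; dropping one mode from `modes` shows how the sampler then struggles to reach it, which is the self-defeating aspect mentioned above.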

Filed under: Books, Statistics, University life Tagged: acceptance probability, Metropolis-Hastings algorithms, multimodal target, population Monte Carlo, Rao-Blackwellisation

### Avernian posts

Filed under: Mountains, pictures, Running, Travel Tagged: agriculture, Auvergne, Besse-en-Chandesse, fields, France, hay, hiking, landscape, Murol, POST, summer

### high-dimensional stochastic simulation and optimisation in image processing [day #3]

**L**ast and maybe most exciting day of the “High-dimensional Stochastic Simulation and Optimisation in Image Processing” in Bristol as it was exclusively about simulation (MCMC) methods. Except my own talk on ABC. And Peter Green’s on consistency of Bayesian inference in non-regular models. The talks today were indeed about using convex optimisation devices to speed up MCMC algorithms with tools that were entirely new to me, like the Moreau transform discussed by Marcelo Pereyra. Or using auxiliary variables à la RJMCMC to bypass expensive Cholesky decompositions. Or optimisation steps from one dual space to the original space for the same reason. Or using pseudo-gradients on partly differentiable functions in the talk by Sylvain Le Corff on a paper commented earlier in the ‘Og. I particularly liked the notion of Moreau regularisation that leads to more efficient Langevin algorithms when the target is not regular enough. Actually, the discretised diffusion itself may be geometrically ergodic without the corrective step of the Metropolis-Hastings acceptance. This obviously raises the question of an extension to Hamiltonian Monte Carlo. And to multimodal targets, possibly requiring as many normalisation factors as there are modes. So, *in fine*, a highly informative workshop, with the perfect size and the perfect crowd (which happened to be predominantly French, albeit from a community I did not have the opportunity to practice previously). Massive kudos to Marcelo for putting this workshop together, esp. on a week where major happy family events should have kept him at home!
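For readers as new to the Moreau transform as described above, here is a minimal Python sketch of the idea on a toy one-dimensional Laplace-type target exp(-|x|), whose potential is not differentiable at zero. The Moreau envelope of |x| is differentiable everywhere, with gradient (x - prox(x)) / lam, where prox is the soft-thresholding operator, and this gradient can drive a discretised Langevin diffusion. The target, the values of `lam` and `step`, and the function names are assumptions for the example, and the chain is left unadjusted (no Metropolis-Hastings correction), so it targets a smoothed approximation only.

```python
import math
import random


def prox_abs(x, lam):
    """Proximal operator of lam * |x|, i.e. soft-thresholding at lam."""
    return math.copysign(max(abs(x) - lam, 0.0), x)


def moreau_grad(x, lam):
    """Gradient of the Moreau envelope of |x| with parameter lam."""
    return (x - prox_abs(x, lam)) / lam


def smoothed_langevin(n_iter, lam=0.1, step=0.05, seed=0):
    """Unadjusted Langevin chain driven by the Moreau-smoothed potential."""
    rng = random.Random(seed)
    x = 0.0
    chain = []
    for _ in range(n_iter):
        noise = math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)
        x = x - step * moreau_grad(x, lam) + noise
        chain.append(x)
    return chain
```

The smoothed gradient is Lipschitz with constant 1/lam, hence the choice of a step size well below lam in this sketch; a smaller `lam` stays closer to the original target but calls for smaller steps.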

**A**s the workshop ended in mid-afternoon, I had plenty of time for a long run with Florence Forbes down to the Avon river and back up among the deer of Ashton Court, avoiding most of the rain, all of the mountain bikes on a bike trail that sounded like trail running practice, and building enough of an appetite for the South Indian cooking of the nearby Thali Café. Brilliant!

Filed under: pictures, Statistics, Travel, University life Tagged: Ashton Court, Avon river, Bristol, England, Hamiltonian Monte Carlo, MALA, Markov chains, Moreau regularisation, multimodality, RJMCMC, south Indian cuisine, SuSTain, thali

### avernian landscapes (#6)

Filed under: Mountains, pictures, Running, Statistics, Travel Tagged: Auvergne, Besse-en-Chandesse, Bleu d'Auvergne, Cantal, cow, Fourme d'Amber, France, Laguiole, Mont-Dore, Puy-de-Dôme, Saint-Nectaire, Salers, sunrise, vacations, volcanoes

### high-dimensional stochastic simulation and optimisation in image processing [day #2]

**A**fter a nice morning run down Leigh Woods and on the muddy banks of the Avon river, I attended a morning session on hyperspectral image non-linear modelling. Topic about which I knew nothing beforehand. Hyperspectral images are 3-D images made of several wavelengths to improve their classification as a mixture of several elements. The non-linearity is due to the multiple reflections from the ground as well as imperfections in the data collection. I found this new setting of clear interest, from using mixtures to exploring Gaussian processes and Hamiltonian Monte Carlo techniques on constrained spaces… Not to mention the “debate” about using Bayesian inference versus optimisation. It was overall a day of discovery as I am unfamiliar with the image processing community (being the outlier in this workshop!) and with their techniques. The problems mostly qualify as partly linear high-dimension inverse problems, with rather standard if sometimes hybrid MCMC solutions. (The day ended even more nicely with another long run in the fields of Ashton Court and a conference dinner by the river…)

Filed under: pictures, Statistics, Travel, Uncategorized, University life, Wines Tagged: Bristol, England, Hamiltonian Monte Carlo, image classification, MALA, SuSTain, variational Bayes methods

### avernian landscapes (#5)

Filed under: Mountains, pictures, Running, Statistics, Travel Tagged: Auvergne, France, Mont-Dore, Puy de Sancy, Puy-de-Dôme, vacations, volcanoes

### high-dimensional stochastic simulation and optimisation in image processing [day #1]

**E**ven though I flew through Birmingham (and had to endure the fundamental randomness of trains in Britain), I managed to reach the “High-dimensional Stochastic Simulation and Optimisation in Image Processing” conference location (in Goldney Hall Orangery) in due time to attend the (second) talk by Christophe Andrieu. He started with an explanation of the notion of *controlled Markov chain*, which reminded me of our early and famous-if-unpublished paper on controlled MCMC. (The label “controlled” was inspired by Peter Green who pointed out to us the different meanings of *controlled* in French [meaning checked or monitored] and in English. We use it here in the English sense, obviously.) The main focus of the talk was on the stability of controlled Markov chains. With of course connections with our controlled MCMC of old, for instance the case of the coerced acceptance probability. Which happened to be not that stable! With the central tool being Lyapounov functions. (Making me wonder whether or not it would make sense to envision the meta-problem of adaptively estimating the adequate Lyapounov function from the MCMC outcome.)
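One common reading of a coerced acceptance probability is a random-walk Metropolis sampler whose proposal scale is steered, through a stochastic approximation recursion, towards a prescribed acceptance rate. The Python sketch below follows that reading and is not taken from the unpublished paper or from the talk: the N(0,1) target, the 0.44 goal, the decreasing gain `t ** (-0.6)` and the function name are all assumptions.

```python
import math
import random


def coerced_acceptance_rwm(n_iter, target_rate=0.44, seed=0):
    """Random-walk Metropolis with a scale adapted towards target_rate."""
    rng = random.Random(seed)

    def log_target(x):
        # Hypothetical standard normal target, up to an additive constant.
        return -0.5 * x * x

    x = 0.0
    log_scale = 0.0
    rates = []
    for t in range(1, n_iter + 1):
        y = x + math.exp(log_scale) * rng.gauss(0.0, 1.0)
        accept_prob = math.exp(min(0.0, log_target(y) - log_target(x)))
        if rng.random() < accept_prob:
            x = y
        # Robbins-Monro step: enlarge the scale when accepting too often,
        # shrink it when accepting too rarely, with a vanishing gain.
        gain = t ** (-0.6)
        log_scale += gain * (accept_prob - target_rate)
        rates.append(accept_prob)
    return math.exp(log_scale), rates
```

The vanishing gain is what keeps this toy version well behaved; with a constant gain or a heavier-tailed target the recursion can wander, which is one way to see why stability is the central question.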

**A**s I had difficulties following the details of the convex optimisation talks in the afternoon, I eloped to work on my own and returned to the posters & wine session, where the small number of posters allowed for the proper amount of interaction with the speakers! Talking about the relevance of variational Bayes approximations and of possible tools to assess it, about the use of new metrics for MALA and of possible extensions to Hamiltonian Monte Carlo, about Bayesian modellings of fMRI and of possible applications of ABC in this framework. (No memorable wine to make the ‘Og!) Then a quick if reasonably hot curry and it was already bed-time after a rather long and well-filled day!

Filed under: pictures, Statistics, Travel, Uncategorized, University life, Wines Tagged: Bristol, control, controlled MCMC, England, faux-ami, Hamiltonian Monte Carlo, image classification, Lyapounov function, MALA, Markov chains, SuSTain, variational Bayes methods

### avernian landscapes (#4)

Filed under: Mountains, pictures, Running, Travel Tagged: Auvergne, Besse-en-Chandesse, countryside, France, Puy de Sancy, Puy-de-Dôme, vacations

### capture-recapture homeless deaths

**I**n the newspaper I grabbed in the corridor to my plane today (flying to Bristol to attend the SuSTaIn image processing workshop on “High-dimensional Stochastic Simulation and Optimisation in Image Processing” where I was kindly invited and most readily accepted the invitation), I found a two-page entry on estimating the number of homeless deaths using capture-recapture. Besides the sheer concern about the very high mortality rate among homeless persons (life expectancy, 48 years; around 7000 deaths in France between 2008 and 2010) and the dreadful realisation that there is an increasing number of kids dying in the streets, I was obviously interested in this use of capture-recapture methods as I had briefly interacted with researchers from INED working on estimating the number of (living) homeless persons about 15 years ago. Glancing at the original paper once I had landed, I found alas no methodological innovation in the approach, which was based on the simplest maximum likelihood estimate. I wonder whether or not more advanced models and [Bayesian] methods of inference could [or should] be used on such data. Like introducing covariates in the process. For instance, when conditioning the probability of (cross-)detection on the cause of death.
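For reference, the simplest capture-recapture estimate with two lists is the Lincoln-Petersen ratio, sketched below in Python together with Chapman's small-sample variant. Whether the original study used exactly this two-source form is an assumption here, and the counts in the usage note are purely hypothetical, not figures from the paper.

```python
def lincoln_petersen(n_first, n_second, n_both):
    """Two-list estimate of the population size: n_first * n_second / n_both.

    n_first and n_second are the numbers of cases recorded by each source
    and n_both is the number of cases found in both.
    """
    if n_both <= 0:
        raise ValueError("the estimate requires at least one case in both lists")
    return n_first * n_second / n_both


def chapman(n_first, n_second, n_both):
    """Chapman's variant, which remains defined when n_both is zero."""
    return (n_first + 1) * (n_second + 1) / (n_both + 1) - 1
```

With hypothetical counts of 200 and 150 cases in the two sources and 30 cases in common, `lincoln_petersen(200, 150, 30)` gives 1000.0. Both estimates assume every case has the same chance of being recorded by a given source, which is precisely where covariates such as the cause of death would enter.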

Filed under: Statistics, Travel, University life Tagged: Bristol, capture-recapture, covariate, death rate, generalised linear models, homeless, image processing, INED

### avernian landscapes (#3)

Filed under: Mountains, pictures, Running, Travel Tagged: Auvergne, Besse-en-Chandesse, countryside, France, Puy de Sancy, Puy-de-Dôme, vacations

### dans le noir

**Y**esterday night, we went to a very special restaurant in downtown Paris, called “dans le noir” where meals take place in complete darkness (truly “dans le noir”!). Complete in the sense it is impossible to see one’s hand and one’s glass. The waiters are blind and the experiment turns them into our guides, as we are unable to progress or eat in the dark! In addition to this highly informative experiment, it was fun to guess the food (easy!) and even more to fail miserably at guessing the colour of the wine (a white Minervois made from Syrah that tasted very much like a red, either from Languedoc-Roussillon or from Bordeaux…!) The food was fine if not outstanding (the owner told us how cooking too refined a meal led to terrible feedback from the customers as they could not guess what they were eating) and the wine very good (no picture for the ‘Og, obviously!). This was my daughter’s long-time choice for her 18th birthday dinner and a definitely outstanding idea! So if you have the opportunity to try one of those restaurants (in Barcelona Paseo Picasso, London Clerkenwell, New York, Paris Les Halles, or Saint Petersburg), I strongly suggest you make the move. Eating will never feel the same!

Filed under: Kids, pictures, Travel, Wines Tagged: birthday, blindness, Bordeaux, dans le noir, Languedoc-Roussillon, Les Halles, Minervois, Paris

### avernian landscapes (#2)

Filed under: Mountains, pictures, Running, Travel Tagged: Auvergne, Besse-en-Chandesse, France, Puy de Sancy, Puy-de-Dôme, sunrise, vacations

### understanding the Hastings algorithm

**D**avid Minh and Paul Minh [who wrote the 2001 book Applied Probability Models] have recently arXived a paper on “understanding the Hastings algorithm”. They revert to the form of the acceptance probability suggested by Hastings (1970):

$$\rho(x,y) = s(x,y)\left(1+\dfrac{\pi(x)\,q(y|x)}{\pi(y)\,q(x|y)}\right)^{-1}$$

where π is the target, s(x,y) is a symmetric function keeping the above between 0 and 1, and q is the proposal. This obviously includes the standard Metropolis-Hastings form of the ratio, as well as Barker’s (1965):

$$\rho(x,y) = \left(1+\dfrac{\pi(x)\,q(y|x)}{\pi(y)\,q(x|y)}\right)^{-1}$$

which is known to be less efficient by accepting less often (see, e.g., Antonietta Mira’s PhD thesis). The authors also consider the alternative

$$\rho(x,y) = \min\left\{\dfrac{\pi(y)}{q(y|x)},1\right\}\,\min\left\{\dfrac{q(x|y)}{\pi(x)},1\right\}$$

which I had not seen earlier. It is a rather intriguing quantity in that it can be interpreted as (a) a simulation of y from the cutoff target corrected by reweighting the previous x into a simulation from q(x|y); (b) a sequence of two acceptance-rejection steps, each concerned with a correspondence between target and proposal for x or y. There is an obvious caveat in this representation when the target is unnormalised since the ratio may then be arbitrarily small… Yet another alternative could be proposed in this framework, namely the delayed acceptance probability of our paper with Marco and Clara, one special case being

$$\rho(x,y) = \min\left\{\dfrac{\pi_1(y)\,q(x|y)}{\pi_1(x)\,q(y|x)},1\right\}\,\min\left\{\dfrac{\pi_2(y)}{\pi_2(x)},1\right\}$$

where

$$\pi(x)\propto\pi_1(x)\,\pi_2(x)$$

is an arbitrary decomposition of the target. An interesting remark in the paper is that any Hastings representation can alternatively be written as

$$\rho(x,y) = \min\left\{\dfrac{\pi(y)}{k(x,y)\,q(y|x)},1\right\}\,\min\left\{\dfrac{k(x,y)\,q(x|y)}{\pi(x)},1\right\}$$

where k(x,y) is a (positive) symmetric function. Hence every single Metropolis-Hastings is also a delayed acceptance in the sense that it can be interpreted as a two-stage decision.
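The acceptance probabilities discussed above can be compared numerically on a small discrete example. The Python sketch below encodes the standard Metropolis-Hastings form, Barker's form, and a product-of-two-minima form (my reading of the alternative two-stage probability, which is an assumption), then checks detailed balance for each. The four-state target `PI`, the proposal matrix `Q` (with `Q[x][y]` standing for q(y|x)) and the function names are hypothetical.

```python
def mh(pi, q, x, y):
    """Standard Metropolis-Hastings acceptance probability."""
    return min(1.0, pi[y] * q[y][x] / (pi[x] * q[x][y]))


def barker(pi, q, x, y):
    """Barker's acceptance probability."""
    num = pi[y] * q[y][x]
    return num / (pi[x] * q[x][y] + num)


def two_stage(pi, q, x, y):
    """Product of two accept-reject type probabilities (assumed form)."""
    return min(pi[y] / q[x][y], 1.0) * min(q[y][x] / pi[x], 1.0)


def satisfies_detailed_balance(alpha, pi, q, tol=1e-12):
    """Check pi(x) q(y|x) alpha(x,y) = pi(y) q(x|y) alpha(y,x) for x != y."""
    n = len(pi)
    return all(
        abs(pi[x] * q[x][y] * alpha(pi, q, x, y)
            - pi[y] * q[y][x] * alpha(pi, q, y, x)) < tol
        for x in range(n) for y in range(n) if x != y
    )


# Hypothetical normalised target and row-stochastic proposal matrix.
PI = [0.1, 0.2, 0.3, 0.4]
Q = [
    [0.25, 0.25, 0.25, 0.25],
    [0.10, 0.20, 0.30, 0.40],
    [0.40, 0.30, 0.20, 0.10],
    [0.70, 0.10, 0.10, 0.10],
]


def barker_never_exceeds_mh(pi, q):
    """Barker's probability r / (1 + r) is at most min(1, r)."""
    n = len(pi)
    return all(
        barker(pi, q, x, y) <= mh(pi, q, x, y) + 1e-15
        for x in range(n) for y in range(n) if x != y
    )
```

All three rules pass `satisfies_detailed_balance(..., PI, Q)`, while `barker_never_exceeds_mh(PI, Q)` reflects the remark that Barker's version accepts less often. Rescaling `PI` by a constant changes the two-stage probabilities but not the other two, which is the caveat about unnormalised targets.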

**T**he second part of the paper considers an extension of the accept-reject algorithm where a value y proposed from a density q(y) is accepted with probability

and else the current x is repeated, where M is an arbitrary constant (incl. of course the case where it is a proper constant for the original accept-reject algorithm). Curiouser and curiouser, as Alice would say! While I think I have read some similar proposal in the past, I am a wee bit intrigued at the appeal of using only the proposed quantity y to decide about acceptance, since it does not provide the benefit of avoiding generations that are rejected. In this sense, it appears as the opposite of our vanilla Rao-Blackwellisation. (The paper however considers the symmetric version called the independent Markovian minorizing algorithm that only depends on the current x.) In the extension to proposals that depend on the current value x, the authors establish that this Markovian AR is in fine equivalent to the generic Hastings algorithm, hence providing an interpretation of the “mysterious” s(x,y) through a local maximising “constant” M(x,y). A possibly missing section in the paper is the comparison of the alternatives, albeit the authors mention Peskun’s (1973) result that exhibits the Metropolis-Hastings form as *the* optimum.

Filed under: Books, Statistics Tagged: accept-reject algorithm, acceptance probability, Barker's algorithm, delayed acceptance, Metropolis-Hastings algorithms, vanilla Rao-Blackwellisation