## Bayesian News Feeds

### AppliBUGS day celebrating Jean-Louis Foulley

**I**n case you are in Paris tomorrow and free, there will be an AppliBUGS day focussing on the contributions of our friend Jean-Louis Foulley (also a regular contributor to the ‘Og!). The meeting takes place in the amphitheatre on the second floor of ENGREF-Montparnasse (19 av du Maine, 75015 Paris, Métro Montparnasse Bienvenüe). I will give a part of the O’Bayes tutorial on alternatives to the Bayes factor.

Filed under: pictures, Statistics, University life Tagged: AppliBUGS, Bayes factor, EnGREF, Jean-Louis Foulley, Montparnasse-Bienvenüe, O'Bayes, Paris

### Edinburgh snapshot (#6)

### posterior likelihood ratio is back

*“The PLR turns out to be a natural Bayesian measure of evidence of the studied hypotheses.”*

**I**sabelle Smith and André Ferrari just arXived a paper on the posterior distribution of the likelihood ratio. This is in line with Murray Aitkin’s notion of considering the likelihood ratio as a *posterior* quantity, when contemplating the null hypothesis that θ is equal to θ0. (A concept also advanced by Alan Birnbaum and Arthur Dempster, and one we criticised, rather strongly, in our Statistics and Risk Modelling paper with Andrew Gelman and Judith Rousseau.) The arguments found in the current paper in defence of the posterior likelihood ratio are quite similar to Aitkin’s:

- defined for (some) improper priors;
- invariant under observation or parameter transforms;
- more informative than the posterior mean of the likelihood ratio, not-so-incidentally equal to the Bayes factor;
- avoiding using the posterior mean for an asymmetric posterior distribution;
- achieving some degree of reconciliation between Bayesian and frequentist perspectives, e.g. by being equal to some p-values;
- easily computed by MCMC means (if need be).
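The last point is easy to illustrate: given simulations from the posterior, draws from the posterior distribution of the likelihood ratio follow at once. A minimal sketch on a toy normal-mean problem with a flat prior (my own illustration, not the setting of the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting (an illustration, not the paper's framework): x_i ~ N(theta, 1)
# with a flat prior on theta, testing theta = theta0; posterior is N(xbar, 1/n)
theta0 = 0.0
x = rng.normal(0.5, 1.0, size=20)
n, xbar = len(x), x.mean()

def loglik(theta):
    return -0.5 * np.sum((x - theta) ** 2)

# posterior draws, hence draws from the posterior distribution of the LR
thetas = rng.normal(xbar, 1.0 / np.sqrt(n), size=10_000)
log_lr = np.array([loglik(theta0) - loglik(t) for t in thetas])

# posterior probability that the likelihood ratio favours the null
plr = (log_lr > 0).mean()
```

The whole distribution of `log_lr` is available, not only its mean, which is where the claim of being "more informative" than the Bayes factor comes from.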

One generalisation found in the paper handles the case of *composite versus composite* hypotheses, which brings back an earlier criticism I raised (in Edinburgh, at ICMS, where, as one of those coincidences, I read this paper!), namely that using the product of the marginals rather than the joint posterior is no more a standard Bayesian practice than using the data in a prior quantity. And leads to multiple uses of the data. Hence, having already delivered my perspective on this approach in the past, I do not feel the urge to “raise the flag” once again about a paper that is otherwise well-documented and mathematically rich.

Filed under: Statistics, University life Tagged: Alan Birnbaum, Arthur Dempster, Bayesian hypothesis testing, Bayesian p-values, composite hypotheses, Edinburgh, ICMS, invariance, Murray Aitkin, posterior likelihood ratio

### Edinburgh snapshot (#5)

Filed under: pictures, Running, Travel Tagged: Edinburgh, ICMS, Royal Mile, Scotland, St. Giles' cathedral

### lazy ABC

*“A more automated approach would be useful for lazy versions of ABC SMC algorithms.”*

**D**ennis Prangle just arXived the work on lazy ABC he had presented in Oxford at the i-like workshop a few weeks ago. The idea behind the paper is to cut down massively on the generation of pseudo-samples that are “too far” from the observed sample. This is formalised through a stopping rule that sets the estimated likelihood to zero with probability 1-α(θ,x) and otherwise divides the original ABC estimate by α(θ,x). Which makes the modification unbiased when compared with basic ABC. The efficiency appears when α(θ,x) can be computed much faster than producing the entire pseudo-sample and its distance to the observed sample. When considering an approximation to the asymptotic variance of this modification, Dennis derives an optimal, if formal, version of the acceptance probability α(θ,x) (in the sense of the effective sample size), conditional on the choice of a “decision statistic” φ(θ,x). And of an importance function g(θ). (I do not get his Remark 1 about the case when π(θ)/g(θ) only depends on φ(θ,x), since the latter also depends on x. Unless one considers a multivariate φ which contains π(θ)/g(θ) itself as a component.) This approach requires estimating a conditional expectation as a function of φ: I would have thought (non-parametric) logistic regression a good candidate for this estimation, but Dennis is rather critical of this solution.
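The core of the proposal is a simple random stopping rule. The sketch below (with hypothetical function names, not Dennis’ implementation) shows why reweighting by 1/α keeps the estimate unbiased for the standard ABC estimate:

```python
import random

def lazy_abc_weight(theta, alpha, cheap_stat, full_distance, eps):
    """One lazy ABC likelihood estimate (a sketch of the idea, not the
    paper's exact scheme).

    cheap_stat(theta)    -> inexpensive decision statistic phi(theta, x)
    alpha(phi)           -> continuation probability in (0, 1]
    full_distance(theta) -> distance of a full pseudo-sample to the data
    """
    phi = cheap_stat(theta)
    a = alpha(phi)
    if random.random() > a:       # stop early with probability 1 - alpha
        return 0.0
    # otherwise run the expensive simulation and reweight by 1/alpha:
    # E[estimate] = a * (1/a) * 1{distance <= eps} = standard ABC estimate
    return (full_distance(theta) <= eps) / a
```

Taking α≡1 recovers standard ABC; the savings come from choosing α small whenever the cheap statistic φ signals a hopeless pseudo-sample.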

**I** added the quote above as I find it somewhat ironical: at this stage, to enjoy laziness, the algorithm has first to go through a massive calibration stage, from the selection of the subsample [to be simulated before computing the acceptance probability α(θ,x)] to the construction of the (somewhat mysterious) decision statistic φ(θ,x) to the estimation of the terms composing the optimal α(θ,x). The most natural choice of φ(θ,x) seems to involve subsampling, still with a wide range of possibilities and ensuing efficiencies. (The choice found in the application is somewhat anticlimactic in this respect.) In most ABC applications, I would suggest using a quick & dirty approximation of the distribution of the summary statistic.

**A** slight point of perplexity about this “lazy” proposal, namely the static role of ε, which is impractical because the tolerance is not set in stone… As discussed several times here, it is a function of many factors, incl. all the calibration parameters of the lazy ABC, rather than an absolute quantity. The paper is rather terse on this issue (see Section 4.2.2). It seems to me that playing with a large collection of tolerances may be too costly in this setting.

Filed under: Books, Statistics, University life Tagged: ABC, ABC-SMC, accelerated ABC, calibration, i-like, lazy ABC, logistic regression, University of Oxford

### Edinburgh snapshot (#4)

Filed under: pictures, Travel, University life Tagged: Edinburgh, pub, Scotland, The Guildford Arms, Waverley Station

### Le Monde puzzle [#869]

**A**n uninteresting Le Monde mathematical puzzle:

*Solve the system of equations*

*a+b+c=16, b+c+d=12, d+c+e=16, e+c+f=18, g+c+a=15*

*for 7 different integers 1≤a,…,g≤9.*

**I**ndeed, the final four equations determine *d=a-4, e=b+4, f=a-2, g=b-1* as functions of *a* and *b*, while forcing *5≤a*, *2≤b≤5*, and *7≤a+b≤15*. Hence, 5 possible values for *a* and 4 for *b*, which makes 20 possible solutions for the system. However, the fact that *a,b,c,d,e,f,g* are all different reduces the possibilities considerably. For instance, *b* must be less than *a-4*. The elimination of impossible cases leads in the end to consider *b=a-5* and *b=a-7*, and eventually to *a=8, b=3*… Not so uninteresting then. A variant of Sudoku, with open questions like: what is the collection of the possible values of the five sums, i.e. of the values with one and only one existing solution? Are there cases where four equations only suffice to determine *a,b,c,d,e,f,g*?
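A brute-force check (a quick Python sketch) confirms that the solution is indeed unique:

```python
from itertools import permutations

# Enumerate all 7-tuples of distinct integers in 1..9 and keep those
# satisfying the five sums of the puzzle
solutions = [
    (a, b, c, d, e, f, g)
    for a, b, c, d, e, f, g in permutations(range(1, 10), 7)
    if a + b + c == 16 and b + c + d == 12 and d + c + e == 16
    and e + c + f == 18 and g + c + a == 15
]
```

This returns the single tuple (8, 3, 5, 4, 7, 6, 2), in agreement with *a=8* and *b=3* above.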

**A**part from this integer programming exercise, a few items of relevance in this Le Monde Science & Medicine leaflet. A description of the day of a social sciences worker in front of a computer, in connection with a sociology (or sociometry) blog and a conference on Big Data in sociology at Collège de France. A tribune by the physicist Marco on data sharing (and not-sharing), illustrated by a dark matter experiment called Cogent. And then a long interview of Matthieu Ricard, who argues about the “scientifically proven impact of meditation”, a sad illustration of the ease with which religions permeate the scientific debate [or at least the science section of Le Monde] and mingle scientific terms with religious concepts (e.g., the fusion term of “contemplative sciences”). *[As another "of those coincidences", on the same day I read this leaflet, Matthieu Ricard was the topic of one question on a radio quiz.]*

Filed under: Books, Kids, Statistics, University life Tagged: big data, Cogent, Collège de France, dark matter, Le Monde, mathematical puzzle, Matthieu Ricard, neurosciences, religions, social sciences

### June 7, 1944

*[I wrote this post a few years ago, but the 70th anniversary of the D-day brought back those memories and I thought it worth re-posting...]*

**T**his is the day I almost got un-born, not that I was born at the time (!) but my mother, then almost seven, came close to dying under the Allied bombs that obliterated Saint-Lô (Manche, western France) off the map that night, in conjunction with the D Day landings on the nearby Utah and Omaha beaches. (The city was supposed to be taken by the end of June 6, but it was only on July 19 that Allied troops entered Saint-Lô.) Most of the town got destroyed under 60,000 pounds of bombs in an attempt by the Allied forces to cut access to the beaches from German reinforcements coming from Brittany. (Saint-Lô earned the nickname of “capital of the ruins” from Samuel Beckett after this bombing and it took many years to rebuild.) My grandparents and their three daughters barely got out of their house before it collapsed and had to flee the blazing Saint-Lô with a single wheelbarrow to carry two suitcases and the three girls. Several times did my grandfather hide them under his leather jacket, as power lines were collapsing around them…

**T**hey eventually (and obviously) made it alive out of Saint-Lô, only to be rounded up with other refugees by German troops who parked them in a field, most likely to be used as hostages. Taking advantage of the night, my grandfather managed once again to get his family away by crawling under the barriers on the darkest side of the field, and they then reached (by foot) a most secluded village in the countryside where my great-grandmother was living at the time. Since I was a child, I have heard this story so many times from my mother that it is almost pictured in my brain, as if I had somehow seen the “movie”.

Filed under: Kids Tagged: Allied troops, bombing, capital of the ruins, D Day, Saint-Lô, WW II

### computational methods for statistical mechanics [day #4]

**M**y last day at this ICMS workshop on molecular simulation started [with a double loop of Arthur's Seat, thankfully avoiding the heavy rains of the previous night, and then] with Chris Chipot’s magistral entry to molecular simulation for proteins, with impressive slides and simulation movies, even though I could not follow the details enough to really understand the simulation challenges therein, just catching a few connections with earlier talks. A typical example of a cross-disciplinary gap, where the other discipline always seems to be stressing the “wrong” aspects. Although this is perfectly unrealistic, it would help immensely to prepare talks in pairs for such interdisciplinary workshops! Then Gersende Fort presented results about convergence and efficiency for the Wang-Landau algorithm. The idea is to find the optimal rate for updating the weights of the elements of the partition towards reaching the flat histogram in minimal time. Showing massive gains on toy examples. The next talk went back to molecular biology with Jérôme Hénin’s presentation on improved adaptive biased sampling. With an exciting notion of orthogonality aiming at finding the slowest directions in the target and concentrating the computational effort there. He also discussed the tension between long single simulations and short repeated ones, echoing a long-running debate in the MCMC community. (He also had a slide with a picture of my first computer, a 1983 Apple IIe!) Then Antonietta Mira gave a broad perspective on delayed rejection and zero variance estimates. With impressive variance reductions (although some physicists then asked for reductions of order 10¹⁰!). Johannes Zimmer gave a beautiful maths talk on the connection between particle and diffusion limits (PDEs) and Wasserstein geometry and large deviations. (I did not get most of the talk, but it was nonetheless beautiful!) Bert Kappen concluded the day (and the workshop for me) with a nice introduction to control theory, making the connection between optimal control and optimal importance sampling. Which made me idly think of the following problem: what if control cannot be completely… controlled and hence involves a stochastic part? Presumably of little interest, as the control would then be on the parameters of the distribution of the control.
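For readers unfamiliar with Wang-Landau, the flat-histogram mechanics can be sketched in a few lines. The code below is a minimal illustration on a finite state space (my own toy version, not the algorithm analysed in Gersende Fort’s talk): the chain targets π(x)/θ_{I(x)}, so already visited bins are penalised, and the update rate γ is reduced each time the histogram of visits looks flat.

```python
import math
import random

def wang_landau(logpi, states, partition, n_bins, n_iter=20_000, seed=1):
    """Minimal Wang-Landau sketch on a finite state space (an illustration
    of the flat-histogram idea only)."""
    rng = random.Random(seed)
    log_theta = [0.0] * n_bins      # log weights of the partition elements
    counts = [0] * n_bins           # visits since the last flatness check
    gamma = 1.0                     # current weight-update rate
    x = rng.choice(states)
    for _ in range(n_iter):
        y = rng.choice(states)      # independent uniform proposal, for simplicity
        # Metropolis ratio for the biased target pi(.)/theta_{I(.)}
        log_r = (logpi(y) - log_theta[partition(y)]) - (logpi(x) - log_theta[partition(x)])
        if rng.random() < math.exp(min(0.0, log_r)):
            x = y
        b = partition(x)
        log_theta[b] += gamma       # penalise the bin currently occupied
        counts[b] += 1
        if min(counts) > 0 and min(counts) / max(counts) > 0.8:
            gamma /= 2.0            # flat histogram: slow the updates down
            counts = [0] * n_bins
    return log_theta
```

The question addressed in the talk is precisely how fast γ should decrease: too fast and the weights freeze before the histogram is flat, too slow and the chain wastes time.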

*“The alanine dipeptide is the fruit fly of molecular simulation.”*

**T**he example of this alanine dipeptide molecule was so recurrent during the talks that it justified the above quote by Michael Allen. Not that I got any closer to understanding the point of studying this protein or of using it as a benchmark. Or to identifying the specifics of the challenges of molecular dynamics simulation. Not a criticism of the ICMS workshop obviously, but rather of my congenital difficulty with continuous-time processes!!! So I do not return from Edinburgh with a new collaborative research project in molecular dynamics (if with more traditional prospects), albeit with the perception that a minimal effort could bring me through the vocabulary barrier. And maybe consider ABC ventures in those (new) domains. (Although I fear my talk on ABC did not impact most of the audience!)

Filed under: Mountains, pictures, Running, Statistics, Travel, University life Tagged: ABC, Arthur's Seat, computational physics, control theory, control variate, delayed rejection sampling, Edinburgh, Highlands, ICMS, Langevin diffusion, large deviation, MCMC, molecular simulation, Monte Carlo Statistical Methods, Scotland, Wasserstein distance, zero variance importance sampling

### Edinburgh snapshot (#3)

### computational methods for statistical mechanics [day #3]

**T**he third day [morn] at our ICMS workshop was dedicated to path sampling. And rare events. Much more into [my kind of] Monte Carlo territory. The first talk by Rosalind Allen looked at reweighting trajectories that are not in equilibrium or are missing the Boltzmann [normalizing] constant. Although the derivation against a calibration parameter looked like the primary goal rather than the tool for constant estimation. Again papers in *J. Chem. Phys.*! And a potential link with ABC raised by Antonietta Mira… Then Jonathan Weare discussed stratification. With a nice trick of expressing the normalising constants of the different terms in the partition as the solution of a Markov system involving a stochastic matrix **M**, because **M** is easier (?) to approximate. Valleau’s and Torrie’s umbrella sampling was a constant reference in this morning of talks. Arnaud Guyader’s talk was in the continuation of Tony Lelièvre’s introduction, which helped a lot in my better understanding of the concepts. Rephrasing things in more statistical terms. Like the distinction between equilibrium and paths. Or bias being importance sampling. Frédéric Cérou actually gave a sort of second part to Arnaud’s talk, using importance splitting algorithms. Presenting an algorithm for simulating rare events that sounded like an opposite nested sampling, where the goal is to get *down* the target, rather than *up*. Pushing particles away from a current level of the target function with probability ½. Michela Ottobre completed the series with an entry into diffusion limits in the Roberts-Gelman-Gilks spirit when the Markov chain is not yet stationary, i.e., in the transient phase.
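The splitting idea can be illustrated on the toy problem of estimating a Gaussian tail probability: particles that survive a level are multiplied and rejuvenated by a Metropolis kernel restricted above that level. A rough sketch with fixed levels (my own illustration, unlike the adaptive versions discussed in the talks):

```python
import math
import random

def rare_event_splitting(levels, n=2000, mcmc_steps=10, seed=42):
    """Fixed-level importance splitting sketch for P(X > levels[-1]) with
    X ~ N(0,1) (an illustration of the splitting idea only)."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    prob = 1.0
    for level in levels:
        alive = [x for x in xs if x > level]
        prob *= len(alive) / n          # conditional survival fraction
        if not alive:
            return 0.0
        # resample the survivors, then rejuvenate them with a Metropolis
        # kernel for N(0,1) restricted to (level, +infinity)
        xs = [rng.choice(alive) for _ in range(n)]
        moved = []
        for x in xs:
            for _ in range(mcmc_steps):
                y = x + rng.gauss(0.0, 0.5)
                if y > level and rng.random() < math.exp(min(0.0, (x * x - y * y) / 2)):
                    x = y
            moved.append(x)
        xs = moved
    return prob
```

With levels 1, 2, 3, the product of the (moderate) conditional survival fractions estimates the very small probability P(X > 3) ≈ 0.00135 far more reliably than naive Monte Carlo with the same budget.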

Filed under: Mountains, pictures, Running, Statistics, Travel, University life Tagged: ABC, Arthur's Seat, computational physics, Edinburgh, extreme value theory, Highlands, ICMS, MCMC, molecular simulation, Monte Carlo Statistical Methods, NIPS 2014, path sampling, rare events, Scotland, stratification

### Edinburgh snapshot (#2)

### computational methods for statistical mechanics [day #2]

**T**he last “tutorial” talk at this ICMS workshop ["at the interface between mathematical statistics and molecular simulation"] was given by Tony Lelièvre on adaptive bias schemes in Langevin algorithms and on the parallel replica algorithm. This was both very interesting, because of the potential for connections with my “brand” of MCMC techniques, and rather frustrating, as I felt the intuition behind the physical concepts like free energy and metastability was almost within my reach! The most manageable part of Tony’s talk was the illustration of the concepts through a mixture posterior example, an example that I need to (re)read further to grasp the general idea. (And maybe the book on Free Energy Computations Tony wrote with Mathias Rousset and Gabriel Stoltz.) A definitely worthwhile talk that I hope will get posted online by ICMS. The other talks of the day were mostly of a free energy nature, some using optimised bias in the Langevin diffusion (except for Pierre Jacob, who presented his non-negative unbiased estimation impossibility result).

Filed under: Mountains, pictures, Running, Statistics, Travel, University life Tagged: ABC, Arthur's Seat, computational physics, Edinburgh, free energy, ICMS, MCMC, molecular simulation, Monte Carlo Statistical Methods, NIPS 2014, Scotland, unbiasedness
