## Bayesian News Feeds

### a day of mourning

*“Religion, a mediaeval form of unreason, when combined with modern weaponry becomes a real threat to our freedoms. This religious totalitarianism has caused a deadly mutation in the heart of Islam and we see the tragic consequences in Paris today. I stand with Charlie Hebdo, as we all must, to defend the art of satire, which has always been a force for liberty and against tyranny, dishonesty and stupidity. ‘Respect for religion’ has become a code phrase meaning ‘fear of religion.’ Religions, like all other ideas, deserve criticism, satire, and, yes, our fearless disrespect.”* Salman Rushdie

Filed under: Books Tagged: Charlie Hebdo, Charpentier, Je suis Charlie, leçons de ténèbres, Salman Rushdie

### how many modes in a normal mixture?

**A**n interesting question I spotted on Cross Validated today: How to tell if a mixture of Gaussians will be multimodal? Indeed, there is no known analytical condition on the parameters of a fully specified k-component mixture for the modes to number k or fewer than k… Googling around, I immediately came upon this webpage by Miguel Carreira-Perpiñán, who studied the issue with Chris Williams while writing his PhD in Edinburgh. And upon this paper, which not only shows that

- unidimensional Gaussian mixtures with k components have at most k modes;
- unidimensional non-Gaussian mixtures with k components may have more than k modes;
- multidimensional mixtures with k components may have more than k modes.

but also provides ways of finding all the modes. Ways which seem to reduce to using EM from a wide variety of starting points (an EM algorithm set in the sampling rather than in the parameter space since all parameters are set!). Maybe starting EM from each mean would be sufficient. I still wonder if there are better ways, from letting the variances decrease down to zero until a local mode appears, to using some sort of simulated annealing…

**Edit:** Following comments, let me stress this is not a statistical issue in that the parameters of the mixture are set and known and there are no observations from this mixture from which to estimate the number of modes. The mathematical problem is to determine how many local maxima there are for the mixture density f(x) = Σ_{i=1}^k ω_i φ(x; μ_i, σ_i²).
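To make the EM-from-each-mean suggestion concrete, here is a small numerical sketch (mine, not from the post): the standard fixed-point iteration for Gaussian-mixture modes, started from every component mean, with near-duplicate fixed points merged. Function names and tolerances are purely illustrative.

```python
import numpy as np

def mixture_modes(weights, means, sds, tol=1e-10, n_iter=10_000):
    """Count the modes of a fully specified 1-D Gaussian mixture by
    running the fixed-point (mean-shift-type) iteration from every mean."""
    w, mu, s2 = map(np.asarray, (weights, means, np.square(sds)))
    modes = []
    for x in mu:                      # one start per component mean
        for _ in range(n_iter):
            # component densities at x (up to a common constant)
            dens = w * np.exp(-0.5 * (x - mu) ** 2 / s2) / np.sqrt(s2)
            r = dens / s2             # responsibilities over variances
            x_new = np.sum(r * mu) / np.sum(r)
            if abs(x_new - x) < tol:
                break
            x = x_new
        # keep the fixed point only if it is not a duplicate
        if not any(abs(x - m) < 1e-6 for m in modes):
            modes.append(float(x))
    return sorted(modes)

# well-separated means give two modes; merged means give a single mode
print(len(mixture_modes([0.5, 0.5], [-3.0, 3.0], [1.0, 1.0])))  # 2
print(len(mixture_modes([0.5, 0.5], [-0.5, 0.5], [1.0, 1.0])))  # 1
```

Each iteration increases the mixture density, so every start converges to a local maximum; whether starting from the means alone always finds *all* modes is exactly the open question discussed above.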

Filed under: Books, Kids, Statistics, University life Tagged: Chris Williams, EM algorithm, Miguel Carreira-Perpiñán, mixture estimation, modes of a mixture, Scotland, University of Edinburgh

### ABC by population annealing

**T**he paper “Bayesian Parameter Inference and Model Selection by Population Annealing in Systems Biology” by Yohei Murakami got published in PLoS One last August but I only became aware of it when ResearchGate pointed it out to me [by mentioning one of our ABC papers was quoted there].

*“We are recommended to try a number of annealing schedules to check the influence of the schedules on the simulated data (…) As a whole, the simulations with the posterior parameter ensemble could, not only reproduce the data used for parameter inference, but also capture and predict the data which was not used for parameter inference.”*

Population annealing is a notion introduced by Y Iba, the very same Iba who introduced the notion of population Monte Carlo that we studied in subsequent papers. It reproduces the setting found in many particle filter papers of a sequence of (annealed or rather tempered) targets ranging from an easy (i.e., almost flat) target to the genuine target, and of an update of a particle set by MCMC moves and reweighting. I actually have trouble perceiving the difference from other sequential Monte Carlo schemes such as those exposed in Del Moral, Doucet and Jasra (2006, Series B). And the same is true of the ABC extension covered in this paper. (Where the annealed intermediate targets correspond to larger tolerances.) This sounds like a traditional ABC-SMC algorithm. Without the adaptive scheme on the tolerance ε found e.g. in Del Moral et al., since the sequence is set in advance. [However, the discussion about the implementation includes the above quote that suggests a vague form of cross-validated tolerance construction]. The approximation of the marginal likelihood also sounds standard, the marginal being approximated by the proportion of accepted pseudo-samples. Or more exactly by the sum of the SMC weights at the end of the annealing simulation. This actually raises several questions: (a) this estimator is always between 0 and 1, while the marginal likelihood is not restricted [but this is due to a missing 1/ε in the likelihood estimate that cancels from both numerator and denominator]; (b) seeing the kernel as a non-parametric estimate of the likelihood led me to wonder why different ε could not be used in different models, in that the pseudo-data used for each model under comparison differs. If we were in a genuine non-parametric setting the bandwidth would be derived from the pseudo-data.
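For readers unfamiliar with the scheme, here is a minimal ABC-SMC sketch in the spirit described above, with a tolerance sequence fixed in advance rather than adapted. The toy model (normal data with unknown mean, normal prior), the schedule, and the kernel scale are all invented for illustration; this is a sketch, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(42)

# toy setting: y_i ~ N(theta, 1) with theta unknown, prior theta ~ N(0, 10^2)
n_obs, theta_true, prior_sd = 50, 1.5, 10.0
y = rng.normal(theta_true, 1.0, n_obs)
s_obs = y.mean()                          # summary statistic

def distance(theta):
    """distance between observed and pseudo-data summaries"""
    return abs(rng.normal(theta, 1.0, n_obs).mean() - s_obs)

N = 1000
eps_schedule = [1.0, 0.5, 0.25, 0.1]      # tolerances set in advance

# initial population by plain ABC rejection at the loosest tolerance
theta = np.empty(N)
for i in range(N):
    t = rng.normal(0.0, prior_sd)
    while distance(t) > eps_schedule[0]:
        t = rng.normal(0.0, prior_sd)
    theta[i] = t
w = np.full(N, 1.0 / N)

for eps in eps_schedule[1:]:
    m = np.average(theta, weights=w)
    tau = 2.0 * np.sqrt(np.average((theta - m) ** 2, weights=w))  # move scale
    new_theta, new_w = np.empty(N), np.empty(N)
    for i in range(N):
        # resample a parent, perturb it, repeat until within tolerance
        t = rng.normal(theta[rng.choice(N, p=w)], tau)
        while distance(t) > eps:
            t = rng.normal(theta[rng.choice(N, p=w)], tau)
        new_theta[i] = t
        # importance weight: prior density over mixture proposal density
        prior = np.exp(-0.5 * (t / prior_sd) ** 2) / prior_sd
        prop = np.sum(w * np.exp(-0.5 * ((t - theta) / tau) ** 2)) / tau
        new_w[i] = prior / prop
    theta, w = new_theta, new_w / new_w.sum()

post_mean = np.average(theta, weights=w)
print(post_mean)   # close to the analytic posterior mean, itself close to s_obs
```

Tracking the acceptance proportions along the schedule would give the (unnormalised) marginal likelihood estimate discussed above, with the same missing-1/ε caveat.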

*“Thus, Bayesian model selection by population annealing is valid.”*

The discussion about the use of ABC population annealing somewhat misses the point of using ABC, which is to approximate the genuine posterior distribution, to wit the above quote: that the ABC Bayes factors favour the correct model in the simulation does not say anything about the degree of approximation wrt the original Bayes factor. [The issue of non-consistent Bayes factors does not apply here as there is no summary statistic applied to the few observations in the data.] Further, the magnitude of the variability of the values of this Bayes factor as ε varies, from 1.3 to 9.6, mostly indicates that the numerical value is difficult to trust. (I also fail to explain the huge jump in Monte Carlo variability from 0.09 to 1.17 in Table 1.) That this form of ABC-SMC improves upon the basic ABC rejection approach is clear. However, it needs to build some self-control to avoid arbitrary calibration steps and reduce the instability of the final estimates.

*“The weighting function is set to be large value when the observed data and the simulated data are ‘close’, small value when they are ‘distant’, and constant when they are ‘equal’.”*

The above quote is somewhat surprising as the estimated likelihood f(x_obs | x_obs, θ) is naturally constant when x_obs = x_sim… I also failed to understand how the model intervened in the indicator function used as a default ABC kernel…

Filed under: Statistics, University life Tagged: ABC, ABC model choice, ABC-SMC, evidence, marginal likelihood, sequential Monte Carlo, simulated annealing, tempering, tolerance

### O-Bayes15 [registration & call for papers]

**B**oth registration and call for papers have now been posted on the webpage of the 11th International Workshop on Objective Bayes Methodology, aka O-Bayes 15, that will take place in Valencia next June 1-5. The spectrum of the conference is quite wide, as reflected by the range of speakers. In addition, this conference is dedicated to our friend Susie Bayarri, to celebrate her life and contributions to Bayesian Statistics. And in continuation of the morning jog in the memory of George Casella organised by Laura Ventura in Padova, there will be a morning jog for Susie. So register for the meeting and bring your running shoes!

Filed under: Kids, pictures, Statistics, Travel, University life Tagged: George Casella, O-Bayes 2015, Padova, registration, running, Spain, Susie Bayarri, València

### musical break

**D**uring the Yule break, I listened mostly to two CDs, the 2013 *If You Wait*, by London Grammar, and *The Shape of a Broken Heart*, by Imany. Both were unexpected discoveries, brought to me by family members, and I enjoyed them tremendously!

Filed under: Books, Kids Tagged: album, Imany, London Grammar, songs, video, Yule

### the travelling salesman

**A** few days ago, I was grading my last set of homeworks for the MCMC graduate course I teach to both Dauphine and ENSAE graduate students. A few students had chosen to write a travelling salesman simulated annealing code (Exercise 7.22 in Monte Carlo Statistical Methods) and one of them included this quote

*“And when I saw that, I realized that selling was the greatest career a man could want. ‘Cause what could be more satisfying than to be able to go, at the age of eighty-four, into twenty or thirty different cities, and pick up a phone, and be remembered and loved and helped by so many different people?”*

Arthur Miller, *Death of a Salesman*

which was a first!

Filed under: Statistics Tagged: Arthur Miller, Death of a Salesman, ENSAE, exercises, homework, MCMC, Monte Carlo Statistical Methods, travelling salesman Concorde, Université Paris Dauphine

### 2014 in review

The WordPress.com stats helper monkeys prepared a 2014 annual report for the ‘Og…

.. and among the collected statistics for 2014, what I found most amazing are the three accesses from Greenland and the one access from Afghanistan!

Click here to see the complete report. (Assuming you have nothing better to do on Boxing day…)

Filed under: Statistics Tagged: annual report, blogging, Boxing Day, Happy New Year, Wordpress

### the slow regard of silent things

**A**s mentioned previously, I first bought this book thinking it was the third and final volume in the *Kingkiller’s Chronicles*. Hence I was more than disappointed when Dan warned me that it was instead a side-story about Auri, an important but still secondary character in the story. More than disappointed as I thought Patrick Rothfuss was following the frustrating path of other fantasy authors with unfinished series (like Robert Jordan and George R.R. Martin) to write shorter novels set in their universe and encyclopedias instead of focussing on the real thing! However, when I started reading it, I was so startled by the novelty of the work, the beauty of the language, the alien features of the story or lack thereof, that I forgot about my grudge. I actually finished this short book very early a few mornings past Christmas, after a mild winter storm had awoken me for good. And I look forward to re-reading it soon.

*“Better still, the slow regard of silent things had wafted off the moisture in the air.”*

This is a brilliant piece of art, much more a *poème en prose* than a short story. There is no beginning and no end, no purpose and no rationale to most of Auri’s actions, and no direct connection with the *Kingkiller’s Chronicles* story other than the fact that it takes place in or rather below the University. And even less connection with the plot. So this book may come as a huge disappointment to most readers of the series, as exemplified by the numerous negative comments found on amazon.com and elsewhere. Especially those looking for clues about the incoming (?) volume. Or for explanations of past events… Despite all this, or because of it, I enjoyed the book immensely, in a way completely detached from the pleasure I took in reading *Kingkiller’s Chronicles*. There is genuine poetry in the repetition of words, in the many alliterations, in the saccade of unfinished sentences, in the symmetry of Auri’s world, in the making of soap and in the making of candles, in the naming and unnaming of objects. Poetry and magic, even though it is not necessarily the magic found in the *Kingkiller’s Chronicles*. *The Slow Regard of Silent Things* is simply a unique book, an outlier in fantasy literature, a finely tuned read that shows how much of a wordsmith Rothfuss can be, and a good enough reason to patiently wait for the third volume: *“She could not rush and neither could she be delayed. Some things were simply too important.”*

Filed under: Books, Kids Tagged: Auri, literature, Patrick Rothfuss, poème en prose, poetry, The Name of the Wind, The Slow Regard of Silent Things, The Wise Man's Fear

### foie gras fois trois

**A**s New Year’s Eve celebrations are getting quite near, newspapers once again focus on related issues, from the shortage of truffles, to the size of champagne bubbles, to the prohibition of foie gras. Today, I noticed a headline in *Le Monde* about a “huge increase in French people against force-fed geese and ducks: 3% more than last year are opposed to this practice”. Now, looking at the figures, it is based on a survey of 1,032 adults, out of which 47% were against. From a purely statistical perspective, this is not highly significant, since the standardised difference between this year’s and last year’s proportions is compatible with the null hypothesis N(0,1) distribution.
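As a back-of-the-envelope check (not in the original article), one can compute the two-sample z-statistic for the change in proportions, assuming, and this is an assumption since only this year's sample size is reported, that both yearly surveys involved about 1,032 respondents:

```python
from math import sqrt

def two_prop_z(p1, n1, p2, n2):
    """z-statistic for the difference of two independent proportions."""
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se

# 47% against this year vs. 44% last year, both samples of ~1,032 (assumed)
z = two_prop_z(0.47, 1032, 0.44, 1032)
print(round(z, 2))  # ≈ 1.37, below the 1.96 threshold at the 5% level
```

So even under this favourable equal-sample assumption, the 3% shift is within ordinary sampling variability.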

Filed under: Statistics, Wines Tagged: champagne, foie gras, goose liver, Le Monde, significance test, survey sampling, truffles

### top posts for 2014

Here are the most popular entries for 2014:

- 17 equations that changed the World (#2): 995
- Le Monde puzzle [website]: 992
- “simply start over and build something better”: 991
- accelerating MCMC via parallel predictive prefetching: 990
- Bayesian p-values: 960
- posterior predictive p-values: 849
- Bayesian Data Analysis [BDA3]: 846
- Bayesian programming [book review]: 834
- Feller’s shoes and Rasmus’ socks [well, Karl’s actually…]: 804
- the cartoon introduction to statistics: 803
- Asymptotically Exact, Embarrassingly Parallel MCMC: 730
- Foundations of Statistical Algorithms [book review]: 707
- a brief on naked statistics: 704
- In{s}a(ne)!!: 682
- the demise of the Bayes factor: 660
- Statistical modeling and computation [book review]: 591
- bridging the gap between machine learning and statistics: 587
- new laptop with ubuntu 14.04: 574
- Bayesian Data Analysis [BDA3 – part #2]: 570
- MCMC on zero measure sets: 570
- Solution manual to Bayesian Core on-line: 567
- Nonlinear Time Series just appeared: 555
- Sudoku via simulated annealing: 538
- Solution manual for Introducing Monte Carlo Methods with R: 535
- future of computational statistics: 531

What I appreciate from that list is that (a) book reviews [of stats books] get a large chunk (50%!) of the attention and (b) my favourite topics of Bayesian testing, parallel MCMC and MCMC on zero measure sets made it to the top list. Even the demise of the Bayes factor that was only posted two weeks ago!

Filed under: Books, R, Statistics, University life Tagged: book reviews, Le Monde, simulated annealing, Ubuntu 14.04

### partly virtual meetings

A few weeks ago, I read in the NYT an article about the American Academy of Religion cancelling its 2021 annual meeting as a sabbatical year, for environmental reasons.

*“We could choose to not meet at a huge annual meeting in which we take over a city. Every year, each participant going to the meeting uses a quantum of carbon that is more than considerable. Air travel, staying in hotels, all of this creates a way of living on the Earth that is carbon intensive. It could be otherwise.”*

While I am not in the least interested in the conference or in the topics covered by this society or yet in the benevolent religious activities suggested as a substitute, the notion of cancelling the behemoths that are our national and international academic meetings holds some appeal. I have posted several times on the topic, especially about JSM, and I have no clear and definitive answer to the question. Still, there lies a lack of efficiency on top of the environmental impact that we could and should try to address. As I was thinking of those issues in the past week, I made another of my numerous “carbon footprints” by attending NIPS across the Atlantic for two workshops that ran in parallel with about twenty others. And hence could have taken place in twenty different places. Albeit without the same exciting feeling of constant intellectual simmering. And without the same mix of highly interactive scholars from all over the planet. (Although the ABC in Montréal workshop seemed predominantly European!) Since workshops are in my opinion the most profitable type of meeting, I would like to experiment with a large meeting made of those (focussed and intense) workshops in such a way that academics would benefit without travelling long distances across the World. One idea would be to have local nodes where a large enough group of researchers could gather to attend video-conferences given from any of the other nodes and to interact locally in terms of discussions and poster presentations. This should even increase the feedback on selected papers as small groups would more readily engage in discussing and criticising papers than a huge conference room. If we could build a World-wide web (!) of such nodes, we could then dream of a non-stop conference, with no central node, no gigantic conference centre, no terrifying beach resort…

Filed under: Kids, pictures, Statistics, Travel, University life Tagged: ABC in Montréal, Benidorm, carbon impact, flight, Montréal, NIPS, online meeting, Statistics conference, travel support, world meeting

### animal photograph of the year

### the dark defiles

**T**he final and long-awaited volume of a series carries so much expectation that it more often than not ends up disappointing [me]. *The Dark Defiles* somewhat reluctantly falls within this category… This book is the third instalment of Richard K. Morgan’s fantasy series, *A Land Fit for Heroes*. Of which I liked mostly the first volume, *The Steel Remains*. When considering that this first book came out in January 2009, about six years ago, this may explain the somewhat desultory tone of *The Dark Defiles.* As well as the overwhelming amount of info-dump needed to close the many open threads about the nature of the *Land Fit for Heroes*.

*“They went. They dug. Found nothing and came back, mostly in the rain.”*

*[Warning: some spoilers in the following!]* The most striking imbalance in the story is the rather mundane pursuits of the three major heroes, from finding an old sword to avenging fallen friends here and there, against the threat of an unravelling of the entire Universe and of the disappearance of the current cosmology. In addition, the absolute separation maintained by Morgan between Archeth and Ringil kills some of the alchemy of the previous books and increases the tendency to boring inner monologues. The volume is much, much more borderline science-fiction than the previous ones, which obviously kills some of the magic, given that the highest powers that be sound like a sort of meta computer code that eventually gives Ringil *the* ultimate decision. As often, this mix between fantasy and science-fiction is not much to my taste, since it gives too much power to the foreign machines, *the Helmsmen*, which sound like they are driving the main human players for very long term goals. And which play too often *deus ex machina* to save the “heroes” from unsolvable situations. Overall a wee bit of a lengthy book, with a story coming to an unexpected end in the very final pages, leaving some threads unexplained and some feeling that style prevailed over story. But nonetheless a page turner in its second half.

Filed under: Books Tagged: A Land Fit for Heroes, book reviews, heroic fantasy, Richard K. Morgan, science fiction, The Dark Defiles, trilogy

### Quarta Família: Héptagno

### first semester notes

### testing MCMC code

**A** title that caught my attention on arXiv: *testing MCMC code* by Roger Grosse and David Duvenaud. The paper is in fact a tutorial adapted from blog posts written by Grosse and Duvenaud, on the blog of the Harvard Intelligent Probabilistic Systems group. The purpose is to write code in such a modular way that (some) conditional probability computations can be tested. Using my favourite Gibbs sampler for the mixture model, they advocate computing the ratios

p(x′|z) / p(x|z) and p(x′,z) / p(x,z)

to make sure they are exactly identical. (Where x denotes the part of the parameter being simulated and z anything else.) The paper also mentions an older paper by John Geweke—of which I was curiously unaware!—leading to another test: consider iterating the following two steps:

- update the parameter θ given the current data x by an MCMC step that preserves the posterior p(θ|x);
- update the data x given the current parameter value θ from the sampling distribution p(x|θ).

Since both steps preserve the joint distribution p(x,θ), values simulated from those steps should exhibit the same properties as a forward production of (x,θ), i.e., simulating from p(θ) and then from p(x|θ). So with enough simulations, comparison tests can be run. (Andrew made a very similar proposal at about the same time.) There are potential limitations to the first approach, obviously, from being unable to write the full conditionals [an ABC version anyone?!] to making a programming mistake that keeps both ratios equal [as it would occur if a Metropolis-within-Gibbs was run by using the ratio of the joints in the acceptance probability]. Further, as noted by the authors, it only addresses the mathematical correctness of the code, rather than the issue of whether the MCMC algorithm mixes well enough to provide a pseudo-iid-sample from p(θ|x). (Lack of mixing that could be spotted by Geweke’s test.) But it is so immediately available that it can indeed be added to each and every simulation involving a conditional step. While Geweke’s test requires re-running the MCMC algorithm altogether. Although clear divergence between an iid sampling from p(x,θ) and the Gibbs version above could appear fast enough for a stopping rule to be used. In fine, a worthwhile addition to the collection of checks and tests built over the years for MCMC algorithms! (Of which the trick proposed by my friend Tobias Rydén to run *first* the MCMC code with n=0 observations in order to recover *the prior* p(θ) remains my favourite!)
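As a minimal sketch of Geweke's successive-conditional test (on a toy conjugate model invented for illustration: θ ~ N(0,1) and x|θ ~ N(θ,1), so the "MCMC step" is an exact posterior draw), one can compare the marginal behaviour of θ under forward simulation and under the alternating scheme:

```python
import numpy as np

rng = np.random.default_rng(0)
n_draws = 20_000

# forward ("marginal-conditional") simulator: theta ~ N(0,1), x | theta ~ N(theta,1)
theta_f = rng.normal(0.0, 1.0, n_draws)
x_f = rng.normal(theta_f, 1.0)

# successive-conditional simulator: alternate the two updates
theta_s = np.empty(n_draws)
theta, x = 0.0, 0.0
for t in range(n_draws):
    # "MCMC" step preserving p(theta | x): here an exact draw from N(x/2, 1/2)
    theta = rng.normal(x / 2.0, np.sqrt(0.5))
    # refresh the data from the sampling distribution p(x | theta)
    x = rng.normal(theta, 1.0)
    theta_s[t] = theta

# a correct sampler gives the same N(0,1) marginal for theta in both cases
print(theta_f.mean(), theta_s.mean())   # both close to 0
print(theta_f.var(), theta_s.var())     # both close to 1
```

In a real application the exact posterior draw would be replaced by the MCMC kernel under test, and a bug in that kernel would show up as a drift of the successive-conditional moments away from the forward ones.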

Filed under: Books, Statistics, University life Tagged: ABC, convergence assessment, Geweke's test, Gibbs sampling, John Geweke, MCMC, Monte Carlo Statistical Methods, prior distributions, simulation

### The Hobbit (once upon a very long time…)

*“Will you follow me, one last time?”*

With my daughter, we completed our Xmas Tolkien cycle by going together to see *The battle of the five armies*. As several have noted before me, the best thing I can say about this Hobbit series is that it is now… over! Just like the previous two instalments, watching Peter Jackson’s grand finale was mostly enjoyable, but mainly for the same reasons one enjoys visiting a venerable great-aunt once a year around Christmas, namely for bringing back memories of good times and shared laughs. Indeed, Jackson managed to link both sagas through his central character of Gandalf who, while overly fond of raised eyebrows and mischievous eyes, is certainly the most compelling character all over. While the plot stretched too thinly to keep me enthralled, as I could not remember why the orcs and goblins were converging on Erebor at the same time as the elves and dwarves and men of Dale (unless it was to justify the future name of the battle?!), I soon got battle-weary of the repeated clashes between the various armies which sounded like straight copies from on-line war games and even more of the half-dozen duels, while the rescue of Gandalf from Dol Guldur is unbearably clumsy, with an apocryphal appearance of the Nazgûl. As too often in the story, the giant eagles were so instrumental to victory that one could only wonder why they had not been around from the start.

The comical parts are much sparser here than in the previous movies: hardly any screen time for Radagast’s rabbits, thank Sauron!, or for the jovial Dain with his great Scottish brogue and his war[t]hog opening, or yet for Thranduil’s moose to show its major advantage in battle, a few steps before being shot down, or for the war mountain goats who appeared then vanished at the moment of direst need, or for Bard to find a pre-historical skateboard. I also noted that the [dumb] orcs managed to invent a precursor of Chappe’s telegraph that alas could only transmit one symbol [since it was always taking the same shape!], that Legolas recreated the Matrix by walking on a disintegrating bridge, and that Thorin turned on gravity for a few crucial seconds in a movie where most characters seem to have no issue with falling, jumping or fighting without the slightest consideration for mechanics, with a strong tendency for characters to head-butt into walls…

*“What this adaptation of “The Hobbit” can’t avoid by its final instalment is its predictability and hollow foundations.” NYT, Dec. 16, 2014*

Other features I did not enjoy much: Thorin sulked way too long, Alfrid overstayed his welcome on screen by about 144 minutes, only to vanish unexpectedly, Bilbo seemed lost at the margins for most of the movie, while the love story between Kili and Tauriel was really one addition too many to Tolkien’s book. The search for variety in the steeds of the various armies made me almost wish for more races on the battle-field as we could then have seen fighters on giant moles or on battle-hens… And everyone could have done without the “Dune moment”, with giant earth-worms breaking tunnels only to return to oblivion. Anyway, we have now been *“There and Back Again”* and can now settle in our own hobbit-hole to re-read the books and enjoy a certain nostalgia about the days where we could imagine on our own what Bilbo, Gandalf or Thorin would look like, while humming “Song of the Misty Mountains”…

Filed under: Books, Kids, pictures Tagged: movie review, New Zealand, Peter Jackson, Song of the Misty Mountains, The battle of the five armies, The Hobbit, The Lord of the Rings

### EP as a way of life (aka Life of EP)

**W**hen Andrew was in Paris, we discussed at length about using EP for handling big datasets in a different way than running parallel MCMC. A related preprint came out on arXiv a few days ago, with an introduction on Andrew’s blog. (Not written two months in advance, as most of his entries are!)

The major argument in using EP in a large data setting is that the approximation to the true posterior can be built using one part of the data at a time and thus avoids handling the entire likelihood function. Nonetheless, I still remain mostly agnostic about using EP and a seminar this morning at CREST by Guillaume Dehaene and Simon Barthelmé (re)generated self-interrogations about the method that hopefully can be exploited towards the future version of the paper.

One of the major difficulties I have with EP is about the nature of the resulting approximation. Since it is chosen out of a “nice” family of distributions, presumably restricted to an exponential family, the optimal approximation will remain within this family, which further makes EP sound like a specific variational Bayes method since the goal is to find the family member closest to the posterior in terms of Kullback-Leibler divergence. (Except that the divergence is the opposite one.) I remain uncertain about what to do with the resulting solution, as the algorithm does not tell me how close this solution will be from the true posterior. Unless one can use it as a pseudo-distribution for indirect inference (a.k.a. ABC)…?

Another thing that became clear during this seminar is that the decomposition of the target as a product is completely arbitrary, i.e., does not correspond to a feature of the target other than the latter being the product of those components. Hence, the EP partition could be adapted or even optimised within the algorithm. Similarly, the parametrisation could be optimised towards a “more Gaussian” posterior. This is something that makes EP both exciting, as it opens many avenues for experimentation, and fuzzy, as its perceived lack of goal makes comparing approaches delicate. For instance, using MCMC or HMC steps to estimate the parameters of the tilted distribution is quite natural in complex settings but the impact of the additional approximation must be gauged against the overall purpose of the approach.
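To fix ideas about the product decomposition and the moment-matching step, here is a toy EP sketch (mine, not from the paper): Gaussian sites stored in natural parameters, tilted moments computed by brute-force quadrature, on a conjugate normal model where the exact posterior is available for comparison.

```python
import numpy as np

# target: p(theta) ∝ N(theta; 0, v0) * prod_i N(y_i; theta, 1)
v0, y = 4.0, np.array([1.0, 2.0, 0.5])
grid = np.linspace(-15.0, 15.0, 3001)

# one Gaussian site per factor, in natural parameters
# (precision r_i, precision-times-mean b_i), initialised flat
r = np.zeros(len(y))
b = np.zeros(len(y))

for _ in range(10):                        # a few EP sweeps
    for i in range(len(y)):
        # cavity = global approximation divided by site i
        r_cav = 1.0 / v0 + r.sum() - r[i]
        b_cav = b.sum() - b[i]
        m_cav, v_cav = b_cav / r_cav, 1.0 / r_cav
        # tilted distribution: cavity times the true factor, on a grid
        tilted = (np.exp(-0.5 * (grid - m_cav) ** 2 / v_cav)
                  * np.exp(-0.5 * (y[i] - grid) ** 2))
        tilted /= tilted.sum()
        m_t = np.sum(grid * tilted)                  # moment matching
        v_t = np.sum((grid - m_t) ** 2 * tilted)
        # new site = matched Gaussian divided by the cavity
        r[i] = 1.0 / v_t - r_cav
        b[i] = m_t / v_t - b_cav

ep_prec = 1.0 / v0 + r.sum()
ep_mean = b.sum() / ep_prec

# exact conjugate posterior for comparison
exact_prec = 1.0 / v0 + len(y)
exact_mean = y.sum() / exact_prec
print(ep_mean, exact_mean)   # agree closely: EP is exact on this Gaussian model
```

Since all factors are Gaussian here, EP recovers the exact posterior; the interest of the quadrature version is that the same loop runs unchanged with non-Gaussian factors, which is where the approximation questions raised above actually bite.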

Filed under: Books, Statistics, University life Tagged: cavity distribution, CREST, data partitioning, EP, expectation-propagation, Kullback-Leibler divergence, large data problems, parallel processing