## Bayesian News Feeds

### a statistical test for nested sampling

**A** new arXival on nested sampling: “A statistical test for nested sampling algorithms” by Johannes Buchner. The point of the test is to check whether versions of the nested sampling algorithm that fail to guarantee increased likelihood (or nesting) at each step are missing parts of the posterior mass, and hence producing biased evidence approximations. This applies to MultiNest for instance. This version of nested sampling evaluates the above-threshold region by drawing hyper-balls around the remaining points. A solution which is known to fail in one specific but meaningful case. Buchner’s arXived paper proposes a hyper-pyramid distribution for which the volume of any likelihood constrained set is known. Hence allowing for a distribution test like Kolmogorov-Smirnov. Confirming the findings of Beaujean and Caldwell (2013). The author then proposes an alternative to MultiNest that is more robust but also much more costly as it computes distances between all pairs of bootstrapped samples. This solution passes the so-called “shrinkage test”, but it is orders of magnitude less efficient than MultiNest. And also simply shows that its coverage is fine for a specific target rather than all possible targets. I wonder if a solution to the problem is at all possible given that evaluating a support or a convex hull is a complex problem whose complexity explodes with the dimension.
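To make the idea of a shrinkage test concrete, here is a minimal R sketch. It assumes a toy likelihood whose contours are hypercubes centred at (½,…,½), as a stand-in for the hyper-pyramid and not Buchner's actual code, together with an idealised sampler that draws exactly uniformly inside the current contour. The function name `shrinkage_ratios` and the settings `d`, `N` and `iters` are illustrative choices. With N live points, the successive volume ratios of a correct nested sampler follow a Beta(N,1) distribution, which is what the Kolmogorov-Smirnov test checks.

```r
# Idealised nested sampling on the unit hypercube, with a likelihood decreasing
# in the sup-norm distance to the centre, so that {L > l} is a cube of known volume.
shrinkage_ratios <- function(d, N, iters) {
  radius <- function(x) max(abs(x - 0.5))            # sup-norm distance to the centre
  r <- apply(matrix(runif(N * d), N, d), 1, radius)  # radii of the N live points
  ratios <- numeric(iters)
  r_prev <- 0.5                                      # half-side of the full unit cube
  for (i in seq_len(iters)) {
    worst <- which.max(r)                            # lowest likelihood = largest radius
    r_cur <- r[worst]
    ratios[i] <- (r_cur / r_prev)^d                  # volume ratio V_i / V_{i-1}
    # replace the worst point by an exact uniform draw inside the new contour
    r[worst] <- radius(0.5 + r_cur * runif(d, -1, 1))
    r_prev <- r_cur
  }
  ratios
}

set.seed(1)
N <- 50
ratios <- shrinkage_ratios(d = 2, N = N, iters = 300)
# one-sample Kolmogorov-Smirnov test against the Beta(N, 1) shrinkage law
ks <- ks.test(ratios, "pbeta", N, 1)
print(ks$p.value)
```

A sampler that misses part of the constrained region would shrink too fast, so its ratios would drift away from Beta(N,1) and the test would reject.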

Filed under: Books, Statistics, University life Tagged: complexity, evidence, Kolmogorov-Smirnov distance, Multinest, nested sampling, shrinkage test

### ABC in Sydney [guest post #2]

*[Here is a second guest post on the *ABC in Sydney* workshop, written by Chris Drovandi]*

**F**irst up Dennis Prangle presented his recent work on “Lazy ABC”, which can speed up ABC by potentially abandoning model simulations early that do not look promising. Dennis introduces a continuation probability to ensure that the target distribution of the approach is still the ABC target of interest. In effect, the ABC likelihood is estimated to be 0 if early stopping is performed otherwise the usual ABC likelihood is inflated by dividing by the continuation probability, ensuring an unbiased estimator of the ABC likelihood. The drawback is that the ESS (Dennis uses importance sampling) of the lazy approach will likely be less than usual ABC for a fixed number of simulations; but this should be offset by the reduction in time required to perform said simulations. Dennis also presented some theoretical work for optimally tuning the method, which I need more time to digest.
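The weighting scheme described above can be sketched in a few lines of R. This is a toy illustration under assumptions of my own, not Dennis' implementation: a normal mean model, a prior used as importance proposal, a two-stage simulation, and an arbitrary continuation rule. The names `lazy_abc`, `n1`, `alpha` and `cost` are all hypothetical.

```r
# Toy lazy ABC by importance sampling from the prior (hypothetical example):
# theta ~ N(0, 3^2), data = n iid N(theta, 1), summary = sample mean.
lazy_abc <- function(M, n = 20, n1 = 5, eps = 0.2, s_obs = 0) {
  theta <- rnorm(M, 0, 3)
  w <- numeric(M)          # weights = estimates of the ABC likelihood
  cost <- 0                # number of observations actually simulated
  for (i in seq_len(M)) {
    y1 <- rnorm(n1, theta[i])                 # cheap first stage
    cost <- cost + n1
    # continuation probability based on the partial simulation (arbitrary rule)
    alpha <- if (abs(mean(y1) - s_obs) < 1.5) 1 else 0.1
    if (runif(1) < alpha) {
      y2 <- rnorm(n - n1, theta[i])           # finish the simulation
      cost <- cost + (n - n1)
      # usual ABC indicator, inflated by 1 / alpha to remain unbiased
      w[i] <- (abs(mean(c(y1, y2)) - s_obs) < eps) / alpha
    }                                         # otherwise: early stop, weight 0
  }
  list(theta = theta, w = w, cost = cost)
}

set.seed(2014)
res <- lazy_abc(M = 20000)
mean(res$w)                            # estimate of the ABC acceptance probability
sum(res$w * res$theta) / sum(res$w)    # ABC posterior mean estimate
res$cost / (20000 * 20)                # fraction of the full simulation cost
sum(res$w)^2 / sum(res$w^2)            # effective sample size
```

In this toy model the exact acceptance probability is available in closed form, `2 * pnorm(0.2 / sqrt(9 + 1/20)) - 1`, which gives a direct check that the inflated weights leave the estimator unbiased.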

**T**his was followed by my talk on Bayesian indirect inference methods that use a parametric auxiliary model (a slightly older version here). This paper has just been accepted by Statistical Science.

**M**orning tea was followed by my PhD student, Brenda Vo, who presented an interesting application of ABC to cell spreading experiments. Here an estimate of the diameter of the cell population was used as a summary statistic. It was noted after Brenda’s talk that this application might be a good candidate for Dennis’ Lazy ABC idea. This talk was followed by a much more theoretical presentation by Pierre del Moral on how particle filter methodologies can be adapted to the ABC setting and also a general framework for particle methods.

**F**ollowing lunch, Guilherme Rodrigues presented a hierarchical Gaussian Process model for kernel density estimation in the presence of different subgroups. Unfortunately my (lack of) knowledge on non-parametric methods prevents me from making any further comment except that the model looked very interesting and ABC seemed a good candidate for calibrating the model. I look forward to the paper appearing on-line.

**T**he next presentation was by Gael Martin who spoke about her research on using ABC for estimation of complex state space models. This was probably my favourite talk of the day, and not only because it is very close to my research interests. Here the score of the Euler discretised approximation of the generative model was used as summary statistics for ABC. From what I could gather, it was demonstrated that the ABC posterior based on the score or the MLE of the auxiliary model were the same in the limit as ε → 0 (unless I have misinterpreted). This is a very useful result in itself; using the score to avoid an optimisation required for the MLE can save a lot of computation. The improved approximations of the proposed approach compared with the results that use the likelihood of the Euler discretisation were quite promising. I am certainly looking forward to this paper coming out.

**M**att Moores drew the short straw and had the final presentation on the Friday afternoon. Matt spoke about this paper (an older version is available here), of which I am now a co-author. Matt’s idea is that doing some pre-simulations across the prior space and determining a mapping between the parameter of interest and the mean and variance of the summary statistic can significantly speed up ABC for the Potts model, and potentially other ABC applications. The results of the pre-computation step are used in the main ABC algorithm, which no longer requires simulation of pseudo-data but rather a summary statistic can be simulated from the fitted auxiliary model in the pre-processing step. Whilst this approach does introduce a couple more layers of approximation, the gain in computation time was up to two orders of magnitude. The talks by Matt, Gael and myself gave a real indirect inference flavour to this year’s ABC in…

Filed under: pictures, Statistics, University life Tagged: abc-in-sydney, Australia, Chris Drovandi, Sydney

### Cancún, ISBA 2014 [day #3]

…already Thursday, our [early] departure day!, with an nth (!) non-parametric session that saw [the newly elected ISBA Fellow!] Judith Rousseau present an ongoing work with Chris Holmes on the convergence or non-convergence conditions for a Bayes factor of a non-parametric hypothesis against another non-parametric. I wondered at the applicability of this test as the selection criterion in ABC settings, even though having an iid sample to start with is a rather strong requirement.

**S**witching between a scalable computation session with Alex Beskos, who talked about adaptive Langevin algorithms for differential equations, and a non-local prior session, with David Rossell presenting a smoother way to handle point masses in order to accommodate frequentist coverage. Something we definitely need to discuss the next time I am in Warwick! Although this made me alas miss both the first talk of the non-local session by Shane Jensen and the final talk of the scalable session by Doug Vandewrken, where I happened to be quoted (!) for my warning about discretising Markov chains into non-Markov processes. In the 1998 JASA paper with Chantal Guihenneuc.

**A**fter a farewell meal of ceviche with friends in the sweltering humidity of a local restaurant, I attended [the newly elected ISBA Fellow!] Maria Vanucci’s talk on her deeply involved modelling of fMRI. The last talk before the airport shuttle was François Caron’s description of a joint work with Emily Fox on a sparser modelling of networks, along with an auxiliary variable approach that allowed for parallelisation of a Gibbs sampler. François mentioned an earlier alternative found in machine learning where all components of a vector are updated simultaneously conditional on the previous avatar of the other components, e.g. simulating (x’,y’) from π(x’|y) π(y’|x), which does not produce a convergent Markov chain. At least not convergent to the right stationary distribution. However, running a quick [in-flight] check on a 2-d normal target did not show any divergent feature, when compared with the regular Gibbs sampler. I thus wonder what can be said about the resulting target or which conditions are needed for divergence. A few scribbles later, I realised that the 2-d case was the exception, namely that the stationary distribution of the chain is the product of the marginals. However, running a 3-d example with an auto-exponential distribution in the taxi back home, I still could not spot a difference in the outcome.
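The 2-d check is easy to reproduce. The following R sketch is a reconstruction under my own assumptions (a standard bivariate normal target with correlation `rho`, and a hypothetical function name `gibbs_pair`), not the code run on the plane. It compares the regular Gibbs sampler with the simultaneous update: in the latter, x’ only depends on y and y’ only on x, so the chain splits into two independent interleaved chains and the joint distribution of (x,y) loses its correlation while the marginals stay correct.

```r
# Regular versus simultaneous ("synchronous") Gibbs updates on a bivariate
# normal target with unit variances and correlation rho (illustrative choice).
gibbs_pair <- function(niter, rho, synchronous = FALSE) {
  s <- sqrt(1 - rho^2)                      # conditional standard deviation
  x <- y <- numeric(niter)
  for (t in 2:niter) {
    x[t] <- rnorm(1, rho * y[t - 1], s)
    if (synchronous) {
      y[t] <- rnorm(1, rho * x[t - 1], s)   # uses the previous value of x
    } else {
      y[t] <- rnorm(1, rho * x[t], s)       # uses the freshly updated x
    }
  }
  cbind(x = x, y = y)
}

set.seed(42)
rho <- 0.8
reg <- gibbs_pair(1e5, rho)
syn <- gibbs_pair(1e5, rho, synchronous = TRUE)
# empirical correlation between the two components under each scheme
round(c(regular = cor(reg)[1, 2], synchronous = cor(syn)[1, 2]), 2)
# marginal standard deviations of the synchronous chain
round(apply(syn, 2, sd), 2)
```

Looking only at marginal histograms, as in a quick check, would therefore not reveal any difference between the two samplers; the discrepancy only shows in the dependence between components.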

Filed under: pictures, Statistics, Travel, University life Tagged: Cancún, ISBA, Langevin MCMC algorithm, MCMC algorithms, non-local priors, University of Warwick

### Off from Cancun [los scientificos Maya]

**T**he flight back from ISBA 2014 was not as smooth as the flight in: it took one hour for the shuttle to take us to the airport thanks to a driver posing as a tourist guide [who needs a guide when going home?!] and droning on and on about Cancún and the Maya heritage [as far as I could guess from his Spanish]. Learning at the airport that our flight to Mexico City was delayed, then too delayed for us to make the connection, with no hotel room available there, then suggesting to the desk personnel every possible European city to learn the flight had left or was about to leave, missing London by a hair, thanks to our droning friend on the scientific Mayas, and eventually being bused to the airport hotel, too far from the last poster session we could have attended!, and leaving early the next morning to Atlanta and then Paris. Which means we could have stayed for most of the remaining sessions and been back home at about the same time…

Filed under: pictures, Statistics, Travel, University life Tagged: Aero Mexico, Cancún, flight, ISBA 2014, Maya, Mexico, poster session

### Gallo Zinfandel

### do and [mostly] don’t…

**R**ather than staying in one of the conference hotels, I followed my habit of renting a flat by finding a nice studio in Cancún via airbnb. Fine except for having no internet connection. (The rental description mentioned “Wifi in the lobby”, which I stupidly interpreted as “lobby of the apartment”, but actually meant “lobby of the condominium building”… Could as well have been “lobby of the airport”.) The condo owner sent us a list of “don’ts” a few days ago, some of which are just plain funny (or tell of past disasters!):

- don’t drink heavily

- don’t party or make noise

- don’t host visitors, day or night

- don’t bang the front door or leave the balcony door open when opening the front door

- don’t put cans or bottles on top of the glass cooktop

- don’t cook elaborate meals

- don’t try to fit an entire chicken in the oven

- don’t spill oil or wine on the kitchentop

- don’t cut food directly on the kitchentop

- don’t eat or drink while in bed

- avoid frying, curry, and bacon

- shop for groceries only one day at a time

- hot water may or may not be available

- elevator may or may not be available

- don’t bring sand back in the condo

Filed under: pictures, Travel Tagged: beach, Cancún, condo, flat, Mexico, rental

### Cancun, ISBA 2014 [½ day #2]

**H**alf-day #2 indeed at ISBA 2014, as the Wednesday afternoon kept to the Valencia tradition of free time, and potential cultural excursions, so there were only talks in the morning. And still the core poster session at (late) night. In which my student Kaniav Kamari presented a poster on a current project we are running with Kerrie Mengersen and Judith Rousseau on the replacement of the standard Bayesian testing setting with a mixture representation. Being half-asleep by the time the session started, I did not stay long enough to collect data on the reactions to this proposal, but the paper should be arXived pretty soon. And Kate Lee gave a poster on our importance sampler for evidence approximation in mixtures (soon to be revised!). There was also an interesting poster about reparameterisation towards higher efficiency of MCMC algorithms, intersecting with my long-standing interest in the matter, although I cannot find a mention of it in the abstracts. And I had a nice talk with Eduardo Gutierrez-Pena about inferring on credible intervals through loss functions. There were also a couple of appealing posters on g-priors. Except I was sleepwalking by the time I spotted them… (My conference sleeping pattern does not work that well for ISBA meetings! Thankfully, both next editions will be in Europe.)

**G**reat talk by Steve McEachern that linked to our ABC work on Bayesian model choice with insufficient statistics, arguing towards robustification of Bayesian inference by only using summary statistics. Despite this being “against the hubris of Bayes”… Obviously, the talk just gave a flavour of Steve’s perspective on that topic and I hope I can read more to see how we agree (or not!) on this notion of using insufficient summaries to conduct inference rather than trying to model “the whole world”, given the mistrust we must preserve about models and likelihoods. And another great talk by Ioanna Manolopoulou on another of my pet topics, capture-recapture, although she phrased it as a partly identified model (as in Kline’s talk yesterday). This relates to capture-recapture in that, when estimating a capture-recapture model with covariates, sampling and inference are biased as well. I particularly appreciated the use of BART to analyse the bias in the modelling. And the talk provided a nice counterpoint to the rather pessimistic approach of Kline’s.

**T**errific plenary sessions as well, from Wikle’s spatio-temporal models (in the spirit of his superb book with Noel Cressie) to Igor Prunster’s great entry on Gibbs process priors. With the highly significant conclusion that those processes are best suited for (in the sense that they are only consistent for) discrete support distributions. Alternatives are to be used for continuous support distributions, the special case of a Dirichlet prior constituting a sort of unique counter-example. Quite an inspiring talk (even though I had a few micro-naps throughout it!).

**I** shared my afternoon free time between discussing the next O’Bayes meeting (2015 is getting very close!) with friends from the Objective Bayes section, getting a quick look at the Museo Maya de Cancún (terrific building!), and getting some work done (thanks to the lack of wireless…)

Filed under: pictures, Running, Statistics, Travel, University life Tagged: ABC, Bayesian tests, beach, Cancún, g-priors, ISBA 2014, Maya, Mexico, mixture estimation, O-Bayes 2015, posters, sunrise, Valencia conferences

### Cancún, ISBA 2014 [day #1]

**T**he first full day of talks at ISBA 2014, Cancún, was full of goodies, from the three early talks on specifically developed software, including one by Daniel Lee on STAN that completed the one given by Bob Carpenter a few weeks ago in Paris (which gives me the opportunity to advertise STAN tee-shirts!). To the poster session (which just started a wee bit late for my conference sleep pattern!). Sylvia Richardson gave an impressive lecture full of information on Bayesian genomics. I also enjoyed very much two sessions with young Bayesian statisticians, one on Bayesian econometrics and the other one more diverse and sponsored by ISBA. Overall, and this also applies to the programme of the following days, I found that the proportion of non-parametric talks was quite high this year, possibly signalling a switch in the community and the interest of Bayesians. And conversely very few talks on computing related issues. (With most scheduled after my early departure…)

**I**n the first of those sessions, Brendan Kline talked about partially identified parameters, a topic quite close to my interests, although I did not buy the overall modelling adopted in the analysis. For instance, Brendan Kline presented the example of a parameter θ that is the expectation of a random variable Y which is indirectly observed through $\underline{x} < Y < \overline{x}$. While he maintained that inference should be restricted to an interval around θ and that using a prior on θ was doomed to fail (and against econometrics culture), I would have preferred to see this example as a missing data one, with both $\underline{x}$ and $\overline{x}$ containing information about θ. And somewhat object to the argument against the prior as it would equally apply to any prior modelling. Although unrelated in the themes, Angela Bitto presented a work on the impact of different prior modellings on the estimation of time-varying parameters in time-series models. À la Harrison and West 1994. Discriminating between good and poor shrinkage in a way I could not spot. Unless it was based on the data fit (horror!). And a third talk of interest by Andriy Norets that (very loosely) related to Angela’s talk by presenting a framework to modify credible sets towards frequentist properties: one example was the credible interval on a positive normal mean that led to a frequency-valid confidence interval with a modified prior. This reminded me very much of the shrinkage confidence intervals of the James-Stein era.

Filed under: pictures, Statistics, Travel, University life Tagged: Bayesian statistics, Cancún, econometrics, genomics, ISBA 2004, Mexico, poster, shrinkage estimation

### ABC in Sydney [guest post]

*[Scott Sisson sent me this summary of the ABC in Sydney meeting that took place two weeks ago.]*

**F**ollowing on from ABC in Paris (2009), ABC in London (2011) and ABC in Rome (2013), the fourth instalment of the international workshops in Approximate Bayesian Computation (ABC) was held at UNSW in Sydney on 3rd-4th July 2014. The first antipodean workshop was held as a satellite to the huge (>550 registrations) IMS-ASC-2014 International Conference, also held in Sydney the following week.

ABC in Sydney was created in two parts. The first, on the Thursday, was held as an “introduction to ABC” for people who were interested to find out more about the subject, but who had not particularly been exposed to the area before. Rather than have a single brave individual give the introductory course over several hours, the expository presentation was “crowdsourced” from several experienced researchers in the field, with each being given 30 minutes to present on a particular aspect of ABC. In this way, Matthew Moores (QUT), Dennis Prangle (Reading), Chris Drovandi (QUT), Zach Aandahl (UNSW) and Scott Sisson (UNSW) covered the ABC basics over the course of 6 presentations and 3 hours.

**T**he second part of the workshop, on Friday, was the more usual collection of research oriented talks. In the morning session, Dennis Prangle spoke about “lazy ABC,” a method of stopping the generation of computationally demanding dataset simulations early, and Chris Drovandi discussed theoretical and practical aspects of Bayesian indirect inference. This was followed by Brenda Nho Vo (QUT) presenting an application of ABC in stochastic cell spreading models, and by Pierre Del Moral (UNSW) who demonstrated many theoretical aspects of ABC in interacting particle systems. After lunch Guilherme Rodrigues (UNSW) proposed using ABC for Gaussian process density estimation (and introduced the infinite-dimensional functional regression adjustment), and Gael Martin (Monash) spoke on the issues involved in applying ABC to state space models. The final talk of the day was given by Matthew Moores who discussed how online ABC dataset generation could be circumvented by pre-computation for particular classes of models.

**I**n all, over 100 people registered for and attended the workshop, making it an outstanding success. Of course, this was helped by the association with the following large conference, and the pricing scheme (completely free!), following the tradition of the previous workshops. Morning and afternoon teas, described as “the best workshop food ever!” by several attendees, were paid for by the workshop sponsors: the Bayesian Section of the Statistical Society of Australia, and the ARC Centre of Excellence in Mathematical and Statistical Frontiers.

**H**ere’s looking forward to the next workshop in the series!

Filed under: pictures, Statistics, University life Tagged: abc-in-sydney, Australia, Scott Sisson, Sydney

### Cancún, ISBA 2014 [day #0]

**D**ay zero at ISBA 2014! The relentless heat outside (making running an ordeal, even at 5:30am…) made the (air-conditioned) conference centre all the more attractive. Jean-Michel Marin and I had a great morning teaching our ABC short course and we do hope the ABC class audience had one as well. Teaching in pairs is much more enjoyable than teaching alone as we can interact with one another as well as with the audience. And realising unsuspected difficulties with the material is much easier this way, as the (mostly) passive instructor can spot the class’ reactions. This reminded me of the course we taught together in Oulu, northern Finland, in 2004 and that ended as the Bayesian Core. We did not cover the entire material we had prepared for this short course, but I think the pace was the right one. (Just tell me otherwise if you were there!) This was also the only time I have given a course wearing sunglasses, thanks to yesterday’s incident!

**W**aiting for a Spanish-speaking friend to kindly drive with me to downtown Cancún to check whether or not an optician could make me new prescription glasses, I attended Jim Berger’s foundational lecture on frequentist properties of Bayesian procedures but could only listen as the slides were impossible for me to read, with or without glasses. The partial overlap with the Varanasi lecture helped. I alas had to skip both Gareth Roberts’ and Sylvia Frühwirth-Schnatter’s lectures, apologies to both of them!, but the reward was to get a new pair of prescription glasses within a few hours. Perfectly suited to my vision! And to get back just in time to read slides during Peter Müller’s lecture from the back row! Thanks to my friend Sophie for her negotiating skills! Actually, I am still amazed at getting glasses that quickly, given the time it would have taken in, e.g., France. All set for another 15 years with the same pair?! Only if I do not go swimming with them in anything but a quiet swimming pool!

**T**he starting dinner happened to coincide with the (second) ISBA Fellow Award ceremony. Jim acted as the grand master of ceremony and he did great to add life and side stories to the written nominations for each and everyone of the new Fellows. The Fellowships honoured Bayesian statisticians who had contributed to the field as researchers and to the society since its creation. I thus feel very honoured (and absolutely undeserving) to be included in this prestigious list, along with many friends. (But would have loved to see two more former ISBA presidents included, esp. for their massive contribution to Bayesian theory and methodology…) And also glad to wear regular glasses instead of my morning sunglasses.

*[My Internet connection during the meeting being abysmally poor, the posts will appear with some major delay! In particular, I cannot include new pictures at times I get a connection... Hence a picture of northern Finland instead of Cancún at the top of this post!]*

Filed under: Statistics, Travel, University life Tagged: ABC, Cancún, Caribean sea, ISBA, Jim Berger, Mexico, short course, sunglasses, Valencia conferences

### another R new trick [new for me!]

**W**hile working with Andrew and a student from Dauphine on importance sampling, we wanted to assess the distribution of the resulting sample via the Kolmogorov-Smirnov measure

$$\max_x \left|\hat{F}_n(x) - F(x)\right|$$

where $\hat{F}_n$ is the empirical cdf of the sample and F is the target. This distance (times √n) has an asymptotic distribution that does not depend on n, called the Kolmogorov distribution. After searching for a little while, we could not figure out where this distribution was available in R. It had to be, since ks.test was returning a p-value. Hopefully correct! So I looked into the ks.test function, which happens not to be entirely programmed in C, and found the line

    PVAL <- 1 - if (alternative == "two.sided") .Call(C_pKolmogorov2x, STATISTIC, n)

which means that the Kolmogorov distribution is coded as a C function C_pKolmogorov2x in R. However, I could not call the function myself.

    > .Call(C_pKolmogorov2x,.3,4)
    Error: object 'C_pKolmogorov2x' not found

Hence, as I did not want to recode this distribution cdf, I posted the question on stackoverflow (long time no see!) and got a reply almost immediately, suggesting the package kolmim. Followed by the extra comment from the same person that calling the C code only required adding the path to its name, as in

    > .Call(stats:::C_pKolmogorov2x,STAT=.3,n=4)
    [1] 0.2292

Filed under: Books, Kids, R, Statistics, University life Tagged: C code, importance sampling, Introducing Monte Carlo Methods with R, kolmim, Kolmogorov-Smirnov distance, R, stackoverflow, Université Paris Dauphine
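As a complement to the Kolmogorov-Smirnov post above, and for readers who would rather not depend on an unexported C entry point (whose name and arguments may differ between R versions), the limiting Kolmogorov cdf can be coded directly from its classical alternating series. This is a minimal sketch, not the exact finite-n distribution that `C_pKolmogorov2x` returns; the function name `pkolm` and the truncation level `kmax` are my own choices, and the truncation is only adequate for arguments that are not extremely close to zero.

```r
# Limiting (Kolmogorov) cdf of sqrt(n) * D_n, via the alternating series
# K(x) = 1 - 2 * sum_{k >= 1} (-1)^(k-1) * exp(-2 k^2 x^2), truncated at kmax terms.
pkolm <- function(x, kmax = 100) {
  k <- seq_len(kmax)
  sapply(x, function(z) {
    if (z <= 0) return(0)
    1 - 2 * sum((-1)^(k - 1) * exp(-2 * k^2 * z^2))
  })
}

pkolm(1.36)    # close to 0.95, the usual 5% critical value

# comparison with the asymptotic p-value returned by ks.test
set.seed(1)
x <- rnorm(200)
kt <- ks.test(x, "pnorm", exact = FALSE)
stat <- unname(kt$statistic)
c(ks.test = kt$p.value, series = 1 - pkolm(sqrt(200) * stat))
```

With `exact = FALSE`, `ks.test` relies on this limiting distribution, so both p-values should agree up to numerical tolerance.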

### Cancun sunrise

### implementing reproducible research [short book review]

**A**s promised, I got back to this book, *Implementing reproducible research* (after the pigeons had their say). I looked at it this morning while monitoring my students taking their last-chance R exam (definitely *last* chance as my undergraduate R course is not renewed next year). The book is in fact an edited collection of papers on tools, principles, and platforms around the theme of *reproducible research*. It obviously links with other themes like open access, open data, and open software. All positive directions that need more active support from the scientific community. In particular the solutions advocated through this volume are mostly Linux-based. Among the tools described in the first chapter, knitr appears as an alternative to sweave. I used the latter a while ago and, while I like its philosophy, it does not extend to situations where the R code within takes too long to run… (Or maybe I did not invest enough time to grasp the entire spectrum of sweave.) Note that, even though the book is part of the R Series of CRC Press, many chapters are unrelated to R. And even more [unrelated] to statistics.

**T**his limitation is somewhat my difficulty with [adhering to] the global message proposed by the book. It is great to construct such tools that monitor and archive successive versions of code and research, as anyone can trace back the research steps leading to the published result(s). Using some of the platforms covered by the book establishes for instance a superb documentation principle, going much further than just providing an “easy” verification tool against fraudulent experiments. The notion of a super-wiki where notes and preliminary versions and calculations (and dead ends and failures) would be preserved for open access is just as great. However this type of research processing and discipline takes time and space and human investment, i.e. resources that are scarce and costly. Complex studies may involve enormous amounts of data and, neglecting the notions of confidentiality and privacy, the cost of storing such amounts is significant. Similarly for experiments that require days and weeks of huge clusters. I thus wonder where those resources would be found (journals, universities, high tech companies, …?) for the principle to hold in full generality and how transient they could prove. One cannot expect the research team to guarantee availability of those meta-documents for remote time horizons. Just as a biased illustration, checking the available Bayes’ notebooks meant going to a remote part of London at a specific time and with a preliminary appointment. Those notebooks are now available on line for free. But for how long?

*“So far, Bob has been using Charlie’s old computer, using Ubuntu 10.04. The next day, he is excited to find the new computer Alice has ordered for him has arrived. He installs Ubuntu 12.04”* A. Davison et al.

**P**utting their principles into practice, the authors of *Implementing reproducible research* have made all chapters available for free on the Open Science Framework. I thus encourage anyone interested in those principles (and who would not be?!) to peruse the chapters and see how they can benefit from and contribute to open and reproducible research.

Filed under: Books, Kids, pictures, R, Statistics, Travel, University life Tagged: Bayes' notebooks, book review, CHANCE, knitr, Linux, pigeon, R, R exam, reproducible research, sweave, Ubuntu 12.04, Université Paris Dauphine

### arrived in Cancún

**A**fter an uneventful trip from Paris, we landed in the heat and humidity just a day before our ABC course. Much too hot and too humid for my taste, so I am looking forward to spending my days in the conference centre. Hopefully, it will get cool enough to go running in the early morning…

Most unfortunately, when trying to get a taste of the water last night, I almost immediately lost my prescription glasses to a big wave and am forced to move around the conference wearing sunglasses… or looking lost and not recognizing anyone! What a bummer!

Filed under: Kids, pictures, Statistics, Travel, University life Tagged: Cancún, ISBA 2014, Mexico, sea, sunglasses

### adaptive equi-energy sampling

**T**oday, I took part in the thesis defence of Amandine Shreck at Telecom-ParisTech. I had commented a while ago on the Langevin algorithm for discontinuous targets she developed with co-authors from that school towards variable selection. The thesis also contains material on the equi-energy sampler that is worth mentioning. The algorithm relates to the Wang-Landau algorithm last discussed here for the seminars of Pierre and Luke in Paris, last month. The algorithm aims at facilitating the moves around the target density by favouring moves from one energy level to the next. As explained to me by Pierre once again after his seminar, the division of the space according to the target values is a way to avoid creating artificial partitions over the sampling space. A sort of Lebesgue version of Monte Carlo integration. The energy bands require the choice of a sequence of bounds on the density, values that are hardly available prior to the simulation of the target. The paper corresponding to this part of the thesis (and published in our special issue of TOMACS last year) thus considers the extension when the bounds are defined on the go, in an adaptive way. This could be achieved based on earlier simulations, using some quantiles of the observed values of the target, but this is a costly solution which requires keeping an ordered sample of the density values. (Is it that costly?!) Thus the authors prefer to determine the energy levels in a cheaper adaptive manner. Namely, through a Robbins-Monro/stochastic approximation type update of the bounds.
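To illustrate what a Robbins-Monro update of quantile-type bounds looks like, here is a generic R sketch. It is the textbook stochastic approximation recursion for a quantile, applied to a stream of energy values, and is only an assumption about the flavour of the update, not the recursion used in the thesis or the paper. The function name `sa_quantile`, the step size n^(-0.7) and the i.i.d. N(0,1) stream (standing in for MCMC output) are all illustrative choices.

```r
# Robbins-Monro (stochastic approximation) estimate of the p-quantile of a stream
# of values, updated one value at a time without storing or sorting the stream.
sa_quantile <- function(values, p, q0 = values[1], step = function(n) n^(-0.7)) {
  q <- q0
  for (n in seq_along(values)) {
    # move up by step * p when the new value exceeds q, down by step * (1 - p) otherwise
    q <- q + step(n) * (p - (values[n] <= q))
  }
  q
}

set.seed(42)
energy <- -dnorm(rnorm(1e5), log = TRUE)   # energies -log pi(X) for a N(0,1) target
probs <- c(0.25, 0.5, 0.75)
adaptive <- sapply(probs, function(p) sa_quantile(energy, p))
# adaptive bounds versus the quantiles obtained by sorting the whole sample
rbind(adaptive = adaptive, sorted = quantile(energy, probs))
```

The recursion only needs the current bound and the latest value, which is the saving compared with keeping an ordered sample; whether sorting is really that costly remains the question raised above.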

**M**y questions related to this part of the thesis were about the actual gain if any in computing time versus efficiency, the limitations in terms of curse of dimensionality and storage, the connections with the Wang-Landau algorithm and pseudo-marginal approximations, and the (degree of) likelihood of a universal and automatised adaptive equi-energy sampler.

Filed under: Statistics, University life Tagged: MCMC, Metropolis-Hastings algorithm, equi-energy sampler, Telecom ParisTech

### no ISBA 2016 in Banff…

**A**las, thrice alas, the bid we made right after the Banff workshop with Scott Schmidler, and Steve Scott for holding the next World ISBA Conference in 2016 in Banff, Canada was unsuccessful. This is a sad and unforeseen item of news as we thought Banff had a heap of enticing features as a dream location for the next meeting… Although I cannot reveal the location of the winner, I can mention it is much more traditional (in the sense of the Valencia meetings), i.e. much more mare than monti… Since it is in addition organised by friends and in a country I love, I do not feel particularly aggravated. Especially when considering we will not have to organise *anything* then!

Filed under: Mountains, pictures, Statistics, Travel, University life Tagged: Alberta, Arthur's Seat, Banff, Banff Centre, Canada, Canadian Rockies, Edinburgh, ISBA, ISBA 2016, ISBA 2018, Mount Rundle, Mount Temple, Scotland

### impressions, soleil couchant (#2)

Filed under: pictures, Travel Tagged: Charles de Gaulle, Paris suburbs, RER B, Roissy, summer, sunset, train, University of Warwick

### Le Monde puzzle [#875]

**I** learned something in R today thanks to Le Monde mathematical puzzle:

*A two-player game consists in A picking a number n between 1 and 10 and B and A successively choosing and applying one of three transforms to the current value of n,*

*n=n+1, n=3n, n=4n,*

*starting with B, until n is larger than 999. Which value(s) of n should A pick if both players act optimally?*

**I**ndeed, I first tested the following R code

which did not work because of too many calls to sole:

```
> sole(1)
Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
```

So I included a memory in the calls to sole so that good and bad entries of n were saved for later calls:

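```r
visit=rep(-1,1000) #not yet visited
sole=function(n){
  if (n>999){
    return(TRUE) #the player who reached n>999 wins
  }else{
    if (visit[n]>-1){
      return(visit[n]==1) #reuse the stored answer
    }else{
      #n is good for the player who produced it if every reply is bad
      visit[n]<<-((!sole(3*n))&(!sole(4*n))&(!sole(n+1)))
      return(visit[n]==1)
  }}}
```

**T**rying frontal attack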

did not work, but one single intermediary was sufficient:

```
> sole(500)
[1] FALSE
> for (i in 1:10)
+ print(sole(i))
[1] FALSE
[1] FALSE
[1] FALSE
[1] TRUE
[1] FALSE
[1] TRUE
[1] FALSE
[1] FALSE
[1] FALSE
[1] FALSE
```

which means that the only winning starters for A are n=4,6. If one wants the winning moves on top, a second counter can be introduced:

```r
visit=best=rep(-1,1000) #not yet visited, no winning move recorded
sole=function(n){
  if (n>999){
    return(TRUE)
  }else{
    if (visit[n]>-1){
      return(visit[n]==1)
    }else{
      visit[n]<<-((!sole(3*n))&(!sole(4*n))&(!sole(n+1)))
      #when n is bad, store the largest winning reply
      if (visit[n]==0)
        best[n]<<-max(
          3*n*(sole(3*n)),
          4*n*(sole(4*n)),
          (n+1)*(sole(n+1)))
      return(visit[n]==1)
  }}}
```

From which we can deduce the next values chosen by A or B as

```
> best[1:10]
 [1]  4  6  4 -1  6 -1 28 32 36 40
```

(where -1 means no winning choice is possible).
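Since the recursive sole hits the nesting limit when called directly on small values, the same table can also be filled without any recursion, by working backwards from 999. The sketch below is an alternative to the code of the post, not a copy of it: the names `win` and `move` are made up here, with `win[n]` carrying the same meaning as sole(n), namely that the player who has just produced n wins.

```r
# Backward induction for the n+1 / 3n / 4n game, no nested calls involved
N <- 999
win <- logical(4 * N)          # indices up to 4*999 cover every reachable value
win[(N + 1):(4 * N)] <- TRUE   # producing a value above 999 wins the game
move <- rep(-1, N)             # largest winning reply from n, -1 when there is none
for (n in N:1) {
  nxt <- c(n + 1, 3 * n, 4 * n)   # the three possible replies, all already processed
  good <- nxt[win[nxt]]           # replies that win for the player moving from n
  win[n] <- length(good) == 0     # n is good for its producer if no reply wins
  if (length(good) > 0) move[n] <- max(good)
}
which(win[1:10])   # winning starters for A
move[1:10]         # counterpart of best[1:10]
```

Because every reply from n is larger than n, a single pass from 999 down to 1 is enough, and there is no need for an intermediary call such as sole(500).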

**N**ow, what is the R trick I learned from this toy problem? Simply the use of the double assignment symbol, <<-, which allows for changing global variables within functions. As with visit and best in the latest function. (The function assign would have worked too.)
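For readers who have not met it, here is a small illustration of the difference between the usual assignment, the double assignment and assign. The variable `counter` and the three `bump_*` functions are hypothetical names chosen for this sketch. Strictly speaking, <<- rebinds the variable in the nearest enclosing environment where it is found, which is the global environment for functions defined at top level, as here.

```r
counter <- 0

bump_local <- function() {
  counter <- counter + 1    # creates a local copy, the global counter is untouched
  invisible(counter)
}
bump_super <- function() {
  counter <<- counter + 1   # rebinds counter where it already exists, here globally
  invisible(counter)
}
bump_assign <- function() {
  assign("counter", counter + 1, envir = .GlobalEnv)  # explicit version of the same update
  invisible(counter)
}

bump_local();  after_local  <- counter
bump_super();  after_super  <- counter
bump_assign(); after_assign <- counter
c(after_local, after_super, after_assign)
```

Only the last two calls change the global value, which is exactly what visit and best rely on in the memory version of sole.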

Filed under: Kids, R, Statistics, University life Tagged: assign(), global variable, Le Monde, local variable, mathematical puzzle

### ABC in Cancún

**H**ere are our slides for the ABC [very] short course Jean-Michel and I give at ISBA 2014 in Cancún next Monday (if your browser can manage Slideshare…) Although I may switch the pictures from Iceland to Mexico, on Sunday, there will not be much change to those slides, which we have both used in previous short courses. (With a few extra slides borrowed from Richard Wilkinson’s tutorial at NIPS 2013!) Jean-Michel will focus his share of the course on software implementations, from R packages like abc and abctools to our population genetics software DIYABC. With an illustration on SNP data from Pygmy populations.

Filed under: Books, Kids, R, Statistics, Travel, University life Tagged: ABC, abc package, abctools package, Bayesian methodology, Cancún, DIYABC, ISBA 2014, Mexico, pygmies

### Bayes’ Rule [book review]

**T**his introduction to Bayesian Analysis, Bayes’ Rule, was written by James Stone from the University of Sheffield, who contacted CHANCE suggesting a review of his book. I thus bought it from Amazon to check the contents. And write a review.

**F**irst, the format of the book. It is a short paper of 127 pages, plus 40 pages of glossary, appendices, references and index. I eventually found the name of the publisher, Sebtel Press, but for a while thought the book was self-produced. While the LaTeX output is fine and the (Matlab) graphs readable, pictures are not of the best quality and the display editing is minimal in that there are several huge white spaces between pages. Nothing major there, obviously, it simply makes the book look like course notes, but this is in no way detrimental to its potential appeal. (I will not comment on the numerous appearances of Bayes’ alleged portrait in the book.)

*“… (on average) the adjusted value θMAP is more accurate than θMLE.” (p.82)*

Bayes’ Rule has the interesting feature that, in the very first chapter, after spending a rather long time on Bayes’ formula, it introduces Bayes factors (p.15). With the somewhat confusing choice of calling the *prior* probabilities of hypotheses *marginal* probabilities. Even though they are indeed *marginal* given the joint, *marginal* is usually reserved for the sample, as in *marginal* likelihood. Before returning to more (binary) applications of Bayes’ formula for the rest of the chapter. The second chapter is about probability theory, which means here introducing the three axioms of probability and discussing geometric interpretations of those axioms and Bayes’ rule. Chapter 3 moves to the case of discrete random variables with more than two values, i.e. contingency tables, on which the range of probability distributions is (re-)defined and produces a new entry to Bayes’ rule. And to the MAP. Given this pattern, it is not surprising that Chapter 4 does the same for continuous parameters. The parameter of a coin flip. This allows for discussion of uniform and reference priors. Including maximum entropy priors à la Jaynes. And bootstrap samples presented as approximating the posterior distribution under the “fairest prior”. And even two pages on standard loss functions. This chapter is followed by a short chapter dedicated to estimating a normal mean, then another short one on exploring the notion of a continuous joint (Gaussian) density.

*“To some people the word *Bayesian* is like a red rag to a bull.” (p.119)*

Bayes’ Rule concludes with a chapter entitled *Bayesian wars*. A rather surprising choice, given the intended audience. Which is rather bound to confuse this audience… The first part is about probabilistic ways of representing information, leading to subjective probability. The discussion goes on for a few pages to justify the use of priors but I find completely unfair the argument that because Bayes’ rule is a mathematical theorem, it “has been proven to be true”. It is indeed a maths theorem, however that does not imply that any inference based on this theorem is correct! (A surprising parallel is Kadane’s Principles of Uncertainty with its anti-objective final chapter.)

**A**ll in all, I remain puzzled after reading Bayes’ Rule. Puzzled by the intended audience, as contrary to other books I recently reviewed, the author does not shy away from mathematical notations and concepts, even though he proceeds quite gently through the basics of probability. Therefore, potential readers need some modicum of mathematical background that some students may miss (although it actually corresponds to what my kids would have learned in high school). It could thus constitute a soft entry to Bayesian concepts, before taking a formal course on Bayesian analysis. Hence doing no harm to the perception of the field.

Filed under: Books, Statistics, University life Tagged: Amazon, Bayes formula, Bayes rule, Bayes theorem, Bayesian Analysis, England, introductory textbooks, publishing, short course, Thomas Bayes' portrait, tutorial