Bayesian News Feeds
A very exciting talk at NBBC15 here in Reykjavik was delivered yesterday by Mark Bravington on close-kin mark-recapture by modern magic (!). Although Mark is from Australia, being a Hobart resident does qualify him for the Nordic branch of the conference! The exciting idea is to use genetic markers to link catches in a (fish) population as being related as parent-offspring or as siblings. This sounds like science-fantasy when you first hear of it, but it actually works better than standard capture-mark-recapture methods for populations of a certain size (so that the chances of finding related animals are not essentially zero, as they would be for, e.g., krill populations). The talk focussed on bluefin tuna, whose survival is unlikely under the current fishing pressure… Among the advantages are a much more limited impact of the capture on the animal, since only a small amount of genetic material is needed; no tag loss, tag destruction by hunters, or tag impact on the animal's survival; no need for recapture; a unique identification of each animal; and the potential for a detailed amount of information through the genetic record. Ideally, the entire sample could lead to a reconstruction of its genealogy all the way to the common ancestor, a wee bit like what 23andMe proposes for humans, but this remains at the science-fantasy level given what is currently known about fish genomes.
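To get a feel for the core idea, here is a toy method-of-moments version of the close-kin estimator, a sketch of my own and certainly not Mark's actual model (which involves much more structure, ageing, and mortality):

```python
# Toy close-kin mark-recapture estimator (an illustrative sketch, not
# Bravington's actual model). Each sampled offspring has exactly two
# parents among the N adults, so a random adult-offspring comparison is a
# parent-offspring pair (POP) with probability roughly 2/N. Matching the
# expected POP count, 2 * n_adults * n_offspring / N, to the observed one
# gives a simple abundance estimate.

def ckmr_estimate(n_adults_sampled, n_offspring_sampled, n_pops_found):
    """Method-of-moments adult abundance estimate from parent-offspring pairs."""
    if n_pops_found == 0:
        raise ValueError("no related pairs found: population too large to estimate")
    return 2 * n_adults_sampled * n_offspring_sampled / n_pops_found

# e.g. 500 adults and 500 juveniles genotyped, 25 POPs detected
print(ckmr_estimate(500, 500, 25))  # → 20000.0
```

This also makes the krill remark concrete: when N is huge, the expected number of detected pairs drops towards zero and the estimator breaks down.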
Filed under: Mountains, pictures, Statistics, Travel, University life Tagged: Australia, bluefin tuna, capture-recapture, genotyping, Hobart, Iceland, NBBC15 conference, Reykjavik, Tasmania
“We deliver a call to arms for probabilistic numerical methods: algorithms for numerical tasks, including linear algebra, integration, optimization and solving differential equations, that return uncertainties in their calculations.” (p.1)
Philipp Hennig, Michael Osborne and Mark Girolami (Warwick) posted on arXiv a paper to appear in Proceedings of the Royal Society A that relates to the probabilistic numerics workshop they organised in Warwick with Chris Oates two months ago. The paper is both a survey and a manifesto about the related questions the authors find of most interest. The overall perspective proceeds along Persi Diaconis' call for a principled Bayesian approach to numerical problems. One interesting argument made from the start of the paper is that numerical methods can be seen as inferential rules, in that a numerical approximation of a deterministic quantity like an integral can be interpreted as an estimate, even as a Bayes estimate if a prior is used on the space of integrands. I am always uncertain about this perspective, as for instance illustrated in the post about the missing constant in Larry Wasserman's paradox. The approximation may look formally the same as an estimate, but there is a design aspect that is almost always attached to numerical approximations and rarely analysed as such. Not to mention the somewhat philosophical issue that the integral itself is a constant with no uncertainty (while a statistical model should always entertain the notion that it can be mis-specified). The distinction explains why there is a zero-variance importance sampling estimator, while there is no uniformly zero-variance estimator in most parametric models. At a possibly deeper level, the debate that still invades the use of Bayesian inference to solve statistical problems would most likely resurface in numerics, in that the significance of a probability statement surrounding a mathematical quantity can only be epistemic and relate to the knowledge (or lack thereof) about this quantity rather than to the quantity itself.
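The zero-variance remark can be made concrete with a few lines of code: for I = ∫ h(x) f(x) dx with h > 0, importance sampling from the (usually unavailable) optimal proposal g*(x) ∝ h(x) f(x) produces weights that are all identically equal to I. A toy sketch, with an example where g* happens to be a standard distribution:

```python
import math, random

# Zero-variance importance sampling, illustrating why a fixed integral has
# no intrinsic uncertainty: with the optimal proposal g*(x) ∝ h(x) f(x),
# every weight h(x) f(x) / g*(x) equals the integral I exactly.

random.seed(1)

# Example: h(x) = x and f the Exp(1) density, so I = E[X] = 1 and
# g*(x) = x exp(-x) is the Gamma(2, 1) density (normalising constant I = 1).
def weight(x):
    h_times_f = x * math.exp(-x)   # h(x) f(x)
    g_star = x * math.exp(-x)      # Gamma(2,1) density = h f / I, with I = 1
    return h_times_f / g_star      # identically equal to I = 1

draws = [random.gammavariate(2.0, 1.0) for _ in range(5)]
weights = [weight(x) for x in draws]
print(weights)  # every weight equals 1.0: a zero-variance estimator
```

No parametric statistical model allows such a trick uniformly in the parameter, which is exactly the asymmetry the post points at.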
“(…) formulating quadrature as probabilistic regression precisely captures a trade-off between prior assumptions inherent in a computation and the computational effort required in that computation to achieve a certain precision. Computational rules arising from a strongly constrained hypothesis class can perform much better than less restrictive rules if the prior assumptions are valid.” (p.7)
Another general worry [repeating myself] about setting a prior in those functional spaces is that the posterior may then mostly reflect the choice of the prior rather than the information contained in the “data”. The above quote mentions prior assumptions that seem hard to build from prior opinion about the functional of interest. And even less so about the function itself. Coming back from a gathering of “objective Bayesians“, it seems equally hard to agree upon a reference prior. However, since I like the alternative notion of using decision theory in conjunction with probabilistic numerics, it seems hard to object to the use of priors, given the “invariance” of prior × loss… But I would like to understand better how it is possible to check prior assumptions (p.7) without using the data. Or maybe it does not matter so much in this setting? Unlikely, as indicated in the remarks about the bias resulting from the active design (p.13).
A last issue I find related to the exploratory side of the paper is the “big world versus small worlds” debate, namely whether we can use the Bayesian approach to solve a sequence of small problems rather than trying to solve the big problem all at once. Which forces us to model the entirety of unknowns. And almost certainly to fail. (This was the point of the Robbins-Wasserman counterexample.) Adopting a sequence of solutions may be construed as incoherent in that the prior distribution is adapted to the problem rather than encompassing all problems. Although this would not shock the proponents of reference priors.
Filed under: Books, pictures, Statistics, University life Tagged: estimating a constant, objective Bayes, Persi Diaconis, prior assessment, probabilistic numerics, quadrature rule, Runge-Kutta
Tonight, we [participants to the NBBC15 conference] got invited [and bused] to deCODE, the Icelandic genetic company that has worked on the human genome since 1996, taking advantage of the uniquely homogeneous features of the Icelandic population. Which overwhelmingly descends from the few original settlers who populated Iceland in the late 800s. deCODE is located in downtown Reykjavik next to the regional airport and to Hallgrímskirkja, the iconic church overlooking the city. The genetic company has gathered genotypic and phenotypic information about half the population of Iceland and, thanks to extensive genealogical sources, has also put together the Íslendingabók that covers the entire current population and runs back to the origins of the country. Despite being a company (and now a subsidiary of Amgen), deCODE appears to operate just like another research institution, searching for genetic explanations of diseases and genotyping more and more individuals towards that goal, with a startup atmosphere in a well-designed building… A most unusual and enjoyable evening at a conference! Making me wonder if they have visiting positions…
Filed under: Books, Kids, pictures, Statistics, Travel, University life Tagged: Íslendingabók, Íslensk erfðagreining, deCODE, Iceland, Reykjavik
During the Valencia O’Bayes 2015 meeting, Colin LaMont and Paul Wiggins arXived a paper entitled “An objective prior that unifies objective Bayes and information-based inference”. It would have been interesting to have the authors in Valencia, as they make bold claims about their w-prior being uniformly and maximally uninformative. Plus achieving the unification advertised in the title of the paper. Meaning that the free energy (the log transform of the inverse evidence) is the Akaike information criterion.
The paper starts by defining a true prior distribution (presumably in analogy with the true value of the parameter?) and generalised posterior distributions associated with any arbitrary prior. (Some notations are imprecise, check (3) with the wrong denominator, or the predictive that is supposed to cover N new observations on p.2…) It then introduces a discretisation by considering all models within a certain Kullback-Leibler divergence δ to be indistinguishable. (A definition that does not account for the asymmetry of the Kullback-Leibler divergence.) From there, it most surprisingly [given the above discretisation] derives a density on the whole parameter space
where N is the number of observations and K the dimension of θ. Dimension which may vary. The dependence on N of the above is a result of using the predictive on N points instead of one. The w-prior is however defined differently: “as the density of indistinguishable models such that the multiplicity is unity for all true models”. Where the log transform of the multiplicity is the expected log marginal likelihood minus the expected log predictive [all expectations under the sampling distribution, conditional on θ]. Rather puzzling in that it involves the “true” value of the parameter—another notational imprecision, since it has to hold for all θ’s—as well as possibly improper priors. When the prior is improper, the log-multiplicity is a difference of two terms such that the first term depends on the constant used with the improper prior, while the second one does not… Unless the multiplicity constraint also determines the normalising constant?! But this does not seem to be the case when considering the following section on normalising the w-prior. Mentioning a “cutoff” for the integration that seems to pop out of nowhere. Curiouser and curiouser. Due to this unclear handling of infinite mass priors, and since the claimed properties of uniform and maximal uninformativeness are not established in any formal way, nor is the existence of a non-asymptotic solution to the multiplicity equation demonstrated, I quickly lost interest in the paper. Which does not contain any worked-out example. Read at your own risk!
Filed under: Books, pictures, Statistics, Travel, University life Tagged: AIC, Akaike's criterion, free energy, information criterion, objective Bayes, Spain, Valencia conferences, w-prior
Today, as I had a free day (with 24-hour daylight!) in Reykjavik before the NBBC15 conference started, thanks to the crazy schedules of the low-cost sister of Air France, Transavia (!), I went in search of a hike… Which is not very difficult in Iceland! I had originally planned to stop near Geysir, as the dirt road beyond Gullfoss is off-limits for rental cars. Especially small 2WD cars like mine.
As I was driving the first few kilometres of the Þingvellir road, I admired the Esjan range, starting with the Esja mountain that we had climbed during our previous visit to Iceland. Especially the “last” peak, which glowed with a warm yellow (and apparently carried no snow at all). All the more because it had a top reminding me of the Old Man of Storr on its slope. (Not that I could spot it while driving!) And I quickly decided this was a great opportunity for a nice hike with a minimum of driving, as I was about 20 minutes from downtown Reykjavik.
I thus took a dirt road that seemed to get closer to my goal and, after 500m, came to a farmyard where I parked the car and went hiking, aiming at this peak, whose name is Móskarðshnjúkar. Despite a big cut carved by a torrent after the first hill, I managed to keep to high ground enough not to lose any altitude and, sticking to the side of the Skálafell ski station (where a few people were still skiing with the noisy help of two snowmobiles), I crossed the brook easily as it was covered by snow and started moving onto steeper if manageable slopes. I reached the bottom of the main peak rather quickly and then understood both its colour and the absence of snow.
As may be visible from some of my pictures (?), the Móskarðshnjúkar peak is covered with gravel in a bright yellow stone that seems to accumulate heat very well. Climbing straight up the loose gravel was thus impossible and I had to zigzag my way up, trying not to lose too much ground to micro-avalanches. As I reached the tor, I spotted two hikers above me and, when I reached the top, I realised there was a path coming from the west, connecting this peak with its neighbours. The normal route seems to come from a gravel road that starts close to Mount Esja, to the west, and as I followed the path down to the saddle between Móskarðshnjúkar and the rest of the range, I saw this path winding down to the valley, with further hikers coming up. Before meeting them, I went up again to the next peak, an easy if beautiful ridge walk, with still a fair amount of snow remaining on the north face (heavy enough to bear tracks of snowmobiles!). After following the ridge track for a while, it branched north towards the main Esja plateau and I left it to descend a rocky shoulder towards my starting point. However, I had forgotten about the torrent cut between the two ranges, which forced me into a further detour. And to cross the torrent barefoot, as there was no stone ford on this off-path section. No big drama, as the melted snow water was not that cold…
A last sight was provided by the final rocky outcrop, which enjoyed basaltic volcanic columns as in the picture above. A terrific half-day hike in sharp sunny weather, with not too much wind except at the top. It was very pleasant to walk part of the way on moss and last year’s grass, with a surprising absence of bogs and mud compared with Scotland.
Filed under: Mountains, pictures, Running, Travel Tagged: Þingvellir, Esja, hiking, Iceland, Móskarðshnjúkar peak, Reykjavik, Scotland, volcanoes
During our short trip to Tuscany last week, we came across a festival in the small fortified village of Montefioralle, a frazione of the town of Greve in Chianti, at the centre of the Chianti region. Although the festival lasted for two days with all sorts of activities, we arrived in the evening, with the medieval groups folding their costumes and swords, and only a few bagpipe players remaining in the small street circling the village inside the first and protective row of houses. However, the twelve local wine producers who lined that street had not closed their stalls and we were thus able to taste (if not to drink, as I was driving…) their Chianti wines and discuss their production and methods in English, French, and even survival Italian! Which was fun, as the producers were not pushy at all (well, most of them!), but happy to extol the virtues of their wine. While I am not particularly fond of Chianti [among Italian wines], I appreciated a few of them during that wine-tasting circuit, especially the three organic Podere Campriano wines, including one growing on a reclaimed forest plot with a highly vegetal taste.
Filed under: Mountains, pictures, Travel, Wines Tagged: Castello di Montefioralle, Chianti, Greve in Chianti, Italian wines, Podere Campriano, wine tasting
The third day of the meeting was a good illustration of the diversity of the themes [says a member of the scientific committee!], from “traditional” O’Bayes talks on reference priors by the father of all reference priors (!), José Bernardo, re-examinations of expected posterior priors, properties of Bayes factors, or new versions of the Lindley-Jeffreys paradox, to the radically different approach of Simpson et al. presented by Håvard Rue. I was obviously most interested in expected posterior priors!, with the new notion brought in by Dimitris Fouskakis, Ioannis Ntzoufras and David Draper of a lower impact of the minimal sample on the resulting prior through the trick of a lower (than one) power of the likelihood. Since this change seemed to go beyond the “minimal” in minimal sample size, I am somewhat puzzled that this can be achieved, but the normal example shows it is indeed possible. The next difficulty is then in calibrating this power, as I do not see any intuitive justification for a specific power. The central talk of the day was in my opinion Håvard’s, as it challenged most tenets of the objective Bayes approach, presented in a most eager tone, even though it did not generate particularly heated comments from the audience. I have already discussed here an earlier version of this paper and I keep on thinking this proposal for PC priors is a major breakthrough in the way we envision priors and their derivation. I was thus sorry to hear the paper had not been selected as a Read Paper by the Royal Statistical Society, as it would have nicely suited an open discussion, but I hope it will find another outlet that allows for a discussion! As an aside, Håvard discussed the case of a Student’s t degree of freedom as particularly challenging for prior construction, albeit I would have analysed the problem using instead a model choice perspective (on a usually continuous space of models).
As this conference day had a free evening, I took the tram with friends to the town beach and we had a fantastic [if hurried] dinner in a small bodega [away from the uninspiring beach front] called Casa Montaña, a place decorated with huge barrels, offering amazing tapas and wines, a perfect finale to my Spanish trip. Too bad we had to vacate the dinner room for the next batch of customers…
Filed under: Statistics, Travel, University life, Wines Tagged: Bayes factors, O'Bayes 2015, Power-Expected-Posterior Priors, prior assessment, reference priors, Spanish wines, València, variable selection, Virgulilla
This morning was the most special time of the conference in that we celebrated Susie Bayarri‘s contributions and life together with members of her family. Jim gave a great introduction that went over Susie’s numerous papers and the impact they had in Statistics and outside Statistics. As well as her recognised (and unsurprising if you knew her) expertise in wine and food! The three talks of that morning covered some of the domains within Susie’s fundamental contributions and were delivered by her former students: model assessment through various types of predictive p-values by Maria Eugenia Castellanos, Bayesian model selection by Anabel Forte, and computer models by Rui Paulo, all talks that translated quite accurately the extent of Susie’s contributions… In a very nice initiative, the organisers had also set a wine-tasting break (at 10am!) around two vintages that Susie had reviewed in the past years [with reviews to show up soon in the Wines section of the ‘Og!]
The talks of the afternoon session were by Jean-Bernard (JB) Salomond, about a new proposal to handle embedded hypotheses in a non-parametric framework, and by James Scott, about false discovery rates for neuroimaging. Despite the severe theoretical framework behind the proposal, JB managed a superb presentation that mostly focussed on the intuition for using the smoothed (or approximative) version of the null hypothesis. (A flavour of ABC, somehow?!) Also kudos to JB for perpetuating my tradition of starting sections with unrelated pictures. James’ topic was more practical or pragmatic Bayes than objective Bayes, in that he analysed a large fMRI experiment on spatial working memory, introducing a spatial pattern that led to a complex penalised Lasso-like optimisation. The data were actually an fMRI scan of the brain of Russell Poldrack, one of James’ coauthors on that paper.
The (sole) poster session took place in the evening, with a diverse range of exciting topics—including three posters where I was a co-author, with Clara Grazian, Kaniav Kamary, and Kerrie Mengersen—but it was alas too short, or I was too slow, to complete the tour before it ended! In retrospect, we could have broken it into two sessions, since Wednesday evening was a free evening.
Filed under: pictures, Running, Statistics, Travel, University life, Wines Tagged: ABC, computer experiment model, embedded models, false discovery rate, Susie Bayarri, València, Verema
I take the opportunity of this abc picture taken in Apulia, by friends from Milano, to make a late call for the next European “ABC in”! After the Paris, London, and Roma editions, there are still heaps of European cities available to hold a one- or two-day workshop on the latest developments in approximate Bayesian computation. While it seems a wee bit late to hold the workshop in 2015, any suggestion is welcome. For instance, as a satellite to a larger meeting like MCMski V in Lenzerheide next January. Or the French statistical meeting in Montpellier.
So here we are, back together to talk about objective Bayes methods, and in the city of Valencià as well! A move back to the city where the 1998 O’Bayes meeting took place. In contrast with my introductory tutorial, the morning tutorials by Luis Pericchi and Judith Rousseau were fairly technical and advanced, Judith looking at the tools used in the frequentist (Bernstein-von Mises) analysis of priors, with forays into empirical Bayes, giving insights into a wide range of recent papers in the field. And Luis covering works on Bayesian robustness, in the sense of resisting over-influential observations. Following his works and those of Tony O’Hagan and coauthors. Which means characterising the tails of the prior versus the sampling distribution, to allow for the posterior reverting to the prior in case of over-influential datapoints. Funnily enough, after a great opening by Carmen and Ed remembering Susie, Chris Holmes also covered Bayesian robust analysis. More in the sense of incompletely or mis-specified models. (On the side, rekindling one comment by Susie on the need to embed robust Bayesian analysis within decision theory.) Which was also much Chris’ point, in line with the recent Watson and Holmes paper. Dan Simpson, in his usual kick-the-anthill-real-hard-and-set-fire-to-it discussion, pointed out the possible discrepancy between objective and robust Bayesian analysis. (With lines like “modern statistics has proven disruptive to objective Bayes”.) Which is not that obvious, because the robust approach simply reincorporates decision theory within the objective framework. (Dan also concluded with the comic strip below, whose message can be interpreted in many ways…! Or not.)
The second talk of the afternoon was given by Veronika Ročková on a novel type of spike-and-slab prior to handle sparse regression, bringing an alternative to the standard Lasso. The prior is a mixture of two Laplace priors whose scales are constrained in connection with the actual number of non-zero coefficients. I had not heard of this approach before (although Veronika and Ed have an earlier paper on a spike-and-slab prior to handle multicollinearity that Veronika presented in Boston last year) and I was quite impressed by the combination of minimax properties and practical determination of the scales. As well as by the performances of this spike-and-slab Lasso. I am looking forward to the forthcoming paper!
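The mixture-of-two-Laplaces prior is easy to write down; here is a minimal sketch as I understood it from the talk, with illustrative (and certainly not the paper's calibrated) choices of the mixing weight and of the two scale parameters:

```python
import math

# Sketch of a spike-and-slab Lasso prior: a two-component mixture of
# Laplace densities, a "spike" with large rate lam0 (sharply concentrated
# at zero) and a "slab" with small rate lam1 (heavy tails). The weight
# theta and the rates below are illustrative values, not the calibrated
# ones from the Rockova-George construction.

def laplace_pdf(beta, lam):
    """Laplace (double-exponential) density with rate lam."""
    return 0.5 * lam * math.exp(-lam * abs(beta))

def ssl_prior(beta, theta=0.5, lam0=20.0, lam1=0.5):
    """Spike-and-slab Lasso prior density at beta."""
    return theta * laplace_pdf(beta, lam1) + (1.0 - theta) * laplace_pdf(beta, lam0)

# near zero the spike dominates; in the tails the slab takes over,
# which is what shrinks noise coefficients hard while leaving signals alone
print(ssl_prior(0.0), ssl_prior(3.0))
```

The connection with the Lasso comes from the posterior mode: with a single Laplace prior the MAP estimate is exactly the Lasso, and the mixture adapts the amount of shrinkage to each coefficient.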
The day ended most nicely in the botanical gardens of the University of Valencià, with an outdoor reception surrounded by palm trees and parakeet cries…
Filed under: Books, pictures, Running, Statistics, Travel, University life, Wines Tagged: Bayesian lasso, Bernstein-von Mises theorem, objective Bayes, robustness, Susie Bayarri, Valencia conferences, Valencia meeting
A few of us met (somewhat) early this morning to run together in memory of Susie, wearing the bright red tee-shirts given to us by the O’Bayes 2015 conference organisers. And going along the riverbed that circles the old town of Valencià. Till next run, Susie!
Filed under: Running, Statistics, Travel, University life Tagged: Bayesian statisticians, jogging, Spain, Susie Bayarri, València
“By accepting of having obtained a poor approximation to the posterior, except for the location of its main mode, we switch to maximum likelihood estimation.”
Presumably the first paper ever quoting from the ‘Og! Indeed, Umberto Picchini arXived a paper about a technique merging ABC with prior feedback (rechristened data cloning by S. Lele), where a maximum likelihood estimate is produced by an ABC-MCMC algorithm. For state-space models. This relates to an earlier paper by Fabio Rubio and Adam Johansen (Warwick), who also suggested using ABC to approximate the maximum likelihood estimate. Here, the idea is to use an increasing number of replicates of the latent variables, as in our SAME algorithm, to spike the posterior around the maximum of the (observed) likelihood. An ABC version of this posterior returns a mean value as an approximate maximum likelihood estimate.
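The replication (or cloning) mechanism is easiest to see in a conjugate toy example, far from Picchini's ABC-MCMC implementation but showing the same effect: raising the likelihood to the power K makes the posterior concentrate on the MLE as K grows.

```python
# Toy illustration of the data-cloning / prior-feedback idea (conjugate
# Bernoulli example, not the paper's state-space ABC-MCMC): cloning the
# data K times raises the likelihood to the power K, so the posterior
# mean drifts from the Bayes estimate towards the maximum likelihood
# estimate as K increases.

def cloned_posterior_mean(successes, n, K, a=1.0, b=1.0):
    """Posterior mean of a Beta(a,b)-Bernoulli model with data cloned K times."""
    return (a + K * successes) / (a + b + K * n)

s, n = 7, 10            # 7 successes out of 10: the MLE is 0.7
for K in (1, 10, 1000):
    print(K, cloned_posterior_mean(s, n, K))
# the posterior mean moves from 8/12 ≈ 0.667 at K=1 towards the MLE 0.7
```

In the ABC version, the same concentration makes the posterior mean a usable approximation of the MLE, at the cost of the acceptance-rate issues discussed below.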
“This is a so-called “likelihood-free” approach [Sisson and Fan, 2011], meaning that knowledge of the complete expression for the likelihood function is not required.”
The above remark is sort of inappropriate, in that it applies to a non-ABC setting where the latent variables are simulated from their exact marginal distributions, that is, unconditionally on the data, and hence their density cancels in the Metropolis-Hastings ratio. This pre-dates ABC by a few years, since it was an early version of the particle filter.
“In this work we are explicitly avoiding the most typical usage of ABC, where the posterior is conditional on summary statistics of data S(y), rather than y.”
Another point I find rather negative, in that, for state-space models, using the entire time series as a “summary statistic” is unlikely to produce a good approximation anyway.
The discussion on the respective choices of the ABC tolerance δ and on the prior feedback number of copies K is quite interesting, in that Umberto Picchini suggests setting δ first before increasing the number of copies. However, since the posterior gets more and more peaked as K increases, the consequences on the acceptance rate of the related ABC algorithm are unclear. Another interesting feature is that the underlying MCMC proposal on the parameter θ is an independent proposal, tuned during the warm-up stage of the algorithm. Since the tuning is repeated at each temperature, there are some loose ends as to whether or not it is a genuine Markov chain method. The same question arises when considering that additional past replicas need to be simulated when K increases. (Although they can be considered as virtual components of a vector made of an infinite number of replicas, to be used when needed.)
The simulation study involves a regular regression with 101 observations, a stochastic Gompertz model studied by Sophie Donnet, Jean-Louis Foulley, and Adeline Samson in 2010. With 12 points. And a simple Markov model. Again with 12 points. While the ABC-DC solutions are close enough to the true MLEs whenever available, a comparison with the cheaper ABC Bayes estimates would have been of interest as well.
Filed under: Books, Statistics, University life Tagged: ABC, auxiliary particle filter, data cloning, likelihood-free, Metropolis-Hastings algorithm, prior feedback, SAME algorithm, summary statistics, University of Warwick
Here are the slides I made for a short tutorial I will deliver this afternoon for the opening day of the International Workshop on Objective Bayes Methodology, O-Bayes15, held in the city of Valencià, so intricately linked with Bayesians and Bayesianism. The more so as we are celebrating this time the career and life of our dear friend Susie. Celebrating with talks and stories, morning runs and afternoon drinks, laughs, tears, and more laughs, even though they cannot equate Susie’s unique and vibrant communicative laugh. I will remember how, at O’Bayes 13, Susie was the one who delivered this tutorial. And how, despite physical frailty and fatigue, she did so with her usual energy and mental strength. And obviously again with laughs. I will also remember that the last time I visited Valencià, it was for Anabel Forte’s thesis defence, upon invitation from Susie, and that we had a terrific time, from discussing objective Bayes ideas to eating and drinking local goodies, to walking around the grandiose monuments just built (which presumably contributed to ruining the City of Valencià for quite a while!)
Filed under: Books, Kids, pictures, Running Tagged: Bayesian Analysis, Bayesian tests of hypotheses, non-informative priors, Spain, subjective versus objective Bayes, Susie Bayarri, València