“The Statistics and Computing journal gratefully acknowledges the contributions for this special issue, celebrating 25 years of publication. In the past 25 years, the journal has published innovative, distinguished research by leading scholars and professionals. Papers have been read by thousands of researchers world-wide, demonstrating the global importance of this field. The Statistics and Computing journal looks forward to many more years of exciting research as the field continues to expand.” Mark Girolami, Editor in Chief for The Statistics and Computing journal
Our joint [Peter Green, Krzysztof Łatuszyński, Marcelo Pereyra, and myself] review [open access!] on the important features of Bayesian computation has already appeared in the special 25th anniversary issue of Statistics & Computing! Along with the following papers
- Statistics and computing: the genesis of data science, David J. Hand, Founding Editor
- EM for mixtures: Initialization requires special care, Jean-Patrick Baudry, Gilles Celeux
- Sequential Monte Carlo methods for Bayesian elliptic inverse problems, Alexandros Beskos, Ajay Jasra, Ege A. Muzaffer, Andrew M. Stuart
- Bayesian inference via projections, Ricardo Silva, Alfredo Kalaitzis
- Computing functions of random variables via reproducing kernel Hilbert space representations, Bernhard Schölkopf, Krikamol Muandet, Kenji Fukumizu, Stefan Harmeling, Jonas Peters
- The Poisson transform for unnormalised statistical models, Simon Barthelmé, Nicolas Chopin
- Scalable estimation strategies based on stochastic approximations: classical results and new insights, Panos Toulis, Edoardo M. Airoldi
- de Finetti Priors using Markov chain Monte Carlo computations, Sergio Bacallado, Persi Diaconis, Susan Holmes
- Simulation-efficient shortest probability intervals, Ying Liu, Andrew Gelman, Tian Zheng
- Flexible parametric bootstrap for testing homogeneity against clustering and assessing the number of clusters, Christian Hennig, Chien-Ju Lin
which means very good company, indeed! And happy B’day to Statistics & Computing!
Filed under: Books, Statistics, University life Tagged: 25th anniversary, Bayesian computation, computational statistics, David Hand, Gilles Celeux, Mark Girolami, Monte Carlo Statistical Methods, open access, Statistics & Computing
- 1 – 6 February, 2016 Learning
- 8 – 12 February, 2016 Mathematical statistics
- 15 – 19 February, 2016 Processes
- 22 – 26 February, 2016 Extremes, Copulas and Actuarial Science
- 29 February – 4 March, 2016 Bayesian statistics and algorithms
Each week will see minicourses of a few hours (2-3) and advanced talks, leaving time for interactions and collaborations. (I will give one of those minicourses on Bayesian foundations.) The scientific organisers of the B’ week are Gilles Celeux and Nicolas Chopin.
The CIRM is a wonderful meeting place, in the mountains between Marseilles and Cassis, with many trails to walk and run, and hundreds of fantastic climbing routes in the Calanques at all levels. (In February, the sea is too cold to contemplate swimming. The good side is that it is not too warm to climb and the risk of bush fire is very low!) We stayed there with Jean-Michel Marin a few years ago when preparing Bayesian Essentials. The maths and stats library is well-provided, with permanent access for quiet working sessions. This is the French version of the equally fantastic German Mathematik Forschungsinstitut Oberwolfach. There will be financial support available from the supporting societies and research bodies, at least for young participants and the costs if any are low, for excellent food and excellent lodging. Definitely not a scam conference!
Filed under: Books, Kids, Mountains, pictures, Running, Statistics, Travel, University life, Wines Tagged: Bayesian Essentials with R, Bayesian statistics, bouillabaisse, calanques, Cassis, CIRM, CNRS, copulas, extremes, France, machine learning, Marseille, minicourse, SMF, stochastic processes
Earlier today, I received an invitation to give a plenary talk at a Probability and Statistics Conference in Marrakech, a nice location if any! As it came from a former graduate student from the University of Rouen (where I taught before Paris-Dauphine), and despite an already heavy travelling schedule for 2016!, I considered his offer. And looked for the conference webpage to find the dates as my correspondent had forgotten to include those. Instead of the genuine conference webpage, which had not yet been created, what I found was a fairly unpleasant scheme playing on the same conference name and location, but run by a predatory conglomerate called WASET. WASET stands for World Academy of Science, Engineering, and Technology. Their website lists thousands of conferences, all in nice, touristy, places, and all with an identical webpage. For instance, there is the ICMS 2015: 17th International Conference on Mathematics and Statistics next week. With a huge “conference committee” but not a single name I can identify. And no-one from France. Actually, the website kindly offers entry by city as well as topics, which helps in spotting that a large number of ICMS conferences all take place on the same dates and at the same hotel in Paris… The trick is indeed to attract speakers with the promise of publication in a special issue of a bogus journal and to have them pay 600€ for registration and publication fees, only to have all topics mixed together in a few conference rooms, according to many testimonies I later found on the web. And as is clear from the posted conference program! In the “best” of cases, since other testimonies mention lost fees and rejected registrations. Testimonies also mention this tendency to reproduce the acronym of a local conference.
While conferences amounting to academic tourism are not unheard of, even from the most established scientific societies!, I am quite amazed at the scale of this enterprise, even though I cannot completely understand how people can fall for it. Looking at the website, the fees, the unrelated scientific committee, and the lack of scientific program should be enough to put those victims off. Unless they truly want to partake in academic tourism, obviously.
Filed under: Kids, Mountains, pictures, Travel, University life Tagged: academic tourism, conferences, France, ICMS 2015, Marrakech, MICPS 2016, Morocco, Paris, registration fees, Rouen, scam conference, WASET, Western Union
In the past few days, there have been so many arXiv postings of interest (presumably the NIPS submission effect!) that I cannot hope to cover them in the coming weeks! Hopefully, some will still come out on the ‘Og in the near future:
- arXiv:1506.06629: Scalable Approximations of Marginal Posteriors in Variable Selection by Willem van den Boom, Galen Reeves, David B. Dunson
- arXiv:1506.06285: The MCMC split sampler: A block Gibbs sampling scheme for latent Gaussian models by Óli Páll Geirsson, Birgir Hrafnkelsson, Daniel Simpson, Helgi Sigurðarson [also deserves a special mention for gathering only ***son authors!]
- arXiv:1506.06268: Bayesian Nonparametric Modeling of Higher Order Markov Chains by Abhra Sarkar, David B. Dunson
- arXiv:1506.06117: Convergence of Sequential Quasi-Monte Carlo Smoothing Algorithms by Mathieu Gerber, Nicolas Chopin
- arXiv:1506.06101: Robust Bayesian inference via coarsening by Jeffrey W. Miller, David B. Dunson
- arXiv:1506.05934: Expectation Particle Belief Propagation by Thibaut Lienart, Yee Whye Teh, Arnaud Doucet
- arXiv:1506.05860: Variational Gaussian Copula Inference by Shaobo Han, Xuejun Liao, David B. Dunson, Lawrence Carin
- arXiv:1506.05855: The Frequentist Information Criterion (FIC): The unification of information-based and frequentist inference by Colin H. LaMont, Paul A. Wiggins
- arXiv:1506.05757: Bayesian Inference for the Multivariate Extended-Skew Normal Distribution by Mathieu Gerber, Florian Pelgrin
- arXiv:1506.05741: Accelerated dimension-independent adaptive Metropolis by Yuxin Chen, David Keyes, Kody J.H. Law, Hatem Ltaief
- arXiv:1506.05269: Bayesian Survival Model based on Moment Characterization by Julyan Arbel, Antonio Lijoi, Bernardo Nipoti
- arXiv:1506.04778: Fast sampling with Gaussian scale-mixture priors in high-dimensional regression by Anirban Bhattacharya, Antik Chakraborty, Bani K. Mallick
- arXiv:1506.04416: Bayesian Dark Knowledge by Anoop Korattikara, Vivek Rathod, Kevin Murphy, Max Welling [a special mention for this title!]
- arXiv:1506.03693: Optimization Monte Carlo: Efficient and Embarrassingly Parallel Likelihood-Free Inference by Edward Meeds, Max Welling
- arXiv:1506.03074: Variational consensus Monte Carlo by Maxim Rabinovich, Elaine Angelino, Michael I. Jordan
- arXiv:1506.02564: Gradient-free Hamiltonian Monte Carlo with Efficient Kernel Exponential Families by Heiko Strathmann, Dino Sejdinovic, Samuel Livingstone, Zoltan Szabo, Arthur Gretton [comments coming soon!]
Filed under: R, Statistics, University life Tagged: arXiv, Bayesian statistics, MCMC, Monte Carlo Statistical Methods, Montréal, NIPS 2015, particle filter
“The results in this paper suggest that ABC can scale to large data, at least for models with a fixed number of parameters, under the assumption that the summary statistics obey a central limit theorem.”
In a week rich with arXiv submissions about MCMC and “big data”, like the Variational consensus Monte Carlo of Rabinovich et al., or scalable Bayesian inference via particle mirror descent by Dai et al., Wentao Li and Paul Fearnhead contributed an impressive paper entitled Behaviour of ABC for big data. However, a word of warning: the title is somewhat misleading in that the paper does not address the issue of big or tall data per se, e.g., the impossibility to handle the whole data at once and to reproduce it by simulation, but rather the asymptotics of ABC. The setting is not dissimilar to the earlier Fearnhead and Prangle (2012) Read Paper. The central theme of this theoretical paper [with 24 pages of proofs!] is to study the connection between the number N of Monte Carlo simulations and the tolerance value ε when the number of observations n goes to infinity. A main result in the paper is that the ABC posterior mean can have the same asymptotic distribution as the MLE when ε=o(n^{-1/4}). This is however of no direct use in practice as the second main result is that the Monte Carlo variance is well-controlled only when ε=O(n^{-1/2}). There is therefore a sort of contradiction in the conclusion, between the positive equivalence with the MLE and
Something I have (slight) trouble with is the construction of an importance sampling function of the form f_ABC(s|θ)^α when, obviously, this function cannot be used for simulation purposes. The authors point out this fact, but still build an argument about the optimal choice of α, namely away from 0 and 1, like ½. Actually, any value different from 0 and 1 is sensible, meaning that the range of acceptable importance functions is wide. Most interestingly (!), the paper constructs an iterative importance sampling ABC in a spirit similar to Beaumont et al. (2009) ABC-PMC. Even more interestingly, the ½ factor amounts to updating the scale of the proposal as twice the scale of the target, just as in PMC.
Another aspect of the analysis I do not catch is the reason for keeping the Monte Carlo sample size to a fixed value N, while setting a sequence of acceptance probabilities (or of tolerances) along iterations. This is a very surprising result in that the Monte Carlo error does remain under control and does not dominate the overall error!
“Whilst our theoretical results suggest that point estimates based on the ABC posterior have good properties, they do not suggest that the ABC posterior is a good approximation to the true posterior, nor that the ABC posterior will accurately quantify the uncertainty in estimates.”
Overall, this is clearly a paper worth reading for understanding the convergence issues related to ABC. With more theoretical support than the earlier Fearnhead and Prangle (2012). However, it does not provide guidance on the construction of a sequence of Monte Carlo samples nor does it discuss the selection of the summary statistic, which obviously has a major impact on the efficiency of the estimation. And to relate to the earlier warning, it does not cope with “big data” in that it reproduces the original simulation of the n-sized sample.
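To make the roles of n, N and ε concrete, here is a minimal sketch of plain ABC rejection, under assumptions that are mine and not the paper's: a N(θ,1) sample, a flat prior on (-3,3), the sample mean as summary statistic, and two tolerances of the orders n^{-1/4} and n^{-1/2}. The function name `abc_rejection` is hypothetical. It illustrates the mechanics only and does not reproduce any result from the paper.

```python
import random


def abc_rejection(s_obs, n, n_sims, eps, rng):
    """Plain ABC rejection for the mean theta of a N(theta, 1) sample of size n."""
    accepted = []
    for _ in range(n_sims):
        theta = rng.uniform(-3.0, 3.0)  # assumed flat prior on (-3, 3)
        # Re-simulate a full pseudo-sample of size n: this is the cost that
        # grows with the data size.
        pseudo = [rng.gauss(theta, 1.0) for _ in range(n)]
        s_sim = sum(pseudo) / n  # summary statistic: the sample mean
        if abs(s_sim - s_obs) <= eps:  # keep theta when summaries are eps-close
            accepted.append(theta)
    return accepted


rng = random.Random(1)
n = 50          # number of observations
n_sims = 5000   # number N of Monte Carlo simulations
data = [rng.gauss(1.0, 1.0) for _ in range(n)]
s_obs = sum(data) / n

for label, eps in [("n^(-1/4)", n ** -0.25), ("n^(-1/2)", n ** -0.5)]:
    draws = abc_rejection(s_obs, n, n_sims, eps, rng)
    abc_mean = sum(draws) / len(draws)
    print(f"eps = {label}: {len(draws)} accepted out of {n_sims}, "
          f"ABC posterior mean {abc_mean:.3f} (sample mean {s_obs:.3f})")
```

The smaller tolerance keeps fewer of the N simulations, which is where the tension between a small ε and a controlled Monte Carlo variance shows up in practice.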
Filed under: Books, Statistics, University life Tagged: ABC, ABC-PMC, asymptotics, big data, iterated importance sampling, MCMC, particle system, simulation
Rémi Bardenet, Arnaud Doucet, and Chris Holmes arXived a long paper (On Markov chain Monte Carlo methods for tall data) a month ago, a paper that I did not have time to read in detail till today. The paper is quite comprehensive in its analysis of the current literature on MCMC for huge, tall, or big data. Even including our delayed acceptance paper! Now, it is indeed the case that we are all still struggling with this size difficulty. Making proposals in a wide range of directions, hopefully improving the efficiency of dealing with tall data. However, we are not there yet in that the outcome is either about as costly as the original MCMC implementation or its degree of approximation is unknown, even when bounds are available.
Most of the paper’s proposal is based on aiming at an unbiased estimator of the likelihood function in a pseudo-marginal manner à la Andrieu and Roberts (2009) and on a random subsampling scheme that presumes (a) iid-ness and (b) a lower bound on each term in the likelihood. It seems to me slightly unrealistic to assume that a much cheaper and tight lower bound on those terms could be available. Firmly set in the iid framework, the problem itself is unclear: do we need 10⁸ observations of a logistic model with a few parameters? The real challenge is rather in non-iid hierarchical models with random effects and complex dependence structures. For which subsampling gets much more delicate. None of the methods surveyed in the paper broaches such situations where the entire data cannot be explored at once.
An interesting experiment therein, based on the Glynn and Rhee (2014) unbiased representation, shows that the approach does not work well. This could lead the community to reconsider the focus on unbiasedness by coming full circle to the opposition between bias and variance. And between intractable likelihood and representative subsample likelihood.
Reading the (superb) coverage of earlier proposals made me backtrack on the perceived appeal of the decomposition of Neiswanger et al. (2014) as I came to realise that the product of functions renormalised into densities has no immediate probabilistic connection with its components. As an extreme example, terms may fail to integrate. (Of course, there are many Monte Carlo features that exploit such a decomposition, from the pseudo-marginal to accept-reject algorithms. And more to come.) Taking samples from terms in the product is thus not directly related to taking samples from each term, in opposition with the arithmetic mixture representation. I was first convinced by using a fraction of the prior in each term but now find it unappealing because there is no reason the prior should change for a smaller sample and no equivalent to the prohibition of using the data several times. At this stage, I would be much more in favour of raising a random portion of the likelihood function to the right power. An approach that I suggested to a graduate student earlier this year and which is also discussed in the paper. And considered too naïve and a “very poor approach” (Section 6, p.18), even though there must be versions that do not run afoul of the non-Gaussian nature of the log likelihood ratio. I am certainly going to peruse this Section 6 of the paper more thoroughly.
Another interesting suggestion in this definitely rich paper is the foray into an alternative bypassing the uniform sampling in the Metropolis-Hastings step, using instead the subsampled likelihood ratio. The authors call this “exchanging acceptance noise for subsampling noise” (p.22). However, there is no indication about the resulting stationary distribution and I find the notion of only moving to higher likelihoods (or estimates of) counter to the spirit of Metropolis-Hastings algorithms. (I have also eventually realised the meaning of the log-normal “difficult” benchmark that I missed earlier: it means log-normal data is modelled by a normal density.) And yet another innovation along the lines of a control variate for the log likelihood ratio, no matter how surrealistic it sounds.
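As a rough illustration of what “raising a random portion of the likelihood function to the right power” can mean, the sketch below takes a subsample of size m out of n iid observations and raises its likelihood to the power n/m, i.e., multiplies the subsample log-likelihood by n/m. The N(θ,1) model, the sizes and the function names are assumptions chosen for the example, not the construction studied in the paper.

```python
import math
import random
import statistics


def log_lik(theta, xs):
    """Log-likelihood of iid N(theta, 1) observations."""
    return sum(-0.5 * math.log(2.0 * math.pi) - 0.5 * (x - theta) ** 2 for x in xs)


def powered_subsample_log_lik(theta, xs, m, rng):
    """Log of a random size-m subsample likelihood raised to the power n/m."""
    sub = rng.sample(xs, m)  # uniform subsample without replacement
    return (len(xs) / m) * log_lik(theta, sub)


rng = random.Random(2)
n, m = 2000, 100
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]

full = log_lik(0.0, xs)
estimates = [powered_subsample_log_lik(0.0, xs, m, rng) for _ in range(500)]

print(f"full log-likelihood:        {full:.1f}")
print(f"mean of powered subsamples: {statistics.mean(estimates):.1f}")
print(f"sd of powered subsamples:   {statistics.pstdev(estimates):.1f}")
```

The rescaled version is centred on the full log-likelihood, but its spread on the log scale is large, which gives a feel for why a naïve use inside a Metropolis-Hastings ratio can behave poorly.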
Filed under: Books, Statistics, University life Tagged: big data, divide-and-conquer strategy, Metropolis-Hastings algorithm, parallel MCMC, subsampling, tall data
Another year attending La Rochambelle, the massive women-only race or walk against breast cancer in Caen, Normandy! With the fantastic vision of 20,000 runners in the same pink tee-shirt swarming down-town Caen and the arrival stadium. Which made it quite hard to spot my three relatives in the race! I also ran my fourth iteration of the 10k the next day, from the British War Cemetery of Cambes-en-Plaine to the Memorial for Peace in Caen. The conditions were not as optimal as last year, especially in terms of wind, and I lost one minute on my total time, as well as one position, the third V2 remaining tantalisingly a dozen meters in front of me till the end of the race. A mix of too light training, travel fatigue and psychological conviction I was going to end up fourth! Here are my split times, with a very fast start that showed up in the second half near 4mn/km, when the third V2 passed me.
Filed under: Kids, pictures, Running, Travel Tagged: 10k, British War Cemetery, Caen, D Day, D-Day beaches, La Rochambelle, Les Courants de la Liberté, Memorial for Peace, Normandy, veteran (V2)
[Here is a wine criticism written by Susie Bayarri in 2013 about a 2008 bottle of Altos de Losada, a wine from Leon:]
The cork is fantastic. Very good presentation and labelling of the bottle. The wine color is like dark cherry, I would almost say of the color of blood. Very bright although unfiltered. The cover is definitely high. The tear is very nice (at least in my glass), slow, wide, through parallel streams… but it does not dye my glass at all.
The bouquet is its best feature… it is simply voluptuous… with ripe plums as well as vanilla, some mineral tone plus a smoky hint. I cannot quite detect which wood is used… I have always loved the bouquet of this wine…
In mouth, it remains a bit closed. Next time, I will make sure I decant it (or I will use that Venturi device) but it is nonetheless excellent… the wine is truly fruity, but complex as well (nothing like grape juice). The tannins are definitely present, but tamed and assimilated (I think they will continue to mellow) and it has just a hint of acidity… Despite its alcohol content, it remains light, neither overly sweet nor heavy. The after-taste offers a pleasant bitterness… It is just delicious, an awesome wine!
Filed under: pictures, Travel, University life, Wines Tagged: Altos de Losada, Leon, Spanish wines, Susie Bayarri, València, wine tasting
When putting this volume together with Umesh Singh, Dipak Dey, and Appaia Loganathan, my friend Satyanshu Upadhyay from Varanasi, India, asked me for a foreword. The book is now out, with chapters written by a wide variety of Bayesians. And here is my foreword, for what it’s worth:
It is a great pleasure to see a new book published on current aspects of Bayesian Analysis and coming out of India. This wide scope volume reflects very accurately on the present role of Bayesian Analysis in scientific inference, be it by statisticians, computer scientists or data analysts. Indeed, we have witnessed in the past decade a massive adoption of Bayesian techniques by users in need of statistical analyses, partly because it became easier to implement such techniques, partly because both the inclusion of prior beliefs and the production of a posterior distribution that provides a single filter for all inferential questions is a natural and intuitive way to process the latter. As reflected so nicely by the subtitle of Sharon McGrayne’s The Theory that Would not Die, the Bayesian approach to inference “cracked the Enigma code, hunted down Russian submarines” and more generally contributed to solve many real life or cognitive problems that did not seem to fit within the traditional patterns of a statistical model.
Two hundred and fifty years after Bayes published his note, the field is more diverse than ever, as reflected by the range of topics covered by this new book, from the foundations (with objective Bayes developments) to the implementation by filters and simulation devices, to the new Bayesian methodology (regression and small areas, non-ignorable response and factor analysis), to a fantastic array of applications. This display reflects very very well on the vitality and appeal of Bayesian Analysis. Furthermore, I note with great pleasure that the new book is edited by distinguished Indian Bayesians, India having always been a provider of fine and dedicated Bayesians. I thus warmly congratulate the editors for putting this exciting volume together and I offer my best wishes to readers about to appreciate the appeal and diversity of Bayesian Analysis.
Filed under: Books, Statistics, Travel, University life Tagged: Bayesian Analysis, Bayesian statistics, book foreword, India, ISBA conference, Varanasi
Our paper with Diego Salmerón and Juan Cano using integral priors for binomial regression and objective Bayesian hypothesis testing (one of my topics of interest, see yesterday’s talk!) eventually appeared in Statistica Sinica. This is Volume 25, Number 3, of July 2015 and the table of contents shows an impressively diverse range of topics.
Filed under: Books, Statistics, University life Tagged: academic journals, binomial regression, integral priors, Objective Bayesian hypothesis testing, Statistica Sinica
The workshop at the BIPM on measurement uncertainty was certainly most exciting, first by its location in the Parc de Saint Cloud in classical buildings overlooking the Seine river in a most bucolic manner…and second by its mostly Bayesian flavour. The recommendations that the workshop addressed are about revisions in the current GUM, which stands for the Guide to the Expression of Uncertainty in Measurement. The discussion centred on using a more Bayesian approach than in the earlier version, with the organisers of the workshop and leaders of the revision apparently most in favour of that move. “Knowledge-based pdfs” came into the discussion as an attractive notion since it rings a Bayesian bell, especially when associated with probability as a degree of belief and incorporating the notion of an a priori probability distribution. And propagation of errors. Or even more when mentioning the removal of frequentist validations. What I gathered from the talks is the perspective drifting away from central limit approximations to more realistic representations, calling for Monte Carlo computations. There is also a lot I did not get about conventions, codes and standards. Including a short debate about the different meanings of Monte Carlo, from simulation technique to calculation method (as for confidence intervals). And another discussion about replacing the old formula for estimating sd from the Normal to the Student’s t case. A change that remains highly debatable since the Student’s t assumption is as shaky as the Normal one. What became clear [to me] during the meeting is that a rather heated debate is currently taking place about the need for a revision, with some members of the six (?) organisations involved arguing against Bayesian or linearisation tools.
This became even clearer during our frequentist versus Bayesian session with a first talk so outrageously anti-Bayesian it was hilarious! Among other things, the notion that “fixing” the data was against the principles of physics (the speaker was a physicist), that the only randomness in a Bayesian coin tossing was coming from the prior, that the likelihood function was a subjective construct, that the definition of the posterior density was a generalisation of Bayes’ theorem [generalisation found in… Bayes’ 1763 paper then!], that objective Bayes methods were inconsistent [because Jeffreys’ prior produces an inadmissible estimator of μ²!], that the move to Bayesian principles in GUM would cost the New Zealand economy 5 billion dollars [hopefully a frequentist estimate!], &tc., &tc. The second pro-frequentist speaker was by comparison much much more reasonable, although he insisted on showing Bayesian credible intervals do not achieve a nominal frequentist coverage, using a sort of fiducial argument distinguishing x=X+ε from X=x+ε that I missed… A lack of achievement that is fine by my standards. Indeed, a frequentist confidence interval provides a coverage guarantee either for a fixed parameter (in which case the Bayesian approach achieves better coverage by constant updating) or a varying parameter (in which case the frequency of proper inclusion is of no real interest!). The first Bayesian speaker was Tony O’Hagan, who summarily tore the first talk to shreds. And also criticised GUM2 for using reference priors and maxent priors. I am afraid my talk was a bit too exploratory for the audience (since I got absolutely no question!). In retrospect, I should have given an intro to reference priors.
An interesting specificity of a workshop on metrology and measurement is that they are hard sticklers for the schedule, starting and finishing right on time. When a talk finished early, we waited until the intended time of the next talk. Not even allowing for extra discussion. When the only overtime and Belgian speaker ran close to 10 minutes late, I was afraid he would (deservedly) get lynched! He escaped unscathed, but may (and should) not get invited again..!
Filed under: pictures, Statistics, Travel Tagged: admissibility, Bayesian inference, Bureau international des poids et mesures, confidence intervals, conventions, France, frequentist inference, MaxEnt, norms, Paris, Pavillon de Breteuil, Sèvres, subjective versus objective Bayes, workshop
[Here is the pre-Bayesian quote from Hume that students had to analyse this year for the Baccalauréat:]
“The maxim, by which we commonly conduct ourselves in our reasonings, is, that the objects, of which we have no experience, resembles those, of which we have; that what we have found to be most usual is always most probable; and that where there is an opposition of arguments, we ought to give the preference to such as are founded on the greatest number of past observations. But though, in proceeding by this rule, we readily reject any fact which is unusual and incredible in an ordinary degree; yet in advancing farther, the mind observes not always the same rule; but when anything is affirmed utterly absurd and miraculous, it rather the more readily admits of such a fact, upon account of that very circumstance, which ought to destroy all its authority. The passion of surprise and wonder, arising from miracles, being an agreeable emotion, gives a sensible tendency towards the belief of those events, from which it is derived.” David Hume, An Enquiry Concerning Human Understanding,
Filed under: Books, Kids Tagged: Air France, An Enquiry Concerning Human Understanding, Baccalauréat, Bayesian foundations, David Hume, exam, finals, high school, miracles, philosophy, Scotland
A funny coincidence: as I was sitting next to Arnoldo Frigessi at the NBBC15 conference, I came upon a new question on Cross Validated about a dynamic mixture model he had developed in 2002 with Olga Haug and Håvård Rue [whom I also saw last week in València]. The dynamic mixture model they proposed replaces the standard weights in the mixture with cumulative distribution functions, hence the term dynamic. Here is the version used in their paper (x>0): h(x) = [(1−w(x)) f(x) + w(x) g(x)] / Z, with Z the normalising constant,
where f is a Weibull density, g a generalised Pareto density, and w is the cdf of a Cauchy distribution [all distributions being endowed with standard parameters]. While the above object is not a mixture of a generalised Pareto and a Weibull distribution (instead, it is a mixture of two non-standard distributions with unknown weights), it is close to the Weibull when x is near zero and ends up with the Pareto tail (when x is large). The question was about simulating from this distribution and, while an answer was in the paper, I replied on Cross Validated with an alternative accept-reject proposal and with a somewhat (if mildly) non-standard MCMC implementation enjoying a much higher acceptance rate and the same fit.
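One possible accept-reject scheme for such a density can be sketched as follows, assuming the unnormalised target (1−w(x)) f(x) + w(x) g(x). Since this target is bounded by f(x) + g(x), one can propose from the even mixture of f and g and accept with probability target/(f+g). The parameter values and function names below are made up for illustration and this is not claimed to be the exact code posted on Cross Validated.

```python
import math
import random


def weibull_pdf(x, scale, shape):
    """Weibull density with the parameterisation of random.weibullvariate."""
    return (shape / scale) * (x / scale) ** (shape - 1.0) * math.exp(-((x / scale) ** shape))


def gpd_pdf(x, xi, sigma):
    """Generalised Pareto density with location 0, shape xi > 0 and scale sigma."""
    return (1.0 / sigma) * (1.0 + xi * x / sigma) ** (-1.0 / xi - 1.0)


def cauchy_cdf(x, mu, tau):
    """Cdf of a Cauchy distribution with location mu and scale tau."""
    return 0.5 + math.atan((x - mu) / tau) / math.pi


def sample_dynamic_mixture(n, rng, scale=1.0, shape=2.0, xi=0.5, sigma=1.0, mu=1.0, tau=0.5):
    """Accept-reject draws from h(x) proportional to (1 - w) f + w g; returns (draws, acceptance rate)."""
    draws, tries = [], 0
    while len(draws) < n:
        tries += 1
        # Proposal: even mixture of the Weibull and generalised Pareto components.
        if rng.random() < 0.5:
            x = rng.weibullvariate(scale, shape)
        else:
            x = sigma / xi * ((1.0 - rng.random()) ** (-xi) - 1.0)  # GPD inverse cdf
        f = weibull_pdf(x, scale, shape)
        g = gpd_pdf(x, xi, sigma)
        w = cauchy_cdf(x, mu, tau)
        # Accept with probability target / (f + g), which is at most one.
        if rng.random() * (f + g) <= (1.0 - w) * f + w * g:
            draws.append(x)
    return draws, n / tries


rng = random.Random(3)
sample, rate = sample_dynamic_mixture(10000, rng)
print(f"acceptance rate: {rate:.2f}, largest draw: {max(sample):.1f}")
```

The bound f + g requires no knowledge of the normalising constant Z, and the observed acceptance rate estimates Z/2 as a by-product.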
Filed under: R, Statistics Tagged: Arnoldo Frigessi, component of a mixture, cross validated, dynamic mixture, extremes, Havard Rue, NBBC15 conference, O-Bayes 2015, Pareto distribution, R, Reykjavik, Valencia conferences, Weibull distribution
Tonight, I am invited to give a speed-presenting talk at the last Paris Machine Learning meeting of Season 2, with the themes of DL, Recovering Robots, Vowpal Wabbit, Predcsis, Matlab, and Bayesian test [by yours truly!]. The meeting will take place in Jussieu, Amphi 25. Here are my slides for the meeting:
As it happened, the meeting was quite crowded with talks and plagued with technical difficulties in transmitting talks from Berlin and Toronto, so I came to talk about three hours after the beginning, which was less than optimal for the most technical presentation of the evening. I actually wonder if I even managed to convey the main idea of replacing Bayes factors with posteriors of the mixture weight! [I had plenty of time to reflect upon this on my way back home as I had to wait for several rare and crowded RER trains until one had enough room for me and my bike!]
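For readers who missed the talk, the idea of looking at the posterior of a mixture weight can be sketched in a toy case, under strong simplifying assumptions of mine: both competing models are fully specified (N(0,1) against N(2,1)), the data are embedded in the mixture α N(0,1) + (1−α) N(2,1), α gets a Beta(½,½) prior, and a two-step Gibbs sampler alternates between allocations and weight. The function names are hypothetical and the slides may proceed differently.

```python
import math
import random


def normal_pdf(x, mean):
    """Density of N(mean, 1) at x."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)


def mixture_weight_posterior(data, mean1, mean2, n_iter, rng, a0=0.5):
    """Gibbs sampler for the weight alpha of the first component, Beta(a0, a0) prior."""
    alpha = 0.5
    trace = []
    for _ in range(n_iter):
        # Step 1: allocate each observation to a component given alpha.
        n1 = 0
        for x in data:
            p1 = alpha * normal_pdf(x, mean1)
            p2 = (1.0 - alpha) * normal_pdf(x, mean2)
            if rng.random() * (p1 + p2) < p1:
                n1 += 1
        # Step 2: conjugate Beta update of the weight given the allocations.
        alpha = rng.betavariate(a0 + n1, a0 + len(data) - n1)
        trace.append(alpha)
    return trace


rng = random.Random(4)
data = [rng.gauss(0.0, 1.0) for _ in range(200)]  # simulated from the first model
trace = mixture_weight_posterior(data, 0.0, 2.0, 1000, rng)
kept = trace[200:]  # discard a burn-in period
print(f"posterior mean of alpha: {sum(kept) / len(kept):.3f}")
```

When the data come from the first model, the posterior on α piles up near one, and it is this posterior, rather than a Bayes factor, that is read as the evidence in favour of that model.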
Filed under: Books, Kids, pictures, Statistics, University life Tagged: Berlin, data, Jussieu, machine learning, Matlab, Paris Machine Learning Applications group, RER B, robots, Toronto, Université Pierre et Marie Curie, Vowpal
[verbatim from the call for papers:]
Statistics and Computing is preparing a special issue on Bayesian Nonparametrics, for publication by early 2016. We invite researchers to submit manuscripts for publication in the special issue. We expect that the focus theme will increase the visibility and impact of papers in the volume.
By making use of infinite-dimensional mathematical structures, Bayesian nonparametric statistics allows the complexity of a learned model to grow as the size of a data set grows. This flexibility can be particularly suited to modern data sets but can also present a number of computational and modelling challenges. In this special issue, we will showcase novel applications of Bayesian nonparametric models, new computational tools and algorithms for learning these models, and new models for the diverse structures and relations that may be present in data.
To submit to the special issue, please use the Statistics and Computing online submission system. To indicate consideration for the special issue, choose “Special Issue: Bayesian Nonparametrics” as the article type. Papers must be prepared in accordance with the Statistics and Computing journal guidelines.
Papers will go through the usual peer review process. The special issue website will be updated with any relevant deadlines and information.
Deadline for manuscript submission: August 20, 2015
Guest editors:
Tamara Broderick (MIT)
Katherine Heller (Duke)
Peter Mueller (UT Austin)
Filed under: Books, Statistics, University life Tagged: algorithms, Bayesian nonparametric, call for papers, machine learning, modelling, nonparametric statistics, special issue, Statistics and Computing
“Applying various approximation strategies to the relative predictive performance derived from predictive distributions in frequentist and Bayesian inference yields many of the model comparison techniques ubiquitous in practice, from predictive log loss cross validation to the Bayesian evidence and Bayesian information criteria.”
Michael Betancourt (Warwick) just arXived a paper formalising predictive model comparison in an almost Bourbakian sense! Meaning that he adopts therein a very general representation of the issue, with minimal assumptions on the data generating process (excluding a specific metric and obviously the choice of a testing statistic). He opts for an M-open perspective, meaning that this generating process stands outside the hypothetical statistical model or, in Lindley’s terms, a small world. Within this paradigm, the only way to assess the fit of a model seems to be through the predictive performance of that model, using for instance an f-divergence like the Kullback-Leibler divergence, with the true data generating process as the reference. I think this however puts a restriction on the choice of small worlds, as the probability measure on that small world has to be absolutely continuous with respect to the true data generating process for the distance to be finite. While there are arguments in favour of absolutely continuous small worlds, this assumes a knowledge about the true process that we simply cannot gather. Ignoring this difficulty, a relative Kullback-Leibler divergence can be defined in terms of an almost arbitrary reference measure. But as it still relies on the true measure, its evaluation proceeds via cross-validation “tricks” like jackknife and bootstrap. However, on the Bayesian side, using the prior predictive links the Kullback-Leibler divergence with the marginal likelihood. And Michael argues further that the posterior predictive can be seen as the unifying tool behind information criteria like DIC and WAIC (widely applicable information criterion). Which does not convince me of the utility of those criteria as model selection tools, as there is too much freedom in the way approximations are used and a potential for using the data several times.
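To make the absolute continuity issue concrete, here is a minimal sketch on a toy discrete example. It is not taken from the paper: the three-point "true process", the two candidate "small worlds", and the helper `kl_divergence` are all made up for illustration. It simply shows that the Kullback-Leibler divergence becomes infinite as soon as the measure in the second slot gives zero mass to an outcome that the measure in the first slot can produce, so that finiteness depends on the direction in which the divergence is taken.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as dicts outcome -> probability.

    Returns math.inf when p puts mass on an outcome to which q gives zero mass,
    i.e. when p is not absolutely continuous with respect to q.
    """
    total = 0.0
    for outcome, p_x in p.items():
        if p_x == 0.0:
            continue  # convention: 0 * log(0 / q) = 0
        q_x = q.get(outcome, 0.0)
        if q_x == 0.0:
            return math.inf
        total += p_x * math.log(p_x / q_x)
    return total


# Hypothetical "true" data generating process (unknown in practice).
true_process = {"a": 0.5, "b": 0.3, "c": 0.2}

# Two hypothetical small worlds: one with full support, one missing outcome "c".
model_full = {"a": 0.4, "b": 0.4, "c": 0.2}
model_restricted = {"a": 0.6, "b": 0.4}

kl_full = kl_divergence(true_process, model_full)              # finite
kl_restricted = kl_divergence(true_process, model_restricted)  # infinite
kl_reverse = kl_divergence(model_restricted, true_process)     # finite

print(kl_full, kl_restricted, kl_reverse)
```

In this toy setting, the restricted model is absolutely continuous with respect to the true process, so the divergence from the model to the truth is finite, while the divergence from the truth to the model is not. Which direction matters depends on the definition adopted, and in either case checking the condition requires knowing the true process.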
Filed under: Books, Statistics, University life Tagged: AIC, Bayesian model comparison, Bayesian predictive, Bourbaki, DIC, Kullback-Leibler divergence, M-open inference, marginal likelihood, posterior predictive, small worlds
Filed under: pictures, Running, Travel Tagged: driving picture, Iceland, Keflavik, moonlight, moors, sunset, Twilight
Today, I am taking part in a meeting in Paris, for an exotic change!, at the Bureau international des poids et mesures (BIPM), which looks after a universal reference for measurements. For instance, here is its definition of the kilogram:
The unit of mass, the kilogram, is the mass of the international prototype of the kilogram kept in air under three bell jars at the BIPM. It is a cylinder made of an alloy for which the mass fraction of platinum is 90 % and the mass fraction of iridium is 10 %.
And the BIPM is thus interested in the uncertainty associated with such measurements. Hence the workshop on measurement uncertainties. Tony O’Hagan will also be giving a talk in a session that opposes frequentist and Bayesian approaches, even though I decided to introduce ABC as it seems to me to be a natural notion for measurement problems (as far as I can tell from my prior on measurement problems).
Filed under: Books, Statistics, University life Tagged: ABC, ABC model choice, Bayesian model choice, BIPM, kilogram, measurement, Paris, random forests, Sèvres, Tony O'Hagan, uncertainty
This fifth volume of the “Blades” fantasy series by Kelly McCullough is entitled Drawn Blades but it gives the impression the author has exhausted what he can seriously drag from the universe he created a few volumes ago. Even when resuscitating another former lover of the main character. And moving to an unknown part of the world. And bringing in new super-species, cultists, and even a petty god. Yes, a petty god, whining and poorly lying. And an anti-sect police. And a fantasy version of the surfing board. Yes again, a surfing board. Inland. Despite all those unusual features, the book feels like a sluggish copy of a million fantasy books that have mixed the themes of an awakening god awaited by fanatical followers in unlimited subterranean vaults, with the heroes eventually getting the better of the dumb followers and even of the (dumb) god. And boring a grumpy reader to sleep every single evening. The next instalment in the series, Darkened Blade, just appeared, but I do not think I will return to Aral’s world again. The earlier volumes were quite enjoyable and recommended. Now comes a time to end the series!
Filed under: Books, Mountains Tagged: Ancient Blades, Bared Blade, heroic fantasy, Kelly McCullough, sword and sorcery
On Sunday afternoon, I made a brief trip to the southern coast of the Reykjanes Peninsula in an attempt to watch puffins. According to my guide book, the cliffs at Krýsuvíkurberg were populated with many species of birdlife, including the elusive puffin. However, I could only spot gulls, and more gulls, as I walked a few kilometres along those cliffs and away from the occasional 4WD stopping by the end of a dirt road [my small rental car could not make it that far]. When I was about to turn back, I spotted different birds on a small rock promontory, too far for me to tell the species, and as I was zooming in on them, a puffin flew by!, so small that I almost missed it. I tried to see if any others were dwelling in the cliffs left and right but to no avail. A few minutes later, presumably the same puffin flew back and this was the end of it. Even after looking at the enlarged picture, I cannot tell what those “other” birds are: presumably Brünnich’s guillemots…
Filed under: Kids, Mountains, pictures, Travel Tagged: bird watching, Brünnich's guillemot, dirt road, gulls, Iceland, Krýsuvíkurberg, puffins, Reykjanes Peninsula