Just realised today marks the second anniversary of my climbing accident and the loss of my right thumb. Even less to say than on the first anniversary: while it seems almost impossible not to think about it, the handicap is quite minimal. (Actually, the only time I truly forgot about it was when I was ice-climbing in Scotland this January, the difficulty of the [first] climb meaning I had to concentrate on more immediate issues!) Teaching on the blackboard is fine when I use a chalk holder, I just bought a new bike with the easiest change of gears, and except for lacing my running shoes every morning, most chores do not take longer. And, as Andrew pointed out in his March+April madness tournament, I can now get away with some missing-body-part jokes!
Filed under: pictures, Running, Travel Tagged: fog, France, morning light, morning run, Parc de Sceaux, Paris suburbs
This paper by Alon Drory was arXived last week, when I was at Columbia. It reassesses Jaynes’ resolution of Bertrand’s paradox, in which three different probabilities obtain for the same geometric event depending on the underlying σ-algebra (or definition of randomness!). Both Poincaré and Jaynes argued against Bertrand that there was only one acceptable solution under symmetry properties; Drory argues this is not the case!
“…contrary to Jaynes’ assertion, each of the classical three solutions of Bertrand’s problem (and additional ones as well!) can be derived by the principle of transformation groups, using the exact same symmetries, namely rotational, scaling and translational invariance.”
Drory rephrases the problem as follows: “In a circle, select at random a chord that is not a diameter. What is the probability that its length is greater than the side of the equilateral triangle inscribed in the circle?”. Jaynes’ solution is indifferent to the orientation of one observer wrt the circle, to the radius of the circle, and to the location of the centre. The latter is the invariance most discussed by Drory, as he argues that it involves not an observer but the random experiment itself, and that it relies on a specific version of straw throwing in Jaynes’ argument, meaning other versions are also available. This reminded me of an earlier post on Buffon’s needle and the different versions of the needle being thrown over the floor, where I reflected on the connection with Bertrand’s paradox and ran some further R experiments. Drory’s alternative to Jaynes’ manner of throwing straws is to impale them on darts and throw the darts first! (Which is the same as one of my needle solutions.)
“…the principle of transformation groups does not make the problem well-posed, and well-posing strategies that rely on such symmetry considerations ought therefore to be rejected.”
In short, the conclusion of the paper is that there is an indeterminacy in Bertrand’s problem that allows several resolutions under the principle of indifference that end up with a large range of probabilities, thus siding with Bertrand rather than Jaynes.
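The indeterminacy is easy to exhibit by simulation. Here is a minimal sketch, in Python rather than the R of my earlier experiments, implementing the three classical chord constructions; they converge to the three Bertrand probabilities 1/3, 1/2 and 1/4:

```python
import math
import random

def chord_length(method, rng):
    """Draw a random chord of the unit circle by one of Bertrand's three
    constructions and return its length."""
    if method == "endpoints":        # two uniform points on the circumference
        a, b = rng.uniform(0, 2 * math.pi), rng.uniform(0, 2 * math.pi)
        return 2 * math.sin(abs(a - b) / 2)
    if method == "radius":           # midpoint uniform along a random radius
        d = rng.uniform(0, 1)
        return 2 * math.sqrt(1 - d * d)
    if method == "midpoint":         # midpoint uniform over the disc
        d = math.sqrt(rng.uniform(0, 1))
        return 2 * math.sqrt(1 - d * d)

rng = random.Random(0)
side = math.sqrt(3)                  # side of the inscribed equilateral triangle
N = 200_000
results = {}
for m in ("endpoints", "radius", "midpoint"):
    results[m] = sum(chord_length(m, rng) > side for _ in range(N)) / N
    print(m, round(results[m], 2))   # approximately 1/3, 1/2, 1/4
```

Same symmetry-sounding phrasing, three different answers, which is the whole point of the paradox.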
Filed under: Books, Kids, R, Statistics, University life Tagged: Bertrand's paradox, E.T. Jaynes, Henri Poincaré, Joseph Bertrand, randomness
Filed under: pictures, Running, Travel Tagged: Air France, English train, France, morning light, morning run, Parc de Sceaux, Paris suburbs, RER B
Over the past week, I wrote a short introduction to the Metropolis-Hastings algorithm, mostly in the style of our Introduction to Monte Carlo with R book, that is, with very little theory and worked-out illustrations on simple examples. (And partly over the Atlantic on my flight to New York and Columbia.) This vignette is intended for the Wiley StatsRef: Statistics Reference Online Series, modulo possible revision. Again, nothing novel therein, except for new examples.
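For readers who have not met the algorithm, a bare-bones random-walk Metropolis-Hastings sketch (in Python, not taken from the vignette, which relies on R) targeting a standard normal:

```python
import math
import random

def metropolis(logtarget, x0, n, scale, rng):
    """Random-walk Metropolis-Hastings with a Gaussian proposal; since the
    proposal is symmetric, the Hastings ratio reduces to the target ratio."""
    x, lp = x0, logtarget(x0)
    chain = []
    for _ in range(n):
        y = x + rng.gauss(0, scale)
        ly = logtarget(y)
        if math.log(rng.random()) < ly - lp:   # accept w.p. min(1, pi(y)/pi(x))
            x, lp = y, ly
        chain.append(x)
    return chain

rng = random.Random(1)
chain = metropolis(lambda t: -t * t / 2, 0.0, 50_000, 2.5, rng)  # target N(0,1)
mean = sum(chain) / len(chain)
var = sum((c - mean) ** 2 for c in chain) / len(chain)
print(round(mean, 2), round(var, 2))           # close to 0 and 1
```

The chain's empirical mean and variance recover the moments of the N(0,1) target, the simplest check that the sampler is valid.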
Filed under: Books, Kids, R, Statistics, Travel, University life Tagged: Columbia University, Introducing Monte Carlo Methods with R, Metropolis-Hastings algorithm, mixture, New York city, testing as mixture estimation, vignette
Chocolate mousse may be my favourite dessert, most likely because it was considered the top dessert by both my mother and my grandmother!, but it is very easy to mess up. Which is why I never eat mousse in a restaurant and why I statistically fail to make a proper mousse about 40% of the time [no reproducible experiment available, though]. My recipe is indeed both incredibly simple in its ingredients: only a bar of baking chocolate and 7 eggs, and highly variable in its output: the wrong kind of baking chocolate, improperly melted chocolate, or an imperfect separation between whites and yolks are each sufficient for the mousse not to hold, producing instead a chocolatey custard of no particular appeal… Here is the recipe (essentially the one provided inside the Nestlé dessert wrap):

Ingredients and utensils
- 7 fresh eggs, fresh as they are used raw (avoid double-yolkers as well!, as they are hellish to separate);
- a 250g bar of (bitter) cooking chocolate like [my grandmother’s] Menier dessert, Nestlé dessert, or Poulain dessert, or bitter dark chocolate, with a high cocoa butter content;
- two large bowls and a smaller one;
- an egg-beater or a stand mixer.
- Break the chocolate bar into small chunks and add four spoons of water;
- Melt the chocolate by bain-marie or microwave using a temperature and duration as low as possible, and let it cool down;
- Separate white from yolk one egg at a time in a small bowl, setting aside [for another recipe] any egg where yolk could have gotten mixed with white, dropping each white into a large bowl and 5 or 6 of the yolks into another large bowl;
- beat the whites into a stiff and ever stiffer foam;
- mix well the melted chocolate with the yolks;
- incorporate as gently as possible the whites in the chocolate mix, to avoid breaking the foam, by portions of ¼, until the preparation is roughly homogeneous;
- cover and put in the fridge for at least 5 hours.
Since this recipe uses raw eggs, the mousse should be eaten rather quickly, within the next 48 hours. Although it usually vanishes in one meal!
Filed under: Kids Tagged: chocolate bar, chocolate mousse, dessert, Menier, Nestlé, Poulain, raw eggs
While I have shared this idea with many of my friends [in both senses, that I mentioned it and that they shared the feeling it would be a great improvement], the first time I heard of the notion was in George Casella‘s kitchen in Ithaca, New York, in the early 1990s… We were emptying the dishwasher together and George was reflecting that it would be so convenient to have a double dishwasher and remove the need to empty it altogether! Although, at the moral level, I think we should do without dishwashers, I found this a terrific idea and must have told the joke to most of my friends. I was nonetheless quite surprised and very pleased to receive the news from Nicole today that Fisher & Paykel (from Auckland, New Zealand) had gone all the way to producing a double dishwasher, or more exactly a double dishdrawer, perfectly suited to George’s wishes! (Pleased that she remembered the notion after all those years; not pleased with the prospect of buying a double dishwasher for more than double the cost of [and a smaller volume than] a regular dishwasher!)
Filed under: Kids, Travel Tagged: Auckland, dishwasher, George Casella, Ithaca, New York, New Zealand
When I learned that Robin Hobb had started a new Assassin’s trilogy, Fitz and the Fool, I got a bit wary, given the poor sequel to the Liveship Traders trilogy I read in the hospital two years ago, and the imperfect Soldier Son trilogy… But also excited, for The Farseer Trilogy is one of the best fantasy series ever! Now that I have read Fool’s Assassin, the first volume of the trilogy, I can only wait for the second one, Fool’s Quest, to appear next summer. Unsurprisingly, reconnecting with the universe of The Farseer Trilogy is almost enough per se to make reading this book a pleasure, even though it draws too much from the past volumes to gain independent praise, except in the accelerating final chapters. The style conveys perhaps too well the homely feeling of Fitz as a retired country squire, surrounded by family and friends. There is obviously a new plot, a new danger to the Six Duchies, and new characters, one of which is singularly endearing!, while Fitz remains as obtuse and whining as in earlier volumes (which is a joy to behold once again!). So now that the setting has been painstakingly laid and the game is afoot, I hope the second volume will keep up with the pace of the final chapters… (Nice cover, by the way, if unrelated to the contents of the book, apart from the snow!)
Filed under: Books, Kids Tagged: Farseer trilogy, Fool's Assassin, hospital, Liveship Traders, Rivership Traders, Robin Hobb
Filed under: pictures, Travel, University life Tagged: bois de Boulogne, Fondation Louis Vuiton, La Défense, Paris, Université Paris Dauphine
Cristiano Varin, Manuela Cattelan and David Firth (Warwick) have written a paper on the statistical analysis of citations and index factors, a paper that is to be Read at the Royal Statistical Society on May 13th, and is hence completely open to contributed discussions. Now, I have written several entries on the ‘Og about the limited trust I place in citation indicators, as well as about the abuse made of them. However, I do not think I will contribute to the discussion, as my reservations are about the bibliometrics excesses as a whole and not about the methodology used in the paper.
The paper builds several models on the citation data provided by the “Web of Science” compiled by Thomson Reuters. The focus is on 47 Statistics journals, with a citation horizon of ten years, which is much more reasonable than the two years of the regular impact factor. A first feature of interest in the descriptive analysis of the data is that all journals receive a majority of their citations from, and send a majority of their citations to, journals outside statistics, or at least outside the list. Which I find quite surprising. The authors also derive a clustering based on the exchange of citations, resulting in rather predictable clusters, even though JCGS and Statistics and Computing escape the computational cluster to end up in theory and methods, along with the Annals of Statistics and JRSS Series B.
In addition to the unsavoury impact factor, a ranking method discussed in the paper is the eigenfactor score, which starts with a Markov-chain exploration of the literature, moving at random from an article to one of the papers in its reference list, and so on. (This shares drawbacks with the impact factor, e.g., in that it does not account for the good or bad reasons a paper is cited.) Most methods produce the Big Four at the top, with Series B ranked #1, and Communications in Statistics A and B at the bottom, along with the Journal of Applied Statistics. Again, rather anticlimactic.
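The eigenfactor construction amounts to a damped power iteration on a column-normalised citation matrix, PageRank-style. A minimal Python sketch, on a made-up 4-journal citation matrix (the counts are purely illustrative, not from the paper):

```python
# hypothetical cross-citation counts: C[i][j] = citations from journal j to journal i
C = [[0, 4, 2, 1],
     [3, 0, 5, 1],
     [1, 2, 0, 6],
     [1, 1, 3, 0]]
n = len(C)
# column-normalise into a Markov transition matrix: follow a reference at random
col = [sum(C[i][j] for i in range(n)) for j in range(n)]
P = [[C[i][j] / col[j] for j in range(n)] for i in range(n)]
alpha = 0.85                         # damping factor, as in PageRank-type scores
pi = [1.0 / n] * n
for _ in range(200):                 # power iteration to the stationary distribution
    pi = [alpha * sum(P[i][j] * pi[j] for j in range(n)) + (1 - alpha) / n
          for i in range(n)]
print([round(p, 3) for p in pi])     # the journals' eigenfactor-like scores
```

The stationary probabilities then serve as importance scores for the journals; they inherit the drawback mentioned above, since a citation counts the same whatever its motivation.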
The major modelling input is based on Stephen Stigler’s model, a generalised linear model on the log-odds of cross citations. The Big Four once again receive high scores, with Series B still much ahead. (The authors later question the bias due to the Read Paper effect, but cannot easily evaluate this impact. While some Read Papers like Spiegelhalter et al. 2002 DIC do generate enormous citation traffic, to the point of getting re-read!, other journals also contain discussion papers. And are free to include an on-line contributed discussion section if they wish.) Using an extra ranking lasso step does not change things.
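Stigler’s model can be caricatured as a paired-comparison (Bradley-Terry-type) fit on the citation exchanges between journal pairs. In the Python sketch below, the counts are invented and the crude gradient ascent is only a stand-in for the paper’s actual quasi-likelihood machinery:

```python
import math

# hypothetical counts: c[i][j] = citations from journal i to journal j
c = [[0, 10, 6, 2],
     [4, 0, 8, 3],
     [3, 5, 0, 7],
     [1, 2, 4, 0]]
n = len(c)
mu = [0.0] * n                # export scores: logit P(citation goes j -> i) = mu[i] - mu[j]
for _ in range(500):          # crude gradient ascent on the paired-comparison likelihood
    grad = [0.0] * n
    for i in range(n):
        for j in range(i + 1, n):
            tot = c[j][i] + c[i][j]
            p = 1 / (1 + math.exp(mu[j] - mu[i]))  # P(a citation in the pair goes j -> i)
            grad[i] += c[j][i] - tot * p
            grad[j] -= c[j][i] - tot * p
    mu = [m + 0.05 * g for m, g in zip(mu, grad)]
mu = [m - sum(mu) / n for m in mu]   # centre the scores for identifiability
print([round(m, 2) for m in mu])
```

A journal with a high score is one that receives more citations than it sends within each pairwise exchange, which is the intuition behind the export scores.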
In order to check the relevance of such rankings, the authors also look at the connection with the conclusions of the (UK) 2008 Research Assessment Exercise. They conclude that the normalised eigenfactor score and the Stigler model are more correlated with the RAE ranking than the other indicators. Which means either that those scores are good predictors or that the RAE panel relied too heavily on bibliometrics! The more global conclusion is that clusters of journals or researchers have very close indicators, hence that ranking should be conducted with more caution than is currently the case. And, more importantly, that carrying the indices over from journals to individual researchers has no validation and conveys little information.
Filed under: Books, Statistics, University life Tagged: citation index, impact factor, JRSSB, Read paper, Royal Statistical Society, University of Warwick
“Dupuis and Robert (2003) proposed choosing the simplest model with enough explanatory power, for example 90%, but did not discuss the effect of this threshold for the predictive performance of the selected models. We note that, in general, the relative explanatory power is an unreliable indicator of the predictive performance of the submodel,”
Juho Piironen and Aki Vehtari arXived a survey on Bayesian model selection methods that is a sequel to the extensive survey of Vehtari and Ojanen (2012). Because most of the methods described in this survey stem from Kullback-Leibler proximity calculations, it includes some description of our posterior projection method with Costas Goutis and Jérôme Dupuis. We indeed did not consider prediction in our papers and even failed to include a consistency result, as was pointed out to me by my discussant at a model choice meeting in Cagliari, in … 1999! Still, I remain fond of the notion of defining a prior on the embedding model and of deducing priors on the parameters of the submodels by Kullback-Leibler projections. It obviously relies on the notion that the embedding model is “true” and that the submodels are only approximations. In the simulation experiments included in this survey, the projection method “performs best in terms of the predictive ability” (p.15) and “is much less vulnerable to the selection induced bias” (p.16).
Reading the other parts of the survey, I also came to the perspective that model averaging makes much more sense than model choice in predictive terms. Sounds obvious stated that way but it took me a while to come to this conclusion. Now, with our mixture representation, model averaging also comes as a natural consequence of the modelling, a point presumably not stressed enough in the current version of the paper. On the other hand, the MAP model now strikes me as artificial and linked to a very rudimentary loss function. A loss that does not account for the final purpose(s) of the model. And does not connect to the “all models are wrong” theorem.
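The predictive appeal of model averaging can be illustrated on a toy polynomial regression, with exp(-BIC/2) standing in for the marginal likelihood of each model (a rough approximation, and the data below are made up for the purpose):

```python
import math
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0, 3, 40)
y = 1 + 0.5 * x + 0.3 * x**2 + rng.normal(0, 0.2, x.size)   # simulated quadratic data

def fit_and_weight(deg):
    """Least-squares polynomial fit plus a BIC-based weight, with exp(-BIC/2)
    standing in for the marginal likelihood of the degree-`deg` model."""
    coef = np.polyfit(x, y, deg)
    rss = float(np.sum((np.polyval(coef, x) - y) ** 2))
    n, k = x.size, deg + 1
    bic = n * math.log(rss / n) + k * math.log(n)
    return coef, math.exp(-bic / 2)

models = {d: fit_and_weight(d) for d in (1, 2, 3)}
total = sum(w for _, w in models.values())
weights = {d: w / total for d, (_, w) in models.items()}
xnew = 3.5
# model-averaged prediction: mix the polynomial predictions by posterior weight
pred = sum(weights[d] * float(np.polyval(models[d][0], xnew)) for d in models)
print({d: round(w, 2) for d, w in weights.items()}, round(pred, 2))
```

Rather than betting everything on the MAP model, the prediction hedges across the candidates in proportion to their (approximate) posterior probabilities, which is exactly the predictive argument above.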
Filed under: Books, Statistics, University life Tagged: all models are wrong, Bayesian model averaging, Bayesian model choice, Bayesian model selection, Cagliari, Kullback-Leibler divergence, MAP estimators, prior projection, Sardinia, The Bayesian Choice
My trip to work was somewhat more eventful than usual this morning: as the queue to switch to the A train was too long for my taste, I exited the Châtelet station to grab a Vélib rental bike near the Louvre and rode along the Louvre palace for a few hundred meters, until reaching a police barricade that left the remainder of the rue de Rivoli empty, a surreal sight on a weekday! As it happened, Beji Caid Essebsi, the president of Tunisia, was on a state visit to Paris, staying at the 5-star Hôtel Meurice, and just about to leave the hotel. So I hung around there for a few minutes and watched a caravan of official dark cars leave the place, preceded by police bikes in formal dress! The ride to Dauphine was still not straightforward, as the Champs-Elysées had been closed as well, the president attending a commemoration (for Tunisian soldiers who died in French wars?) at the Arc de Triomphe. This created a mess for traffic in the surrounding streets, especially with pedestrians escaping from stuck buses and crowding my sidewalks! And yet another surreal sight of the Place de l’Étoile with no cars. (In the end, this initiative of mine added an extra half hour to my average transit time…)
Filed under: Kids, pictures, Running, Travel Tagged: Arc de Triomphe, Champs-Elysées, Hotel Meurice, Le Louvre, Paris, RER A, Rue de Rivoli, traffic, Tunisia, Vélib
I had an interesting email exchange [or rather exchange of emails] with a (German) reader of Introducing Monte Carlo Methods with R over the past few days, as he had difficulties with the validation of the accept-reject algorithm via the integral
in that it took me several iterations [as shown in the above] to realise the issue was with the notation
which seemed to be missing a density term or, in other words, be different from
What is surprising for me is that the integral
has a clear meaning as a Riemann integral, hence should be more intuitive…
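Whatever the measure-theoretic subtlety at stake in the exchange, the algorithm itself is straightforward, as in this Python sketch (not from the book, which uses R), where the overall acceptance rate is 1/M:

```python
import random

def accept_reject(f, M, sample_g, g, n, rng):
    """Accept-reject: propose Y ~ g, accept with probability f(Y) / (M g(Y)).
    Accepted draws are distributed from f; the acceptance rate is 1/M."""
    draws, tries = [], 0
    while len(draws) < n:
        y = sample_g(rng)
        tries += 1
        if rng.random() <= f(y) / (M * g(y)):
            draws.append(y)
    return draws, len(draws) / tries

# target: Beta(2,2) density on (0,1); proposal: Uniform(0,1); exact bound M = 1.5
f = lambda u: 6 * u * (1 - u)
rng = random.Random(3)
draws, rate = accept_reject(f, 1.5, lambda r: r.random(), lambda u: 1.0, 10_000, rng)
print(round(sum(draws) / len(draws), 2), round(rate, 2))  # mean near 0.5, rate near 2/3
```

With the Beta(2,2) target and uniform proposal, the bound M = 1.5 is attained at u = 1/2, so the observed acceptance rate of about 2/3 matches the theoretical 1/M.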
Filed under: Books, R, Statistics, University life Tagged: accept-reject algorithm, George Casella, Introducing Monte Carlo Methods with R, Lebesgue integration, Riemann integration
Matt Moores, Tony Pettitt, and Kerrie Mengersen arXived a paper yesterday comparing different computational approaches to the processing of hidden Potts models and of the intractable normalising constant in the Potts model. This is a very interesting paper, first because it provides a comprehensive survey of the main methods used in handling this annoying normalising constant Z(β), namely pseudo-likelihood, the exchange algorithm, path sampling (a.k.a. thermal integration), and ABC. A massive simulation experiment, with individual simulation times up to 400 hours, leads to selecting path sampling (what else?!) as the (XL) method of choice, thanks to a pre-computation of the expectation of the sufficient statistic E[S(X)|β]. I just wonder why the same was not done for ABC, as in the recent Statistics and Computing paper we wrote with Matt and Kerrie. As it happens, I was actually discussing at Columbia yesterday potential (and possibly huge) improvements in the processing of Ising and Potts models, by first approximating the distribution of S(X) for some or all β before launching ABC or the exchange algorithm. (In fact, this is a more generic desideratum for all ABC methods: simulating the summary statistics directly, if approximately, would bring huge gains in computing time, and thus possibly in final precision.) Simulating the distribution of the summary and sufficient Potts statistic S(X) reduces to simulating this distribution with a null correlation, as exploited in Cucala and Marin (2013, JCGS, Special ICMS issue). However, there does not seem to be an efficient way to do so, i.e., without reverting to simulating the entire grid X…
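The brute-force route to approximating E[S(X)|β] is exactly the costly simulation of entire grids mentioned above; a naive Python sketch (a toy stand-in for the much larger experiments in the paper) with a single-site Gibbs sampler for the Potts model:

```python
import math
import random

def potts_gibbs(n, q, beta, sweeps, rng):
    """Single-site Gibbs sampler for a q-state Potts model on an n x n grid
    (4-neighbour interactions, free boundary)."""
    x = [[rng.randrange(q) for _ in range(n)] for _ in range(n)]
    for _ in range(sweeps):
        for i in range(n):
            for j in range(n):
                nb = [x[a][b] for a, b in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                      if 0 <= a < n and 0 <= b < n]
                w = [math.exp(beta * sum(k == c for k in nb)) for c in range(q)]
                u, acc = rng.random() * sum(w), 0.0
                for c in range(q):
                    acc += w[c]
                    if u <= acc:
                        x[i][j] = c
                        break
    return x

def suff_stat(x):
    """S(x): number of identical neighbouring pairs, the Potts sufficient statistic."""
    n = len(x)
    return (sum(x[i][j] == x[i][j + 1] for i in range(n) for j in range(n - 1))
            + sum(x[i][j] == x[i + 1][j] for i in range(n - 1) for j in range(n)))

rng = random.Random(4)
stats = [suff_stat(potts_gibbs(8, 2, 0.6, 100, rng)) for _ in range(20)]
print(round(sum(stats) / len(stats), 1))   # Monte Carlo estimate of E[S(X) | beta]
```

Repeating this over a grid of β values gives the pre-computed expectation curve exploited by path sampling, and the cost of simulating full grids is precisely what a direct approximation of the distribution of S(X) would bypass.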
Filed under: Books, R, Statistics, University life Tagged: ABC, Approximate Bayesian computation, Australia, Brisbane, exchange algorithm, Ising model, JCGS, path sampling, Potts model, pseudo-likelihood, QUT, Statistics and Computing
Since this is Easter weekend, and given my unreasonable fondness for hot-cross buns all year long, I tried to cook my own buns tonight, with a reasonable amount of success (!) given that it was my first attempt. I found an on-line recipe, mostly followed it, except that I added the yolk mixed with sugar to make the buns brown and shiny et voilà. If I ever try again to make those buns, I will look for an alternate way to make the [St. Andrew’s] crosses!
Filed under: Kids, pictures Tagged: cooking, Easter, Good Friday, hot-cross buns, Scotland, St. Andrew's cross
A greatly enjoyable [if a wee bit tight] visit to Columbia University for my seminar last Monday! (And a reasonably smooth trip, if I forget about the screaming kids on both planes…!) Besides discussing our respective research interests with several faculty, and explaining our views on replacing Bayes factors and posterior probabilities, views that were not strongly challenged by the seminar audience, maybe because they sounded too Bayesiano-Bayesian!, I had a great time catching up (well, almost!) with Andrew, running for an hour by the river both mornings, and even biking—it does not feel worse than downtown Paris!—with Andrew a few miles to a terrific tiny Mexican restaurant in the South Bronx, El Atoradero, where I had a home-made tortilla (or pupusa) filled with beans and covered with hot chorizo! (The restaurant was selected as the 2014 best Mexican restaurant in New York City by The Village Voice, whatever that means. And it also got a very supportive review in The New York Times.) It was so good that I (very exceptionally) ordered a second serving, of spicy pork huarache, which was almost as good. And it kept me well-fed till the next day, when I arrived in Paris. And with enough calories to fight the cold melting snow that fell while I biked back to the office at Columbia. I also had an interesting morning in a common room at Columbia, working next to graduate students and hearing their conversations about homework and advisors (nothing to gossip about, as their comments were invariably laudatory!, maybe because they suspected me of being a mole!)
Filed under: Kids, pictures, Statistics, Travel, University life Tagged: Bronx, carnitas, Columbia University, El Atoradero, huarache, Hudson river, Manhattan, pupusa, The New York Times, The Village Voice
Filed under: Kids, pictures, Travel, University life Tagged: Broadway, Columbia University, New York city, sunset
Even though I wrote before that I do not watch TV series, I made a second exception this year with True Detective. This series was recommended to me by Judith and this was truly a good recommendation!
Contrary to my old-fashioned idea of TV series, where the same group of caricatural characters repeatedly meets new situations that get resolved within the 50 minutes each episode lasts, the whole season of True Detective is a single story, much more like a very long movie with a unified plot that smoothly unfolds and gets mostly solved in the last episode. This obviously brings more strength and depth to the characters, the two investigators Rust and Marty, with the side drawback that most of the other characters, except maybe Marty’s wife, get little space. The opposition between the two investigators is central to the coherence of the story, with Rust being the more intriguing one, very intellectual, almost otherworldly, with a nihilistic discourse and a self-destructive bent, while Marty sounds more down-to-earth, although he also caters to his own self-destructive demons… Both actors are very impressive in giving a life and a history to their characters. The story takes place in Louisiana, with great landscapes and oppressive swamps where everything seems doomed to vanish, eventually, making detective work almost useless. And where clamminess applies to moral values as much as to the weather. The core of the plot is the search for a serial killer, whose murders of women are incorporated within a pagan cult. Although this sounds rather standard for a US murder story (!), and while there are unnecessary sub-plots and unconvincing developments, the overall storyline is quite coherent, with a literary feel, even though its writer, Nic Pizzolatto, never completed the corresponding novel, and the unfolding of the plot is anything but conventional, with well-done flashbacks and multi-layered takes on the same events. (Though with none of the subtlety of Rashômon, where one ends up mistrusting every point of view.) Most of the series takes place in the present, when the two former detectives are interrogated by detectives reopening an unsolved murder case.
The transformation of Rust over 15 years is an impressive piece of acting, worth watching the show for by itself! The final episode, while impressive from an aesthetic perspective as a descent into darkness, is somewhat disappointing at the story level, for not exploring the killer’s perspective much further and for resorting to a fairly conventional (in the Psycho sense!) fight scene.
Filed under: Books, pictures Tagged: HBO, Louisiana, movie review, Nick Pizzolatto, Psycho, Rashomon, serial killer, True Detective, TV series