Today, while in Warwick, I spotted on Cross Validated a question involving “minimax” in the title and hence could not help but look at it! The way I first understood the question (and immediately replied to it) was to check whether or not the standard Normal average—reduced to the single Normal observation by sufficiency considerations—is a minimax estimator of the normal mean under an interval zero-one loss defined by
$$\mathcal{L}(\mu,\hat\mu)=\mathbb{I}_{|\mu-\hat\mu|>L}=\begin{cases}1 &\text{if }|\mu-\hat\mu|>L\\ 0 &\text{if }|\mu-\hat\mu|\le L\end{cases}$$
where L is a positive tolerance bound. I had not seen this problem before, even though it sounds quite standard. In this setting, the identity estimator, i.e., the normal observation x, is indeed minimax as (a) it is a generalised Bayes estimator for this loss function under the constant prior (Bayes estimators under this loss are given by the centre of an equal posterior interval) and (b) it can be shown to be a limit of proper Bayes estimators and its Bayes risk is also the limit of the corresponding Bayes risks. (This is a most traditional way of establishing minimaxity for a generalised Bayes estimator.) However, this was not the question asked on the forum, as the book by Zacks it referred to stated that the standard Normal average maximised the minimal coverage, which amounts to minimising the maximal risk under the above loss, but with a strange inversion of parameter and estimator in the minimax risk:
$$\sup_\mu\inf_{\hat\mu} R(\mu,\hat\mu)\quad\text{instead of}\quad\inf_{\hat\mu}\sup_\mu R(\mu,\hat\mu)$$
which makes the first bound equal to 0 by equating estimator and mean μ. Note however that I cannot access the whole book and hence may miss some restriction or other subtlety that would explain this unusual definition.
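As a quick numerical check of the constant-risk property that underlies the minimaxity argument above, here is a minimal Python sketch (not taken from the forum discussion). It assumes a single observation x ~ N(μ,1) and the interval zero-one loss with tolerance L; the function names and the tolerance value 1.0 are illustrative assumptions.

```python
import math
import random


def normal_cdf(z):
    """Standard Normal cdf, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def exact_risk(tol):
    """Risk of the identity estimator: P(|X - mu| > tol) for X ~ N(mu, 1).

    The value does not depend on mu, since X - mu ~ N(0, 1).
    """
    return 2.0 * (1.0 - normal_cdf(tol))


def simulated_risk(mu, tol, n_draws=100_000, seed=1):
    """Monte Carlo estimate of the same risk at a given mu."""
    rng = random.Random(seed)
    misses = sum(abs(rng.gauss(mu, 1.0) - mu) > tol for _ in range(n_draws))
    return misses / n_draws


if __name__ == "__main__":
    tol = 1.0  # illustrative tolerance L (an assumption, not from the post)
    print("exact risk:", round(exact_risk(tol), 4))
    for mu in (-5.0, 0.0, 3.0):
        print("mu =", mu, "simulated risk:", round(simulated_risk(mu, tol), 4))
```

The simulated risks should agree with the exact value up to Monte Carlo error whatever the value of μ, which is the constant-risk behaviour one expects from a minimax candidate.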
Filed under: Books, Kids, Statistics, University life Tagged: Bayes estimators, cross validated, generalised Bayes estimators, mathematical statistics, minimaxity
Here is the fifth instalment in the Peter Grant (or Rivers of London) series by Ben Aaronovitch. It is entitled Foxglove Summer, a title whose meaning only became clear (to me) by the end of the book. I found it in my mailbox upon arrival in Warwick last Sunday. And rushed through the book during evenings, insomnia breaks and even a few breakfasts!
“It’s observable but not reliably observable. It can have a quantifiable effects, but resists any attempt to apply mathematical principles to it – no wonder Newton kept magic under wraps. It must have driven him mental. Or maybe not.” (p.297)
Either because the author has run out of ideas to centre a fifth novel on a part or aspect of London (even though the parks, including the London Zoo, were not particularly used in the previous novels), or because he could not set this new type of supernatural in a city (no spoilers!), this sequel takes place in the Western Counties, close to the Welsh border (and not so far from Brother Cadfael’s Shrewsbury!). It is also an opportunity to introduce brand new (local) characters, who are enjoyable if a wee bit of a caricature! However, the inhabitants of the small village where the kidnapping investigation takes place are almost too sophisticated for Peter Grant, who has to handle the enquiry all by himself, as his mentor is immobilised in London by the defection of Peter’s close colleague, Lindsey.
“We trooped off (…) down something that was not so much a path as a statistical variation in the density of the overgrowth.” (p.61)
As usual, the dialogues and monologues of Grant are the most enjoyable part of the story, along with a development of the long-in-the-coming love affair with the river goddess Beverley Brooks. And a much appreciated ambiguity in the attitude of Peter about the runaway Lindsey… The story itself reflects the limitations of a small village where one quickly repeats over and over the same trips and the same relations. Which gives a sensation of slow motion, even in the most exciting moments. The resolution of the enigma borrows too heavily from the fae and elves folklore, even though the final pages bring a few surprises. Nonetheless, the whole book was a page-turner for me, meaning I spent more time reading it this week than I intended or than was reasonable. No wonder for a series taking place in The Folly!
Filed under: Books, Kids, Travel Tagged: Ben Aaronnovitch, book review, England, foxglove, PC Peter Grant, Rivers of London, The Folly, unicorn, University of Warwick, Wales, Western Counties, Worcester
The seminar invitation to Edinburgh gave me the opportunity and the excuse for a quick dash to Fort William for a day of ice-climbing on Ben Nevis. The ice conditions were perfect but there was alas too much snowdrift to attempt Point Five Gully, one of the mythical routes on the Ben. (Last time, the ice was not in good condition.) Instead, we did three pitches on three different routes: one iced rock-face near the CIC hut, the first pitch of Waterfall Gully on Carn Dearg Buttress, and the first pitch of The Curtain, again on Carn Dearg Buttress.
The most difficult climb was the first one, grading about V.5 in Scottish grade, maybe above that as the ice was rather rotten, forcing my guide Ali to place many screws. And forcing me to unscrew them! Then the difficulty got much lower, except for the V.5 start of the Waterfall, where I had to climb an ice pillar with my hands as the ice-picks would not get a good grip. Breaking another large pillar in the process, fortunately mostly avoiding being hit. The final climb was quite easy, more of a steep snow slope than a true ice-climb. Too bad the second part of the route was blocked by two fellows who could not move! Anyway, it was another of those rare days on the ice, with enough choice not to worry about sharing with other teams, and a terrific guide! And a reasonable day for Scotland with little snow, no rain, plenty of wind and not that cold (except when belaying!).
Filed under: Mountains, pictures, Travel Tagged: Ben Nevis, Carn Dearg Buttress, Highlands, ice climbing, point five gully, Scotland, Scottish climbing grade, waterfall
Another short paper about relabelling in mixtures was arXived last week by Pauli and Torelli. They refer rather extensively to a previous paper by Puolamäki and Kaski (2009) of which I was not aware, a paper attempting to get an unswitching sampler that does not exhibit any label switching, a concept I find most curious as I see no rigorous way to state that a sampler is not switching! This would imply spotting low posterior probability regions that the chain would cross. But I should check the paper nonetheless.
Because the G component mixture posterior is invariant under the G! possible permutations, I am somewhat undecided as to what the authors of the current paper mean by estimating the difference between two means, like μ1-μ2, since they object to using the output of a perfectly mixing MCMC algorithm and seem to prefer the one associated with a non-switching chain. Or by estimating the probability that a given observation is from a given component, since this is exactly 1/G by the permutation invariance property. In order to identify a partition of the data, they introduce a loss function on the joint allocations of pairs of observations, a loss function that sounds quite similar to the one we used in our 2000 JASA paper on the label switching deficiencies of MCMC algorithms. (And makes me wonder why this work of ours is not deemed relevant for the approach advocated in the paper!) Still, having read this paper, which I find rather poorly written, I have no clear understanding of how the authors give a precise meaning to a specific component of the mixture distribution. Or how the relabelling has to be conducted to avoid switching. That is, how the authors define their parameter space. Or their loss function. Unless one falls back onto the ordering of the means or the weights, which has the drawback of not connecting with the level sets of a particular mode of the posterior distribution, meaning that imposing the constraints results in a region that contains bits of several modes.
At some point the authors assume the data can be partitioned into K≤G groups such that there is a representative observation within each group never sharing a component (across MCMC iterations) with any of the other representatives. While this notion is label invariant, I wonder (a) whether this is possible on any MCMC outcome; (b) whether it indicates a positive or negative feature of the MCMC sampler; and (c) what prevents the representatives from switching in harmony from one component to the next while preserving their perfect mutual exclusion… This however constitutes the advance in the paper, namely that component dependent quantities are estimated as those associated with a particular representative. Note that the paper contains no illustration, hence that the method may prove hard to impossible to implement!
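To make the permutation-invariance remarks above concrete, here is a small Python sketch. It is an illustration under stated assumptions, not the method of Pauli and Torelli: a perfectly mixing sampler is mimicked by taking all G! relabellings of one hypothetical allocation vector, and the helper names are made up for the example. It shows that the frequency of "observation i belongs to component k" is then exactly 1/G, while the pairwise co-allocation matrix (the object a loss on pairs of observations works with) is unchanged by relabelling.

```python
import itertools


def relabel(alloc, perm):
    """Apply a permutation of the component labels to an allocation vector."""
    return [perm[z] for z in alloc]


def coallocation(alloc):
    """0/1 matrix whose (i, j) entry says whether i and j share a component."""
    n = len(alloc)
    return [[int(alloc[i] == alloc[j]) for j in range(n)] for i in range(n)]


def membership_frequency(draws, obs, comp):
    """Share of draws in which observation `obs` sits in component `comp`."""
    return sum(d[obs] == comp for d in draws) / len(draws)


if __name__ == "__main__":
    G = 3
    z = [0, 0, 1, 2, 1]  # hypothetical allocation of five observations
    # All G! relabellings, standing in for a perfectly switching chain
    draws = [relabel(z, p) for p in itertools.permutations(range(G))]
    print("P(obs 0 in comp 0):", membership_frequency(draws, 0, 0))
    print("co-allocation invariant:",
          all(coallocation(d) == coallocation(z) for d in draws))
```

Any quantity built only from the co-allocation matrix is therefore well defined without fixing the labels, whereas component-specific quantities need an extra identification device such as the representatives discussed above.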
Filed under: Books, Statistics Tagged: arXiv, Bayesian estimation, finite mixtures, label switching, Matthew Stephens, pivot, University of Warwick
Filed under: Kids, pictures, Travel, University life Tagged: England, snow, United Kingdom, University of Warwick, winter, Zeeman building
Michael Gutmann and Jukka Corander arXived this paper two weeks ago. I read part of it (mostly the extended introduction part) on the flight from Edinburgh to Birmingham this morning. I find the reflection it contains on the nature of the ABC approximation quite deep and thought-provoking. Indeed, the major theme of the paper is to visualise ABC (which is admittedly shorter than “likelihood-free inference of simulator-based statistical models”!) as a regular computational method based on an approximation of the likelihood function at the observed value, yobs. This includes for example Simon Wood’s synthetic likelihood (who incidentally gave a talk on his method while I was in Oxford). As well as non-parametric versions. In both cases, the approximations are based on repeated simulations of pseudo-datasets for a given value of the parameter θ, either to produce an estimation of the mean and covariance of the sampling model as a function of θ or to construct genuine estimates of the likelihood function. As assumed by the authors, this calls for a small dimension θ. This approach actually allows for the inclusion of the synthetic approach as a lower bound on a non-parametric version.
In the case of Wood’s synthetic likelihood, two questions came to me:
- the estimation of the mean and covariance functions is usually not smooth because new simulations are required for each new value of θ. I wonder how frequent is the case where we can always use the same basic random variates for all values of θ. Because it would then give a smooth version of the above. In the other cases, provided the dimension is manageable, a Gaussian process could be first fitted before using the approximation. Or any other form of regularization.
- no mention is made [in the current paper] of the impact of the parametrization of the summary statistics. Once again, a Box-Cox transform could be applied to each component of the summary to bring it closer to the normal distribution.
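The two points above can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not Wood's or the authors' implementation: the simulator `simulate_summary` is hypothetical (the sample mean of Normal draws), the summary is one-dimensional so that the covariance reduces to a variance, and smoothness in θ is obtained by reusing the same seed (common random numbers) for every value of θ.

```python
import math
import random
import statistics


def simulate_summary(theta, rng, n_obs=50):
    """Hypothetical simulator: sample mean of n_obs draws from N(theta, 1)."""
    return statistics.fmean(rng.gauss(theta, 1.0) for _ in range(n_obs))


def synthetic_loglik(theta, s_obs, n_sim=200, seed=0):
    """Gaussian (synthetic) log-likelihood of the observed summary s_obs.

    The mean and variance of the summary are estimated from n_sim
    pseudo-datasets simulated at theta. Re-seeding with the same value for
    every theta means the same basic random variates are reused, which makes
    the resulting function of theta smooth.
    """
    rng = random.Random(seed)
    sims = [simulate_summary(theta, rng) for _ in range(n_sim)]
    mean = statistics.fmean(sims)
    var = statistics.variance(sims, mean)
    return -0.5 * math.log(2.0 * math.pi * var) - 0.5 * (s_obs - mean) ** 2 / var


if __name__ == "__main__":
    s_obs = 1.0  # illustrative observed summary
    grid = [i * 0.05 for i in range(41)]
    best = max(grid, key=lambda t: synthetic_loglik(t, s_obs))
    print("grid maximiser of the synthetic likelihood:", round(best, 2))
```

Dropping the fixed seed gives a noisy curve in θ, which is the situation where a regularisation step such as a fitted Gaussian process would come in.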
When reading about a non-parametric approximation to the likelihood (based on the summaries), the questions I scribbled on the paper were:
- estimating a complete density when using this estimate at the single point yobs could possibly be superseded by a more efficient approach.
- the authors study a kernel that is a function of the difference or distance between the summaries and which is maximal at zero. This is indeed rather frequent in the ABC literature, but does it impact the convergence properties of the kernel estimator?
- the estimation of the tolerance, which happens to be a bandwidth in that case, does not appear to be addressed in this paper, which could explain the very low probabilities of acceptance mentioned in the paper.
- I am lost as to why lower bounds on likelihoods are relevant here. Unless this is intended for ABC maximum likelihood estimation.
Gutmann and Corander also comment on the first point, through the cost of producing a likelihood estimator. They therefore suggest resorting to regression and avoiding regions of low estimated likelihood. And relying on Bayesian optimisation. (Hopefully to be commented on later.)
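For readers who want to see the non-parametric approximation discussed above in action, here is a minimal Python sketch, again under assumptions of my own: a hypothetical one-dimensional simulator, a Gaussian kernel for the likelihood estimate at the observed summary, and a uniform kernel whose acceptance rate shows how quickly acceptances vanish when the tolerance (bandwidth) shrinks.

```python
import math
import random
import statistics


def simulate_summary(theta, rng, n_obs=50):
    """Hypothetical simulator: sample mean of n_obs draws from N(theta, 1)."""
    return statistics.fmean(rng.gauss(theta, 1.0) for _ in range(n_obs))


def kernel_likelihood(theta, s_obs, bandwidth, n_sim=500, seed=0):
    """Gaussian-kernel estimate of the density of the summary at s_obs."""
    rng = random.Random(seed)
    norm = bandwidth * math.sqrt(2.0 * math.pi)
    total = 0.0
    for _ in range(n_sim):
        u = (simulate_summary(theta, rng) - s_obs) / bandwidth
        total += math.exp(-0.5 * u * u) / norm
    return total / n_sim


def acceptance_rate(theta, s_obs, tolerance, n_sim=500, seed=0):
    """Share of simulations within `tolerance` of s_obs (uniform kernel)."""
    rng = random.Random(seed)
    hits = sum(abs(simulate_summary(theta, rng) - s_obs) <= tolerance
               for _ in range(n_sim))
    return hits / n_sim


if __name__ == "__main__":
    s_obs = 1.0  # illustrative observed summary
    for theta in (1.0, 1.5, 3.0):
        print(theta, kernel_likelihood(theta, s_obs, bandwidth=0.1))
    for tol in (0.5, 0.1, 0.01):
        print("tolerance", tol, "acceptance", acceptance_rate(1.0, s_obs, tol))
```

Only the value of the estimate at the single point s_obs is used, which is why estimating a whole density looks wasteful, and why a poorly chosen bandwidth translates directly into very few accepted simulations.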
Filed under: Books, Statistics, University life Tagged: ABC, ABC validation, Bayesian optimisation, non-parametrics, synthetic likelihood
Just heard about a security vulnerability on Linux machines running Red Hat versions 5 to 7, Ubuntu 10.04 and 12.04, Debian version 7, Fedora versions 19 and older, and SUSE versions 11 and older. The vulnerability occurs through a buffer overflow in some functions of the C library Glibc, which allows remote code execution, and the fix to the problem is indicated on that nixCraft webpage. (It is also possible to run the GHOST C code if you want to live dangerously!)
Filed under: Linux Tagged: C, Glibc, Kubuntu 12.04, Linux, security vulnerability, Ubuntu 10.10
On Wednesday afternoon, Richard Everitt and Dennis Prangle organised an RSS workshop in Reading on Bayesian Computation. And invited me to give a talk there, along with John Hemmings, Christophe Andrieu, Marcelo Pereyra, and themselves. Given the proximity between Oxford and Reading, this felt like a neighbourly visit, especially when I realised I could take my bike on the train! John Hemmings gave a presentation on synthetic models for climate change and their evaluation, which could have some connection with Tony O’Hagan’s recent talk in Warwick. Dennis told us about “the lazier ABC” version in connection with his “lazy ABC” paper. [From my very personal view] Marcelo expanded on the Moreau-Yosida expansion he had presented in Bristol about six months ago, with the notion that using a Gaussian tail regularisation of a super-Gaussian target in a Langevin algorithm could produce better convergence guarantees than the competition, including Hamiltonian Monte Carlo. Luke Kelly spoke about an extension of phylogenetic trees using a notion of lateral transfer. And Richard introduced a notion of biased approximation to Metropolis-Hastings acceptance ratios, a notion that I found quite attractive if not completely formalised, as there should be a Monte Carlo equivalent to the improvement brought by biased Bayes estimators over unbiased classical counterparts. (Repeating a remark by Persi Diaconis made more than 20 years ago.) Christophe Andrieu also presented some recent developments of his on exact approximations à la Andrieu and Roberts (2009).
Since those developments are not yet finalised into an archived document, I will not delve into the details, but I found the results quite impressive and worth exploring, so I am looking forward to the incoming publication. One aspect of the talk which I can comment on is related to the exchange algorithm of Murray et al. (2006). Let me recall that this algorithm handles doubly intractable problems (i.e., likelihoods with intractable normalising constants like the Ising model), by introducing auxiliary variables with the same distribution as the data given the new value of the parameter and computing an augmented acceptance ratio whose expectation is the targeted acceptance ratio and which conveniently removes the unknown normalising constants. This auxiliary scheme produces a random acceptance ratio and hence differs from the exact-approximation MCMC approach, which targets directly the intractable likelihood. It somewhat replaces the unknown constant with the density taken at a plausible realisation, hence providing a proper scale. At least for the new value. I wonder if a comparison has been conducted between both versions, the naïve intuition being that the ratio of estimates should be more variable than the estimate of the ratio. More generally, it seemed to me [during the introductory part of Christophe’s talk] that those different methods always faced a harmonic mean danger when being phrased as expectations of ratios, since those ratios were not necessarily square integrable. And not necessarily bounded. Hence my rather gratuitous suggestion of using other tools than the expectation, like maybe a median, thus circling back to the biased estimators of Richard. (And later cycling back, unscathed, to Reading station!)
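As a reminder of how the exchange algorithm recalled above operates, here is a toy Python sketch. The setting is an assumption chosen for the example and not one from the talk: the data are treated as draws from a density proportional to exp(-θy) on the positive reals, whose normalising constant is pretended unknown, with an Exp(1) prior on θ and a Gaussian random-walk proposal. In this toy case the exact posterior is a Gamma(n+1, Σy+1) distribution, which allows for a sanity check.

```python
import math
import random


def exchange_sampler(y, n_iter=20_000, step=0.5, seed=42):
    """Exchange algorithm for f(y|theta) = exp(-theta * y) / Z(theta).

    Z(theta) is never evaluated: it cancels because the auxiliary data are
    simulated from the model at the proposed parameter value.
    """
    rng = random.Random(seed)
    n, sum_y = len(y), sum(y)
    theta = 1.0
    chain = []
    for _ in range(n_iter):
        prop = theta + step * rng.gauss(0.0, 1.0)  # symmetric proposal
        if prop > 0.0:  # the Exp(1) prior puts no mass on theta <= 0
            # Auxiliary dataset, simulated exactly given the proposed value
            sum_w = sum(rng.expovariate(prop) for _ in range(n))
            # log of prior ratio x unnormalised likelihood ratio on y
            # x reversed unnormalised likelihood ratio on the auxiliary data
            log_ratio = (prop - theta) * (sum_w - sum_y - 1.0)
            if rng.random() < math.exp(min(0.0, log_ratio)):
                theta = prop
        chain.append(theta)
    return chain


if __name__ == "__main__":
    y = [0.5] * 20  # illustrative data, sum equal to 10
    chain = exchange_sampler(y)[1000:]  # drop a short burn-in
    print("chain mean:", sum(chain) / len(chain))
    print("exact posterior mean:", (len(y) + 1) / (sum(y) + 1.0))
```

The acceptance ratio is random through the auxiliary sum `sum_w`, which is the feature contrasted above with exact-approximation schemes that estimate the intractable likelihood itself.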
On top of the six talks in the afternoon, there was a small poster session during the tea break, where I met Garth Holloway, working in agricultural economics, who happened to be an (unsuspected) fan of mine, to the point of entitling his poster “Robert’s paradox”!!! The problem covered by this undeserved denomination is connected to the bias in Chib’s approximation of the evidence in mixture estimation, a phenomenon that I related to the exchangeability of the component parameters in an earlier paper or set of slides. So “my” paradox is essentially label (un)switching and its consequences. For which I cannot claim any fame! Still, I am looking forward to the completed version of this poster to discuss Garth’s solution, and we had a beer together after the talks, drinking to the health of our mutual friend John Deely.
Filed under: Statistics, Travel, University life Tagged: BayesComp, biking, Chib's approximation, doubly intractable problems, exchange algorithm, exchangeability, John Deely, label switching, mixture estimation, Purdue University, trains, University of Oxford, University of Reading
On Cross Validated, I had a rather extended discussion with a user about a probability density
as I thought it could be decomposed in two manageable conditionals and simulated by Gibbs sampling. The first component led to a Gumbel like density
with y being restricted to either (0,1) or (1,∞) depending on β. The density is bounded and can be easily simulated by an accept-reject step. The second component leads to
which offers the slight difficulty that it is not integrable when the first component is less than 1! So the above density does not exist (as a probability density).
What I found interesting in this question was that, for once, the Gibbs sampler was the solution rather than the problem, i.e., that it pointed out the lack of integrability of the joint. (What I found less interesting was that the user did not acknowledge a lengthy discussion that we had previously about the Gibbs implementation and that he erased, that he lost interest in the question by not following up on my answer, a seemingly common feature of his, and that he provided neither source nor motivation for this zombie density.)
Filed under: Kids, R, Statistics, University life Tagged: cross validated, Gibbs sampling, Gumbel distribution, improper posteriors, zombie density
I spent [most of] the past week in Oxford in connection with our joint OxWaSP PhD program, which is supported by the EPSRC, and constitutes a joint Centre of Doctoral Training in statistical science focussing on data-intensive environments and large-scale models. The first cohort of a dozen PhD students started their training last Fall, with the first year spent in Oxford, before splitting between Oxford and Warwick to write their theses. Courses are taught over a two-week block, with a two-day introduction to the theme (Bayesian Statistics in my case), followed by reading, meetings, daily research talks, mini-projects, and a final day in Warwick including presentations of the mini-projects and a concluding seminar (involving Jonty Rougier and Robin Ryder, next Friday). This approach by bursts of training periods is quite ambitious in that it requires a lot from the students, both through the lectures and in personal investment, and reminds me somewhat of a similar approach at École Polytechnique where courses are given over fairly short periods. But it is also profitable for highly motivated and selected students in that total immersion into one topic and a large amount of collective work bring them up to speed with a reasonable basis and the option to write their thesis on that topic. Hopefully, I will see some of those students next year in Warwick working on some Bayesian analysis problem!
On a personal basis, I also very much enjoyed my time in Oxford, first for meeting with old friends, albeit too briefly, and second for cycling, as the owner of the great Airbnb place I rented kindly let me use her bike, which allowed me to move around quite freely! Even on a train trip to Reading. As it was a road racing bike, it took me a trip or two to get used to it, especially on the first day when the roads were somewhat icy, but I enjoyed the lightness of it, relative to my lost mountain bike, to the point of considering switching to a road bike for my next bike… I also had some apprehensions about riding at night, which I avoid while in Paris, but got over them until the very last night when I had a very close brush with a car entering from a side road, which either had not seen me or thought I would let it pass. Gave me the opportunity of shouting Oï!
Filed under: Books, Kids, pictures, Statistics, Travel, University life Tagged: airbnb, Bayesian statistics, EPSRC, mountain bike, PhD course, PhD students, slides, slideshare, stolen bike, The Bayesian Choice, University of Oxford, University of Warwick
Here are two examples of animal “face” tee-shirts I saw advertised in The New York Times and that I would not consider wearing. At any time.
Filed under: Kids, pictures Tagged: animals, Asian lady beetle, fashion, tarsier, tee-shirt, The New York Times
Yesterday, I was all too briefly in Edinburgh for a few hours, to give a seminar in the School of Mathematics, on the random forests approach to ABC model choice (that was earlier rejected). (The slides are almost surely identical to those used at the NIPS workshop.) One interesting question at the end of the talk was on the potential bias in the posterior predictive expected loss, bias against some model from the collection of models being evaluated for selection. In the sense that the array of summaries used by the random forest could fail to capture features of a particular model and hence discriminate against it. While this is correct, there is no fundamental difference with implementing a posterior probability based on the same summaries. And the posterior predictive expected loss offers the advantage of testing, that is, for representative simulations from each model, of returning the corresponding model prediction error to highlight poor performances on some models. A further discussion over tea led me to ponder whether or not we could expand the use of random forests to Bayesian quantile regression. However, this would imply a monotonicity structure on a collection of random forests, which sounds daunting…
My stay in Edinburgh was quite brief as I drove to the Highlands after the seminar, heading to Fort William. Although the weather was rather ghastly, the traffic was fairly light and I managed to get there unscathed, without hitting any of the deer of Rannoch Moor (saw one dead by the side of the road though…) or the snow banks of the narrow roads along Loch Lubnaig. And, as usual, it still was a pleasant feeling to drive through those places associated with climbs and hikes, Crianlarich, Tyndrum, Bridge of Orchy, and Glencoe. And to get in town early enough to enjoy a quick dinner at The Grog & Gruel, reflecting I must have had half a dozen dinners there with friends (or not) over the years. And drinking a great heather ale to them!
Filed under: Mountains, pictures, Statistics, Travel, University life, Wines Tagged: ABC, ABC model choice, Edinburgh, Fort William, quantile regression, random forests, Scotland, The Grog & Gruel, University of Edinburgh
I Remember You: A Ghost Story is another Icelandic novel by Yrsa Sigurdardottir, which I bought more because it takes place in Iceland than because of its style, as I found the previous novel somewhat lacking in its plot. Still, I was expecting better, as the novel won the 2012 Icelandic Crime Fiction Award. Alas, I should have been paying more attention to the subtitle “A ghost story”, since this is indeed a ghost story of a most traditional nature (I mean, without the deep humour of Rivers of London!), where the plot itself is incomprehensible (or nonexistent) without taking into account the influence and even actions of ghosts! I know I should have been warned by the earlier volume since there as well some characters were under the influence, but I had thought it was more of a psychological disorder than a genuine part of the story! As I do not enjoy in the least ghost stories of that kind, having grown out of the scary parts, it was a ghastly drag to finish this book, especially because the plot is very shroud-thin and (spoilers, spoilers!) the very trip and subsequent behaviour of the three characters in the deserted village is completely irrational (even prior to their visitation by a revengeful ghost!). The motives for all characters that end up in the haunted place are similarly flimsy… The connections between the characters are fairly shallow and the obvious affair between two of them takes hundreds of pages to be revealed. The very last pages of the book see the rise of a new ghost, maybe in anticipation of a new novel. No matter what, this certainly is my last book by Sigurdardottir and I will rather wait for the next Indriðason to increase my collection of Icelandic Noir…! Keeping away from the fringe that caters to the supposedly widespread Icelandic belief in ghosts and trolls!!!
Filed under: Books, Travel Tagged: Arnaldur Indriðason, ghosts, horror, Iceland noir, Rivers of London, Yrsa Sigurðardóttir
Our random forest paper was alas rejected last week. Alas because I think the approach is a significant advance in ABC methodology when implemented for model choice, avoiding the delicate selection of summary statistics and the report of shaky posterior probability approximations. Alas also because the referees somewhat missed the point, apparently perceiving random forests as a way to project a large collection of summary statistics on a limited dimensional vector as in the Read Paper of Paul Fearnhead and Dennis Prangle, while the central point in using random forests is the avoidance of a selection or projection of summary statistics. They also dismissed our approach based on the argument that the reduction in error rate brought by random forests over LDA or standard (k-nn) ABC is “marginal”, which indicates a degree of misunderstanding of what the classification error stands for in machine learning: with a large number of classes, the error rate in supervised learning cannot be brought arbitrarily close to zero, which bounds the maximum possible gain. Last but not least, the referees did not appreciate why we mostly cannot trust posterior probabilities produced by ABC model choice and hence why the posterior error loss is a valuable and almost inevitable machine learning alternative, dismissing the posterior expected loss as being not Bayesian enough (or at all), for “averaging over hypothetical datasets” (which is a replicate of Jeffreys’ famous criticism of p-values)! Certainly a first time for me to be rejected based on this argument!
Filed under: Books, Statistics, University life Tagged: ABC, ABC model choice, Bayesian Analysis, classification, Harold Jeffreys, random forests, Read paper, summary statistics
Thomas Schön, Uppsala University
Filed under: pictures, R, Statistics, Travel, University life, Wines Tagged: ENSAE, Monte Carlo Statistical Methods, Paris, sequential Monte Carlo, SMC 2015, workshop
This blog post was contributed by my friend Julien Cornebise, as a reprint of a column he wrote for the latest ISBA Bulletin.
This article is an occasion to pay forward ever so slightly, by encouraging current Ph.D. candidates on their path, the support ISBA gave me. Four years ago, I was honored and humbled to receive the ISBA 2010 Savage Award, category Theory and Methods, for my Ph.D. dissertation defended in 2009. Looking back, I can now testify how much this brought to me both inside and outside of Academia.
Inside Academia: confirming and mitigating the widely-shared post-graduate’s impostor syndrome
Upon hearing of the great news, a brilliant multi-awarded senior researcher in my lab very kindly wrote to me that such awards meant never having to prove one’s worth again. Although genuinely touched by her congratulations, being far less accomplished and more junior than her, I felt all the more responsible to prove myself worthy of this show of confidence from ISBA. It would be rather awkward to receive such an award only to fail miserably shortly after.
This resonated deeply with the shared secret of recent PhDs, discovered during my year at SAMSI, a vibrant institution where half a dozen new postdocs arrive each year: each and every one of us, fresh Ph.D.s from some of the best institutions (Cambridge, Duke, Waterloo, Paris…) secretly suffered the very same impostor syndrome. We were looking at each other’s CV/website and thinking “jeez! this guy/girl across the door is an expert of his/her field, look at all he/she has done, whereas I just barely scrape by on my own research!” – all the while putting up a convincing façade of self-assurance in front of audiences and whiteboards, to the point of apparent cockiness. Only after candid exchanges in SAMSI’s very open environment did we all discover being in the very same mindset.
In hindsight the explanation is simple: each young researcher in his/her own domain has the very expertise to measure how much he/she still does not know and has yet to learn, while he/she hears other young researchers, experts in their own other field, present results not as familiar to him/her, thus sounding so much more advanced. This take-away from SAMSI was perfectly confirmed by the Savage Award: yes, maybe indeed, I, just like my other colleagues, might actually know something relatively valuable, and my scraping by might just be not so bad – as is also the case of so many of my young colleagues.
Of course, impostor syndrome is a clingy beast and, healthily, I hope to never get entirely over it – merely overcoming it enough to say “Do not worry, thee young candidate, thy doubts pave a path well trodden”.
A similar message is also part of the little-known gem of a guide “How to do Research at MIT AI Lab – Emotional Factors”, relevant far beyond its original lab. I recommend it to any Ph.D. student; the feedback from readers is unanimous.
Outside Academia: incredibly increased readability
After two post-docs, and curious to see what was out there in atypical paths, I took a turn out of purely academic research, first as an independent consultant, then recruited out of the blue by a start-up’s recruiter, and eventually doing my small share to help convince investors. I discovered there another facet of ISBA’s Savage Award: tremendous readability.
In Academia, the dominating metric of quality is the length of the publication list – a debate for another day. Outside of Academia, however, not all interlocutors know how remarkable is a JRSSB Read Paper, or an oral presentation at NIPS, or a publication in Nature.
This is where international learned societies, like ISBA, come into play: the awards they bestow can serve as headline-grabbing material in a biography, easily spotted. The interlocutors do not need to be familiar with the subtleties of Bayesian Analysis. All they see is a stamp of approval from an official association of this researcher’s peers. That, in itself, is enough of a quality metric to pass the first round of contact, raise interest, and get the chance to further the conversation.
First concrete example: the recruiter who contacted me for the start-up I joined in 2011 was tasked to find profiles for an Applied position. The Savage Award on the CV grabbed his attention, even though he had no inkling what Adaptive Sequential Monte Carlo Methods were, nor if they were immediately relevant to the start-up. Passing it to the start-up’s managers, they immediately changed focus and interviewed me for their Research track instead: a profile that was not what they were looking for originally, yet stood out enough to interest them for a position they had not thought of filling via a recruiter – and indeed a unique position that I would never have thought to find this way either!
Second concrete example, years later, hard at work in this start-up’s amazing team: investors were coming for a round of technical due diligence. Venture capital firms sent their best scientists-in-residence to dive deeply into the technical details of our research. Of course what matters in the end is, and forever will be, the work that is done and presented. Yet the Savage Award was mentioned in the first line of the biography that was sent ahead of time, as a salient point to give a strong first impression of our research team.
Advice to Ph.D. Candidates: apply, you are the world’s best expert on your topic
That may sound trivial, but the first piece of advice is: apply. Discuss with your advisor the possibility of putting your dissertation up for consideration. This might sound obvious to North-American students, whose educational system is rife with awards for high-performing students. Not so much in France, where those would be at odds with the sometimes over-present culture of égalité in the younger-age public education system. As a cultural consequence, few French Ph.D. students, even the most brilliant, would consider putting up their dissertation for consideration. I have been very lucky in that regard to benefit from the advice of a long-term Bayesian, who offered to send it for me (thanks again, Xi’an!). Not all students, regardless of how brilliant their work, are made aware of this possibility.
The second piece of advice, closely linked: do not underestimate the quality of your work. You are the foremost expert in the entire world on your Ph.D. topic. As discussed above, it is all too easy to see how advanced the maths wielded by your office-mate are, yet overlook the equally advanced maths you are juggling on a day-to-day basis, which are more familiar to you and whose limitations you know better than anyone else. Actually, knowing these very limitations is what proves you are an expert.
A word of thanks and final advice
Finally, a word of thanks. I have been incredibly lucky, throughout my career so far, to meet great people. My dissertation already had four pages of acknowledgements: I doubt the Bulletin’s editor would appreciate me renewing (and extending!) them here. They are just as heartfelt today as they were then. I must, of course, add ISBA and the Savage Award committee for their support, as well as all those who, by their generous donations, allow the Savage Fund to stay alive throughout the years.
Of interest to Ph.D. candidates, though, is one special mention of a dual tutelage system that I have seen successfully at work many times. The most senior, a professor with the deep knowledge necessary to steer the project, brings his endless founts of knowledge collected over decades, wrapped in hardened tough love. The youngest, a postdoc or fresh assistant professor, brings virtuosity, emulation and day-to-day patience. In my case they were Pr. Éric Moulines and Dr. Jimmy Olsson. That might be the final advice to a student: if you ever stumble, as many do, as I most surely did, because Ph.D. studies can be a hell of a roller-coaster to go through, reach out to the people around you and the joint set of skills they want to offer you. In combination, they can be amazing, and help you open doors that, in retrospect, can be worth all the effort.
Julien Cornebise, Ph.D.
Filed under: Kids, Statistics, University life Tagged: adaptive Monte Carlo, Bayesian Analysis, ISBA, ISBA Bulletin, Julien Cornebise, Savage award
The next Nordic-Baltic Biometric conference will take place in Reykjavik, next June, a few days after the O-Bayes 15 meeting in València. I will attend the conference as the organisers were kind enough to invite me to give a talk, with high hopes of taking a few days off to go hiking day and night! The registration is now open, as is the call for abstracts.
Filed under: Mountains, pictures, Statistics, Travel, University life Tagged: biometry, conference, hiking, Iceland, invited talk, Nordic-Baltic Biometric conference, O-Bayes 2015, Reykjavik
Today, I took a look at a recently arXived paper posted in physics, Lifting – A non reversible MCMC algorithm by Marija Vucelja, but I simply could not understand the concept of lifting. Presumably because of the physics perspective. And also because the paper is mostly a review, referring to the author’s earlier work. The notion of lifting is to create a duplicate of a regular Markov chain with a given stationary distribution, with the aim of cancelling reversibility and hence speeding up the exploration of the state space. The central innovation in the paper seems to be in imposing a lifted reversibility, which means using the reverse dynamics on the lifted version of the chain, that is, the dual proposal
However, the paper does not make explicit how the resulting Markov transition matrix on the augmented space is derived from the original matrix. I now realise my description is most likely giving the impression of two coupled Markov chains, which is not the case: the new setting is made of a duplicated sample space, in the sense of Nummelin’s split chain (but without the specific meaning for the binary variable found in Nummelin!). In the case of the 1-d Ising model, the implementation of the method means for instance picking a site at random, proposing to change its spin value by a Metropolis acceptance step and then, if the proposal is rejected, possibly switching to the corresponding value in the dual part of the state. Given the elementary proposal in the first place, I fail to see where the improvement can occur… I’d be most interested in seeing a version of this lifting in a realistic statistical setting.
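To make the idea of a duplicated state space concrete, here is a minimal Python sketch of one simple lifted chain on a finite one-dimensional state space. It is not the construction of the paper, only a direction-augmented Metropolis walk used as an illustration: the state x is paired with a direction d in {-1, +1}, the move to x + d is proposed and accepted with the usual Metropolis ratio, and the direction is flipped upon rejection. The function name, the target weights and the number of steps are all hypothetical choices made for this example.

```python
import random


def lifted_metropolis(weights, n_steps, seed=0):
    """Direction-augmented (lifted) Metropolis walk on {0, ..., K-1}.

    Illustrative sketch only, not the algorithm of the paper.
    `weights` are positive unnormalised target probabilities.
    Returns the empirical frequency of each state x.
    """
    rng = random.Random(seed)
    k = len(weights)
    x, d = 0, 1            # lifted state: position and direction
    counts = [0] * k
    for _ in range(n_steps):
        y = x + d          # deterministic proposal along the current direction
        inside = 0 <= y < k
        if inside and rng.random() < min(1.0, weights[y] / weights[x]):
            x = y          # accepted: keep moving the same way
        else:
            d = -d         # rejected: switch to the dual copy of the state
        counts[x] += 1
    return [c / n_steps for c in counts]


if __name__ == "__main__":
    # Hypothetical target proportional to (1, 2, 3, 4)
    print(lifted_metropolis([1, 2, 3, 4], 200_000, seed=1))
```

The chain on the pairs (x, d) is not reversible, yet a direct balance computation shows it keeps the target on x (times a uniform distribution on d) invariant: the probability flow entering (y, d) by acceptance from (y - d, d) and by rejection from (y, -d) adds up to the target weight of y. The persistence of the direction is what suppresses the back-and-forth moves of the plain random walk.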
Filed under: Books, Statistics, University life Tagged: arXiv, MCMC algorithms, reversible Markov chain