Bayesian News Feeds
inference in Kingman’s coalescent with pMCMC
As I was checking the recent stat postings on arXiv, I noticed the paper by Chen and Xie entitled inference in Kingman’s coalescent with pMCMC. (And surprisingly deposited in the machine learning subdomain.) The authors compare a pMCMC implementation for Kingman’s coalescent with importance sampling (à la Stephens & Donnelly), regular MCMC and SMC. The specificity of their pMCMC algorithm is that they simulate the coalescent times conditional on the tree structure and the tree structure conditional on the coalescent times (via SMC). The results reported in the paper consider up to five loci and agree with earlier experiments showing the poor performance of MCMC algorithms (based on the LAMARC software and apparently using independent proposals). They show similar performance for importance sampling and pMCMC. While I find this application of pMCMC interesting, I wonder about the generality of the approach: when I was introduced to ABC techniques, the motivation was that importance sampling was deteriorating very quickly with the number of parameters. Here it seems the authors only considered one parameter θ. I wonder what happens when the number of parameters increases. And how pMCMC would then compare with ABC.
Filed under: Books, Statistics, University life Tagged: ABC, Gibbs sampling, importance sampling, Kingman's coalescent, pMCMC, population genetics, simulation, SMC
the last great climb [#2]
Filed under: Mountains, pictures Tagged: Antarctica, climbing pictures, Posing Productions, Queen Maud, Ulvetanna
the mind of a con man
“The tone of his talks, he said, was “Let’s not talk about the plumbing, the nuts and bolts — that’s for plumbers, for statisticians.””
As I got a tablet last week and immediately subscribed to the New York Times, I started reading papers from recent editions and got to this long article of April 26, by Yudhijit Bhattacharjee on Diederik Stapel, the Dutch professor of psychology who used fake data in dozens of papers and PhD theses.
“In his early years of research — when he supposedly collected real experimental data — Stapel wrote papers laying out complicated and messy relationships between multiple variables. He soon realized that journal editors preferred simplicity.”
This article is rather puzzling in its presentation of the facts. While Stapel acknowledges making up the data that conveniently supported his theses, the journalist’s analysis is fairly ambivalent, for instance considering that faking data is a “lesser threat to the integrity of science than the massaging of data and selective reporting of experiments”. At the beginning of the article, Stapel is shown going back to places where his experiments were supposed to have taken place, but he “could not find a location that matched the conditions described in his experiment”, making it sound as if he had forgotten…
“Science is of course about discovery, about digging to discover the truth. But it is also communication, persuasion, marketing (…) People are on the road with their talk. With the same talk. It’s like a circus (…) They give a talk in Berlin, two days later they give the same talk in Amsterdam, then they go to London. They are traveling salesmen selling their story.”
The above quote from Stapel is even more puzzling, as if giving the same talk in different places were an unacceptable academic behaviour, on a par with faking data and plagiarism… I do give the same talk at several conferences and seminars, mostly to different people, and I do not see a problem with this. If I persist in this behaviour, it will get boring to people who see the same talk over and over, and it should lead to me not being invited to conferences or seminars any longer, but there is nothing unethical or unscientific in this. Another illustration of the ambivalence of both the character and the article. I frankly dislike this approach to fraud, a kind of “50 shades of lies”, where all academics fall under the suspicion that, one way or another, they also acted unethically and in their own interest rather than towards the advancement of Science…
Filed under: University life Tagged: Diederik Stapel, fake data, Holland, NYT, PhD, seminar, The New York Times, Tilburg
the last great climb
Last week, I got a “spam” from Posing Productions (‘climbing film fanatics’, whose dvd’s I particularly appreciate!) about their latest wallpapers. They are indeed superb and ‘Og’s readers may have to cope with a few of them in the coming week. First, here is one of the Ulvetanna climb by Leo Houlding and his team, a magnificent big wall in the Fenriskjeften Range, in Antarctica…
Filed under: Mountains, pictures Tagged: Antarctica, climbing pictures, dvd, Posing Productions, spams, Ulvetanna, wallpaper
teaching in English
A strange (if very French!) debate is taking place these days in the French main chamber, where some socialist deputies are contesting an incoming change in the regulation of university studies that would allow some courses to be taught in… English! Quelle horreur!!! Since this option has already been implemented by many universities, incl. Dauphine, it means that we all are acting outside the law! I do not fear in the least being indicted for teaching R and Bayesian statistics in English… However, I find the action of these deputies is missing the point: just like most other Western countries, we need to attract bright students from emerging countries in order to keep our departments open. It is unrealistic to think that those students will agree to learn French in addition to English, just because our universities are that attractive (and they are not!). Plus, our own students are asking for courses in English as they realise that their English level is not that great and that this training is more efficient than regular English courses… This position was better expressed in a Le Monde tribune a few days ago signed by several university professors, incl. Cédric Villani.
Filed under: Kids, Travel, University life Tagged: English, French, French universities, Le Monde, loi Toubon
back to Paris [#2]
detachment
One of the movies I watched during my hospitalisation is detachment, by Tony Kaye, with Adrien Brody as the lead actor. My daughter brought it to me as she remembered I was interested in it. detachment is a strong and highly original movie about the U.S. school system and the complete lack of prospects for the students in deprived suburbs. I have seen several movies of that kind in the past, some of them rather good and keeping away from the fairy tale that an exceptional teacher is enough to rescue a class cohort or even a single student from a bleak future. This one is however the most pessimistic of all, with no happy ending of any sort (except for the last minute that should have been cut). The plot is not flawless, e.g. the main teacher’s redemption of the young prostitute being just too unrealistic, but the burnout of the teachers, the newspeak preaching of the administration, the nihilism of the high school students, the bullying of unusual students, and the complete absence of the parents (unless I am confused we only see one [screaming] mother once, no parent shows up at parents’ night and the bullying father is only a voice…) make up for those flaws. Adrien Brody is delivering a superb performance in a great movie, sadly about a terrible issue with our educational system(s)…
Filed under: Books, Kids Tagged: Adrien Brody, detachment, high school, movie review
back to Paris
micro
“Indoctrinating children in proper environmental thought was a hallmark of the green movement.” M. Crichton, micro, p. ix
I believe I read most of Michael Crichton’s novels and this posthumous version (completed by Richard Preston) is not very different in its style and pattern from the previous ones. micro delivers an efficient fast-paced techno-thriller that filled most of one afternoon when convalescing at home. In that respect, it fills its intended role. I however feel this is one of the weakest novels in that the technological and scientific background is very poor. (Crichton’s best novels are in my opinion The Andromeda Strain and Airframe. One of the last novels, State of Fear, carries a very anti-environmentalist and climatoskeptic message similar to the above quote.)
“Perhaps the most important lesson to be learned by direct experience is that the natural world (…) represents a complex system and therefore we cannot understand it and we cannot predict its behavior. “ M. Crichton, micro, p. x
Indeed, the plot of micro is based on the assumption that there exists a technology that can miniaturise living and non-living objects to 1/100th of their original size without any short-term impact. I remember watching as a child Fantastic Voyage, where a miniaturised submarine goes inside a blood vessel to remove a tumor, and I sat in front of a neighbour’s TV, mesmerised by the idea more than by the (weak) plot. This was in the late 60s. I also remember a sci-fi book I read when a pre-teen, with a great cover, called The Forgotten Planet: nothing truly memorable, apart from the cover, but hey this was a 1954 book. Now, micro does not use a deeper theory to justify this miniaturisation and the remainder of the plot is just as weak: I cannot imagine 1/100th humans surviving more than a few minutes in a rain forest environment! The place is crawling with insects, all way faster and far more deadly than tiny humans with a pocket knife, but the heroes conveniently meet only one dangerous insect at a time, losing at most one member of the group each time (sorry for the spoiler!). (In fact, the earlier Prey was much better at involving nanotechnologies.) The grad students are very caricaturesque as well, providing biological infodump at times when they should be frozen solid with fright. Provided they had not been eaten already. The final resolution of the thriller is just… grotesque! So wait until you are sick or recovering from being sick before embarking upon this micro and not so fantastic trip!
Filed under: Books, Kids Tagged: book review, Michael Crichton, micro, science fiction, techno-thriller
i-like[d the] workshop
Indeed, I liked the i-like workshop very much. Among the many interesting talks of the past two days (incl. Cristiano Varin’s ranking of Series B as the top influential stat. journal!), Matti Vihola’s and Nicolas Chopin’s had the strongest impact on me (to the point of scribbling in my notebook). In a joint work with Christophe Andrieu, Matti focussed on evaluating the impact of replacing the target with an unbiased estimate in a Metropolis-Hastings algorithm. In particular, they found necessary and sufficient conditions for keeping geometric and uniform ergodicity. My question (asked by Iain Murray) was whether they had derived ways of selecting the number of terms in the unbiased estimator towards maximal efficiency. I also wonder if optimal reparameterisations can be found in this sense (since unbiased estimators remain unbiased after reparameterisation).
Nicolas’ talk was about particle Gibbs sampling, a joint paper with Sumeet Singh recently arXived. I did not catch the whole detail of their method but/as I got intrigued by a property of Mark Beaumont’s algorithm (the very same algorithm used by Matti & Christophe). Indeed, the notion is that an unbiased estimator of the target distribution can be found in missing variable settings by picking an importance sampling distribution q on those variables. This representation leads to a pseudo-target Metropolis-Hastings algorithm. In the stationary regime, there exists a way to derive an “exact” simulation from the joint posterior on (parameter, latent). All the remaining/rejected latents are then distributed from the proposal q. What I do not see is how this impacts the next MCMC move since it implies generating a new sample of latent variables. I spoke with Nicolas about this over breakfast: the explanation is that this re-generated set of latent variables can be used in the denominator of the Metropolis-Hastings acceptance probability and is validated as a Gibbs step. (Incidentally, it may be seen as a regeneration event as well.)
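To fix ideas about the pseudo-target construction, here is a minimal R sketch on a toy model of my own choosing, not the setting of either talk: latents z_i ~ N(θ,1), observations y_i|z_i ~ N(z_i,1), and a N(0,10²) prior on θ. The importance distribution q is assumed to be the N(θ,1) distribution of the latents, and all function names (log_lik_hat, pseudo_marginal_mh) are hypothetical. The likelihood estimate attached to the current value is stored and reused in the denominator, which is what distinguishes this scheme from a noisy Metropolis-Hastings algorithm.

```r
set.seed(42)
# Toy data: marginally y_i ~ N(theta, 2) since z_i ~ N(theta, 1), y_i | z_i ~ N(z_i, 1)
y <- rnorm(20, mean = 1, sd = sqrt(2))

# Log of an unbiased importance-sampling estimate of the likelihood:
# with q = N(theta, 1) on each latent, the weights reduce to dnorm(y_i, z, 1)
log_lik_hat <- function(theta, y, M = 50) {
  sum(vapply(y, function(yi) {
    z <- rnorm(M, theta, 1)          # M draws from q for this observation
    log(mean(dnorm(yi, z, 1)))       # average of the importance weights
  }, numeric(1)))
}

log_prior <- function(theta) dnorm(theta, 0, 10, log = TRUE)

pseudo_marginal_mh <- function(y, n_iter = 5000, step = 0.5, M = 50) {
  theta <- numeric(n_iter)
  cur <- 0
  cur_ll <- log_lik_hat(cur, y, M)   # stored with the current state, never refreshed
  for (t in seq_len(n_iter)) {
    prop <- cur + rnorm(1, 0, step)  # symmetric random-walk proposal
    prop_ll <- log_lik_hat(prop, y, M)
    log_ratio <- prop_ll + log_prior(prop) - cur_ll - log_prior(cur)
    if (log(runif(1)) < log_ratio) {
      cur <- prop
      cur_ll <- prop_ll
    }
    theta[t] <- cur
  }
  theta
}

draws <- pseudo_marginal_mh(y)
# Exact posterior mean for this conjugate toy model, for comparison
exact_mean <- (sum(y) / 2) / (length(y) / 2 + 1 / 100)
```

In this toy case the exact posterior is available, so mean(draws[-(1:500)]) can be compared with exact_mean. Increasing M reduces the variability of the estimate and should bring the acceptance rate closer to that of the ideal algorithm, at a proportional computing cost.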
Furthermore, I had a terrific run in the rising sun (at 5am) all the way to Kenilworth where I saw a deer, pheasants and plenty of rabbits. (As well as this sculpture that now appears to me as being a wee sexist…)
Filed under: Running, Statistics, Travel, University life Tagged: ABC, empirical likelihood, i-like, likelihood-free methods, Metropolis-Hastings algorithms, Padova, pseudo-target, simulation, University of Warwick
Warwickshire snapshot
i-like workshop [talk]
Here are the slides of my talk at the i-like workshop in Warwick today:
I am really glad I could make it there and meet with many (highly supportive) friends for three days! The slides are quite similar to those I presented in Padova. I just added a few perspective slides…
Filed under: Statistics, Travel, University life Tagged: ABC, empirical likelihood, i-like, likelihood-free methods, Padova, simulation, University of Warwick
the cartoon introduction to statistics
A few weeks ago, I received a copy of The Cartoon Introduction to Statistics by Grady Klein and Alan Dabney, sent by their publisher, Farrar, Straus and Giroux from New York City. (Never heard of this publisher previously, but I must admit the aggregation of those three names sounds great!) As this was an unpublished version of the book, to appear in July 2013, I first assumed my copy was a draft version, with black and white drawings using limited precision graphics. However, when checking the already published Cartoon Introduction to Economics, I realised this was the style of Grady Klein (as reflected below).
Thus, I have to assume this is what The Cartoon Introduction to Statistics will look like when published in July… I am quite perplexed by the whole project. First, I do not see how a newcomer to the field can learn better from a cartoon with an average of four sentences per page than from a regular introductory textbook. Cartoons introduce an element of fun into the explanation, with jokes and (irrelevant) side stories, but they are also distracting as readers are not always in a position to know what matters and what does not. Second, as the drawings are done in a rough style, I find this increases the potential for confusion. For instance, the above cover reproduces an example linking the histogram of a sample of averages and the normal distribution. If a reader has never heard of histograms, I do not see how he or she could gather how they are constructed in practice. The width of the bags is related to the number of persons in each bag (50 random Americans) in the story, while it should be related to the inverse of the square root of this number in the theory. Similarly, I find the explanation about confidence intervals lacking: when trying to reassure the readers about the fact that any given random sample from a population might be misleading, the authors state that “in the long run most cans [of worms] have averages in the clump under the hump [of the normal pdf]”. This is not reassuring at all: when using confidence intervals based on 10 or on 10⁵ normal observations, the corresponding 95% confidence intervals on their mean both have a 95% chance of containing the true mean. The long run aspect refers to the repeated use of those intervals. (I am not even mentioning the classical fallacy of stating that “we are 99.7% confident that the population average is somewhere between -1.73 and -0.27”…)
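The point about the long run can be checked by simulation. Below is a small R sketch (my own illustration, not taken from the book) that repeatedly draws normal samples and records how often the usual Student t interval covers the true mean. I use n=10 and n=1000 rather than 10⁵ to keep the run short, and the function name coverage is hypothetical.

```r
set.seed(1)
# Proportion of repeated 95% t-intervals that contain the true mean mu
coverage <- function(n, reps = 2000, mu = 0, sigma = 1, level = 0.95) {
  hits <- replicate(reps, {
    x <- rnorm(n, mu, sigma)
    # half-width of the Student t confidence interval on the mean
    half <- qt(1 - (1 - level) / 2, df = n - 1) * sd(x) / sqrt(n)
    abs(mean(x) - mu) <= half
  })
  mean(hits)
}

cov_small <- coverage(10)     # intervals based on 10 observations
cov_large <- coverage(1000)   # intervals based on 1000 observations
```

Both proportions should be close to 0.95, up to Monte Carlo error: the sample size changes the width of the intervals, not their coverage over repeated use.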
In conclusion, I remember buying an illustrated entry to Marx’ Das Kapital when I started economics in graduate school (as a minor). This gave me a very quick idea of the purpose of the book. However, I read through the whole book to understand (or try to understand) Marx’ analysis of the economy. And the introduction did not help much in this regard. In the present setting, we are dealing with statistics, not economics, not philosophy. Having read a cartoon about the average length of worms within a can of worms is not going to help much in understanding the Central Limit Theorem and the subsequent derivation of confidence intervals. The validation of statistical methods is done through mathematics, which provides a formal language cartoons cannot reproduce.
Filed under: Books, Kids, Statistics, University life Tagged: book review, cartoon, CHANCE, introductory textbooks, Statistics, textbooks
Le Monde puzzle [#820]
The current puzzle is… puzzling:
Given the set {1,…,N} with N<61, one iterates the following procedure: take (x,y) within the set and replace the pair with the smallest divider of x+y (bar 1). What are the values of N such that the final value in the set is 61?
I find it puzzling because the way the pairs are selected impacts the final value. Or not, depending upon N. Using the following code (with factors() from the pracma package):
    library(pracma)
    endof=function(N){
      coll=1:N
      for (t in 1:(N-1)){
        # pick two positions at random
        pair=sample(1:length(coll),2)
        # smallest prime factor of the sum
        dive=min(factors(sum(coll[pair])))
        # replace the pair with this divider
        coll=coll[-pair]
        coll=c(coll,dive)
      }
      print(dive)
    }

I got:

    > for (t in 1:10) endof(10)
    [1] 5
    [1] 3
    [1] 3
    [1] 5
    [1] 7
    [1] 5
    [1] 5
    [1] 7
    [1] 3
    [1] 3
    > for (t in 1:10) endof(16)
    [1] 2
    [1] 2
    [1] 2
    [1] 2
    [1] 2
    [1] 2
    [1] 2
    [1] 2
    [1] 2
    [1] 2

For N of the form 4k or 4k-1, the final number is always 2, while for N’s of the form 4k-2 and 4k-3, the final number varies, sometimes producing 61’s. Although I could not find solutions for N less than 17… Looking more closely into the sequence leading to 61, I could not see a pattern, apart from producing prime numbers, as in, e.g.,
61 = 2 + [12 + (4 + {14 + [13 + 16]})]
for N=17. (Another puzzle is that 61 plays no particular role: a long run of random calls to endof() returns all prime numbers up to 79…)
Update: Looking at the solution in today’s edition, there exists a solution for N=13 and a solution for N=14. Even though my R code fails to spot them. Of course, an exhaustive search would be feasible in these two cases. (I had also eliminated values below as not summing up to 61.) The argument for eliminating 4k and 4k-1 is that there must be an odd number of odd numbers in the collection, otherwise, the final number is always 2.
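The parity argument can be checked numerically. Here is a short R sketch that tracks the number of odd values along a random run: replacing two odd values gives an even sum, hence the divider 2 and two fewer odd values, while any other pair leaves the count unchanged, so its parity never moves. To keep the snippet self-contained, I replaced min(factors()) from pracma with a hypothetical base-R helper smallest_divisor(); run_once() is also a name of my own.

```r
# Smallest divisor of n larger than 1 (base-R stand-in for min(pracma::factors(n)))
smallest_divisor <- function(n) {
  if (n %% 2 == 0) return(2)
  d <- 3
  while (d * d <= n) {
    if (n %% d == 0) return(d)
    d <- d + 2
  }
  n  # n is an odd prime
}

# One random run of the puzzle, recording the number of odd values after each step
run_once <- function(N) {
  coll <- 1:N
  odd_counts <- sum(coll %% 2 == 1)
  while (length(coll) > 1) {
    pair <- sample(length(coll), 2)            # two distinct positions
    dive <- smallest_divisor(sum(coll[pair]))
    coll <- c(coll[-pair], dive)
    odd_counts <- c(odd_counts, sum(coll %% 2 == 1))
  }
  list(final = coll, odd_counts = odd_counts)
}

set.seed(820)
res16 <- run_once(16)  # 8 odd values at the start: the final value is always 2
res13 <- run_once(13)  # 7 odd values at the start: the final value is an odd prime
```

In both runs, odd_counts %% 2 stays constant from start to finish, which is why N of the form 4k or 4k-1 (an even number of odd values) can only end at 2.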
Filed under: Books, Kids, R Tagged: factors(), Le Monde, mathematical puzzle, pracma, prime numbers, R
Rによるモンテカルロ法入門
Here is the cover of the Japanese translation of our Introducing Monte Carlo methods with R book. A few years after the French translation. It actually appeared last year in August but I was not informed of this till a few weeks ago. The publisher is Maruzen, with an associated webpage if you want to order… Unless I am confused the translators are Hiro Ishida and Kazue Ishida; they deserve a major ありがとう ! And too bad George is no longer with us: this must have been the first translation of one of his books in Japanese.
Filed under: Books, R, Statistics Tagged: George Casella, Introducing Monte Carlo Methods with R, Japanese translation
awalé
Following Le Monde puzzle #810, I tried to code an R program (not reproduced here) to optimise an awalé game but the recursion was too rich for R:
    Error: evaluation nested too deeply: infinite recursion / options(expressions=)?

even with a very small number of holes and seeds in the awalé… Searching on the internet, it seems the computer simulation of a winning strategy for an awalé game still is an open problem! Here is a one-step R function that does not produce sure gains for the first player, far from it, as shown by the histogram below… I would need a less myopic strategy by iterating this function at least twice.
    onemorestep=function(x,side){
      # x: current state of the awalé
      # side: side of the awalé (0 vs 1)
      M=length(x);N=as.integer(M/2)
      rewa=rep(0,M)                # reward of each possible move
      newb=matrix(0,ncol=M,nrow=M) # board resulting from each move
      for (i in ((1:N)+N*side)){
        if (x[i]>0){
          y=x
          y[i]=0
          # sow the seeds of hole i one by one in the following holes
          for (t in 0:(x[i]-1))
            y[1+(i+t)%%M]=y[1+(i+t)%%M]+1
          last=1+(i+t)%%M
          if (side){
            gain=(last<=N)
          }else{
            gain=(last>N)}
          if (gain){# ending up on the right side
            rewa[i]=0
            # collect backwards while remaining on the opposite side
            while (((last>0)&&(side))||((last>N)&&(!side)))
              if ((y[last]==2)||(y[last]==3)){
                rewa[i]=rewa[i]+y[last];y[last]=0
                last=last-1
              }else{
                break}
          }
          newb[i,]=y
        }
      }
      if (max(rewa)>0){
        sol=order(-rewa)[1]
      }else{
        # no gain: pick a non-empty hole at random, favouring full ones
        sol=rang=((1:N)+N*side)[x[((1:N)+N*side)]>0]
        if (length(rang)>1) sol=sample(rang,1,prob=x[rang]^3)}
      return(list(reward=max(rewa),board=newb[sol,]))
    }

Filed under: Kids, pictures, R Tagged: awalé, infinite recursion, Le Monde, R
thumbleweed [local] news
It has been about a week since I left the hospital and went back home, trying to get back in shape by resting, eating (to gain back some of the lost kg’s), sharing with my family and exercising… I foolishly tried to get back to the university once and ended the day as a wreck (esp. as I had to walk the two k’s of avenue Foch, the Line 2 métro being out of order!). Anyway, I read a lot, went back to my favourite bakery in Sceaux, had chats with neighbours, got reunited with the stray cat, and enjoyed the May sunshine while it lasted. I want to take this opportunity to give my warmest thanks to all of you who sent me greetings and good wishes, who visited me at the hospital or sent me goodies—read all the books, ate most of the macaroons and chocolates! A very special thanks to my friends in the Statistics department at BYU, for their unbelievable support! And to my mom, who came every single day… As reported in the earlier post, the thumb is gone and the wound is slowly healing, although it will require several weeks before the dressings are off for good. (Which gives me a good reason to skip washing dishes!) I dearly hope I will get the green light from the surgeon (tomorrow) for attending the i-like workshop next Wednesday!
Filed under: Books, Kids, Mountains, Running Tagged: accident, BUY, cat, i-like, operation, recovery, surgery, thumb, Université Paris Dauphine, University of Warwick
.bzh
Just read in Le Monde today that the top-level domain .bzh had been validated yesterday. This domain is intended for all things Breton. Funny to think it succeeded before the bid for a .scot domain (that you can support here)!
Filed under: Statistics Tagged: Brittany, Icann, Scotland, top-level domain
Himalayan fight
“Today, Everest is too much of a business and there are too many heroes.” Simone Moro
I was reading in Le Monde yesterday about an ugly fight occurring between a team of alpine-style climbers (Ueli Steck, Simone Moro, and Jonathan Griffith) and the team of sherpas installing fixed ropes on the normal route to Everest in preparation for the hundreds of clients waiting at Base Camp. The sherpas apparently did not accept the parallel and faster climb of the three independent climbers to their tent at Camp 3, and also resented these climbers having completed the fixed rope equipment in a gesture of good will (?). When the latter came down to Camp 2 they were faced by a mob of 100 angry sherpas ready to lynch them and had to be evacuated… Obviously, I have no further details than those I read in various interviews, from Ueli Steck’s, to Simone Moro’s, to the sherpas’. So I cannot judge the responsibility of either side. However, facts are such that the team of three came close to being stoned to death and that it had to leave Base Camp under a death threat.
This awful story reflects very badly on how much money has perverted mountaineering on Everest: while Steck and his team-mates were working on a genuine mountaineering feat by climbing a new route as a three-person team, alpine-style, with no sherpa backup, the sherpas were working for half a dozen commercial companies and the millions of dollars behind (rates range from $50,000 to $100,000 per client!). Preventing climbers from climbing nearby (as long as they do not endanger anyone on the route) goes against the #1 mountaineering rule that mountains (and routes) do not belong to anyone, not even locals, and that faster teams should get priority. As shown in the book Into Thin Air, commercial expeditions have already demonstrated not caring about the #2 rule that one should bring assistance to anyone in danger: helping a perfect stranger down safely rather than bringing a $100,000 client to the top does not seem part of their equation. To be fair, Simone Moro also has commercial interests in the Himalayas through his helicopter rescue company, but I do not think this had anything to do with the current fight, besides being for the general “good” (this is arguable, though, given that it gives a false sense of safety to people who should not be there)…
Just a note on why I was shocked by this story: Ueli Steck is an amazing Swiss climber of Messner-ian class, who opened new routes in the Alps, Himalayas and Patagonia, often climbing them solo. (See Messner’s interview on Steck’s website, where he states that independent climbers are now perceived as parasites by sherpas.) One of his greatest feats so far is soloing the Heckmair route (the ultimate mountain climb in my opinion, see e.g. Joe Simpson’s failed attempt) on the Eiger Nordwand in 2 hours 47 minutes (it took Heckmair and his team three days in 1938).
Filed under: Mountains Tagged: Eiger, Everest, Joe Simpson, Nepal, Reinhold Messner, sherpas, Simone Moro, solo climbing, Ueli Steck


