Bayesian News Feeds
In the common room of the Department of Mathematics at the University of Warwick [same building as the Department of Statistics], there is a box for book exchanges and I usually take a look at each visit for a possible exchange. In October, I thus picked Jo Nesbø’s The Redbreast in exchange for maybe The Rogue Male. However, it stood on my office bookcase for another three months before I found time to read this early (2000) instalment in the Harry Hole series. With connections with the earliest Redeemer.
This is a fairly good if not perfect book, with a large opening into Norway’s WW II history and the volunteers who joined Nazi Germany to fight on the Eastern Front. And the collaborationist government of Vidkun Quisling. I found most interesting this entry into this period and the many parallels with French history at the same time. (To the point that quisling is now a synonym for collaborator, similar to pétainiste in French.) This historical background has some similarities with Camilla Läckberg’s Hidden Child I read a while ago but on a larger and broader scale. Reminiscences and episodes from 1940-1944 take a large part of the book. And rightly so, as the story during WW II explains a lot of the current plot. While this may sound like an easy story-line, the plot also dwells a lot on skinheads and neo-Nazis in Oslo. While Hole’s recurrent alcoholism irks me in the long run (more than Rebus’ own alcohol problem, for some reason!), the construction of the character is quite well-done, along with a reasonable police force, even though both Hole’s inquest and the central crime of the story are stretching on and beyond belief, with too many coincidences. And a fatal shot by the police leads to very little noise and investigation, in a country where the murder rate is one of the lowest in the world and police officers do not carry guns. Except in Nesbø’s novels! Still, I did like the novel to the point of spending most of a Sunday afternoon on it, with the additional appeal of most of it taking place in Oslo. Definitely a page turner.
Filed under: Books, Travel, University life Tagged: book rev, Norway, Oslo, Pétain, pétainiste, Quissling, Rødstrupe, WW II
In what could have been the most expensive raclette ever, I almost got rid of my oven! Last weekend, to fight the ongoing cold wave, we decided to have a raclette with mountain cheese and potatoes, but the raclette machine (mostly a resistance to melt the cheese) had an electric issue and kept blowing the meter. We then decided to use the oven to melt the cheese but, while giving all signs of working, it would not heat. Rather than a cold raclette, we managed with the microwave (!), but I thought the oven had blown as well. The next morning, I still checked on the web for similar accidents and found the explanation: by pressing the proper combination of buttons, we had succeeded in switching the oven into the demo mode, used by shops to run the oven with no heating. The insane part of this little [very little] story is that nowhere in the manual appeared any indication of an existing demo mode and of a way of getting back to normal! After pushing combinations of buttons at random, I eventually got the solution and the oven is again working, instead of standing in the recycling bin.
Filed under: Kids, Wines Tagged: cooking, electronics, kitchen, manual, oven, raclette
“In fact, there are two countries that have invaded Russia: France and Germany…”
Hartig et al. published a while ago (2011) a paper in Ecology Letters entitled “Statistical inference for stochastic simulation models – theory and application”, which is mostly about ABC. (Florian Hartig pointed out the paper to me in a recent blog comment about my discussion of the early parts of Gutmann and Corander’s paper.) The paper is largely a tutorial and it reminds the reader about related methods like indirect inference and the method of moments. The authors also insist on presenting ABC as a particular case of likelihood approximation, whether non-parametric or parametric. Making connections with pseudo-likelihood and pseudo-marginal approaches. And including a discussion of the possible misfit of the assumed model, handled by an external error model. And also introducing the notion of informal likelihood (which could have been nicely linked with empirical likelihood). A last class of approximations presented therein is called rejection filters and reminds me very much of Ollie Ratmann’s papers.
“Our general aim is to find sufficient statistics that are as close to minimal sufficiency as possible.” (p.819)
As in other ABC papers, and as often reported on this blog, I find the stress on sufficiency a wee bit too heavy as those models calling for approximation almost invariably do not allow for any form of useful sufficiency. Hence the mathematical statistics notion of sufficiency is mostly useless in such settings.
“A basic requirement is that the expectation value of the point-wise approximation of p(Sobs|φ) must be unbiased” (p.823)
As stated above, the paper is mostly in tutorial mode, for instance explaining what MCMC and SMC methods are. As illustrated by the above figure. There is however a final and interesting discussion section on the impact of estimating the likelihood function at different values of the parameter. The authors nonetheless seem to focus solely on pseudo-marginal results to validate this approximation, hence on unbiasedness, which does not hold for most ABC approaches that I know, nor for the approximations listed in the survey. Actually, it would be quite beneficial to devise a cheap tool to assess the bias or extra-variation due to the use of approximate techniques like ABC… A sort of 21st Century bootstrap?!
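To make the notion of extra-variation concrete, here is a minimal R sketch on a toy problem where the exact posterior is available, so that the ABC output can be compared with it. Everything in it is an assumption made for illustration and not taken from the paper: a normal mean model with unit variance, a N(0, tau²) prior, the sample mean as summary statistic, and a fixed tolerance eps.

```r
# Toy ABC rejection sampler versus the exact posterior (illustrative assumptions only)
set.seed(42)
n     <- 20                      # sample size
x_obs <- rnorm(n, mean = 1)      # pseudo-observed data from N(1, 1)
s_obs <- mean(x_obs)             # summary statistic: the sample mean
tau   <- 2                       # prior standard deviation, theta ~ N(0, tau^2)

# exact conjugate posterior of theta
post_var  <- 1 / (n + 1 / tau^2)
post_mean <- post_var * n * s_obs

# ABC rejection: keep the prior draws whose simulated summary falls within eps of s_obs
N     <- 1e5                     # number of prior simulations
eps   <- 0.5                     # tolerance
theta <- rnorm(N, mean = 0, sd = tau)
s_sim <- rnorm(N, mean = theta, sd = 1 / sqrt(n))  # sample mean simulated directly
abc   <- theta[abs(s_sim - s_obs) < eps]

c(exact_mean = post_mean, abc_mean = mean(abc))
c(exact_var  = post_var,  abc_var  = var(abc))
```

In this toy setting the ABC sample should be centred close to the exact posterior mean while showing a visibly larger variance, the inflation being driven by the tolerance. No such exact benchmark exists in realistic ABC settings, which is exactly why a cheap assessment tool would be welcome.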
Filed under: Books, Statistics, University life Tagged: ABC, ABC validation, Bayesian optimisation, non-parametrics, sufficiency, synthetic likelihood
Subhash Lele recently arXived a short paper entitled “Is non-informative Bayesian analysis appropriate for wildlife management: survival of San Joaquin Kit fox and declines in amphibian populations”. (Lele has been mentioned several times on this blog in connection with his data-cloning approach that mostly clones our own SAME algorithm.)
“The most commonly used non-informative priors are either the uniform priors or the priors with very large variances spreading the probability mass almost uniformly over the entire parameter space.”
The main goal of the paper is to warn, even better “to disabuse the ecologists of the notion that there is no difference between non-informative Bayesian inference and likelihood-based inference and that the philosophical underpinnings of statistical inference are irrelevant to practice.” The argument advanced by Lele is simply that two different parametrisations should lead to two compatible priors and that, if they do not, this exhibits an unacceptable impact of the prior modelling on the resulting inference, while likelihood-based inference [obviously] does not depend on parametrisation.
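As a reminder of what compatibility means here, and in generic notation that is mine rather than Lele's: for a one-to-one reparametrisation ψ = g(θ), the likelihood is simply re-expressed, while a prior density picks up a Jacobian term,

$$
L^{*}(\psi) = L\big(g^{-1}(\psi)\big),
\qquad
\pi^{*}(\psi) = \pi\big(g^{-1}(\psi)\big)\,\left|\frac{\mathrm{d}\,g^{-1}(\psi)}{\mathrm{d}\psi}\right| .
$$

For instance, a uniform prior on a probability p corresponds to a standard logistic density on ψ = logit(p),

$$
\pi^{*}(\psi) = \frac{e^{\psi}}{(1+e^{\psi})^{2}},
$$

and not to a flat prior on ψ. Choosing "flat" priors independently in two parametrisations therefore produces two different Bayesian models.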
The first example in the paper is a dynamic linear model of a fox population series when using a uniform U(0,1) prior on a parameter b against a Ga(100,100) prior on -a/b. (The normal prior on a is the same in both cases.) I do not find the opposition between the two posteriors in the least surprising as the modelling starts by assuming different supports on the parameter b. And both are highly “informative” in that there is no intrinsic constraint on b that could justify the (0,1) support, as illustrated by the second choice when b is unconstrained, varying on (-15,15) or (-0.0015,0.0015) depending on how the Ga(100,100) prior is parametrised.
and the paper opposes a uniform prior on p,q to a normal N(0,10^3) prior on the logit transforms of p and q. [With an obvious typo at the top of page 10.] As shown on the above graph, the two priors on p are immensely different, so should lead to different posteriors in a weakly informative setting such as a Bernoulli experiment. Even with a few hundred individuals. A somewhat funny aspect of this study is that Lele opposes the uniform prior to the Jeffreys Be(.5,.5) prior as being “nowhere close to looking like what one would consider a non-informative prior”, without noticing that the logit parametrisation normal prior leads to an even more peaked prior…
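A quick way to see how peaked the three priors are on the p scale is to compute the prior mass each one puts within 0.01 of the boundaries. The R sketch below does so under my own assumptions: 10^3 is read as the variance of the normal prior (reading it as a standard deviation would only make the logit-normal prior more extreme), and the 0.01 cut-off is arbitrary.

```r
# Prior mass on {p < 0.01} or {p > 0.99} under three priors on a probability p
# (assumption: N(0, 10^3) denotes a normal prior with variance 10^3 on logit(p))
tail_mass <- function(cdf) cdf(0.01) + (1 - cdf(0.99))

unif <- tail_mass(function(p) punif(p))                    # uniform U(0,1)
jeff <- tail_mass(function(p) pbeta(p, 0.5, 0.5))          # Jeffreys Be(.5,.5)
lgt  <- tail_mass(function(p) pnorm(qlogis(p), mean = 0,   # normal prior on logit(p)
                                    sd = sqrt(1e3)))

round(c(uniform = unif, jeffreys = jeff, logit_normal = lgt), 3)
```

Under these assumptions, the logit-normal prior concentrates most of its mass in these two tiny boundary regions, far more than the Jeffreys prior does, which itself puts more mass there than the uniform.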
“Even when Jeffreys prior can be computed, it will be difficult to sell this prior as an objective prior to the jurors or the senators on the committee. The construction of Jeffreys and other objective priors for multi-parameter models poses substantial mathematical difficulties.”
I find it rather surprising that a paper can be dedicated to the comparison of two arbitrary prior distributions on two fairly simplistic models towards the global conclusion that “non-informative priors neither ‘let the data speak’ nor do they correspond (even roughly) to likelihood analysis.” In this regard, the earlier critical analysis of Seaman et al., to which my PhD student Kaniav Kamary and I replied, had a broader scope.
Filed under: Books, pictures, Statistics, University life Tagged: data cloning, non-informative priors, SAME algorithm
A question on Cross Validated led me to realise I had never truly considered the issue of periodic Gibbs samplers! In MCMC, periodic chains are a minor nuisance in that the skeleton trick of randomly subsampling the Markov chain leads to an aperiodic Markov chain. (The picture relates to the skeleton!) Intuitively, while the systematic Gibbs sampler has a tendency to non-reversibility, it seems difficult to imagine a sequence of full conditionals that would force the chain away from the current value..! In the discrete case, given that the current state of the Markov chain has positive probability for the target distribution, the conditional probabilities are all positive as well and hence the Markov chain can stay at its current value after one Gibbs cycle, with positive probability, which means strong aperiodicity. In the continuous case, a similar argument applies by considering a neighbourhood of the current value. (Incidentally, the same person asked a question about the absolute continuity of the Gibbs kernel. Being confused by our chapter on the topic!!!)
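The discrete argument can be checked numerically on a small example. The R sketch below uses a hypothetical 3×3 joint distribution (my own choice, with two zero cells) and computes, for a systematic-scan Gibbs sampler updating x then y, the probability of returning to the current state (x,y) after one full cycle, namely p(x|y)p(y|x).

```r
# Hypothetical joint pmf on {1,2,3} x {1,2,3}; rows index x, columns index y
P <- matrix(c(0.20, 0.10, 0.00,
              0.05, 0.25, 0.10,
              0.00, 0.10, 0.20), nrow = 3, byrow = TRUE)
stopifnot(abs(sum(P) - 1) < 1e-12)

# full conditionals, stored so that entry [x, y] is p(x | y) or p(y | x)
px_given_y <- sweep(P, 2, colSums(P), "/")
py_given_x <- sweep(P, 1, rowSums(P), "/")

# one Gibbs cycle from (x, y): draw x' ~ p(. | y), then y' ~ p(. | x')
# probability of ending the cycle at the starting state (x, y)
stay <- px_given_y * py_given_x
round(stay, 3)
```

Every state with positive target probability has a positive probability of being kept after one cycle, so the chain restricted to the support of the target cannot be periodic.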
Filed under: Books, Kids, pictures, Statistics, Travel, University life Tagged: aperiodicity, convergence, cross validated, Gibbs sampler, Markov chain, MCMC algorithms, Monte Carlo Statistical Methods, skeleton chain
A study [re]published three days ago in both The New York Times and the BBC The Guardian reproduced the conclusion of an article in the Journal of the American College of Cardiology that strenuous and long-distance jogging (or more appropriately running) could have a negative impact on longevity! And that the best pace is around 8km/h, just above a brisk walk! Quite depressing… However, this was quickly followed by other articles, including this one in The New York Times, pointing out the lack of statistical validation in the study and the ridiculously small number of runners in the study. I am already feeling better (and ready for my long run tomorrow morning!), but appalled all the same by the lack of standards of journals publishing statistically void studies. I know, nothing new there…
Filed under: Running, Statistics Tagged: long distance running, medical studies, running injury, statistical significance
The University of Warwick is one of the five UK Universities (Cambridge, Edinburgh, Oxford, Warwick and UCL) to be part of the new Alan Turing Institute. To quote from the University press release, “The Institute will build on the UK’s existing academic strengths and help position the country as a world leader in the analysis and application of big data and algorithm research. Its headquarters will be based at the British Library at the centre of London’s Knowledge Quarter.” The Institute will gather researchers from mathematics, statistics, computer sciences, and connected fields towards collegial and focussed research, which means in particular that it will hire a fairly large number of researchers in stats and machine-learning in the coming months. The Department of Statistics at Warwick was strongly involved in answering the call for the Institute and my friend and colleague Mark Girolami will be the University’s leading figure at the Institute, alas meaning that we will meet even less frequently! Note that the call for the Chair of the Alan Turing Institute is now open, with deadline on March 15. [As a personal aside, I find that the recognition by the Business Secretary that “Alan Turing’s genius played a pivotal role in cracking the codes that helped us win the Second World War. It is therefore only right that our country’s top universities are chosen to lead this new institute named in his honour.” does not absolve the legal system that drove Turing to suicide….]
Filed under: Books, pictures, Running, Statistics, University life Tagged: Alan Turing, Alan Turing Institute, British Library, London, UCL, United Kingdom, University of Cambridge, University of Edinburgh, University of Oxford, University of Warwick
This (early) summer, a conference on missing data will be organised in Rennes, Brittany, with the support of the French Statistical Society [SFDS]. (Check the website if interested, Rennes is a mere two hours from Paris by fast train.)
Filed under: R, Statistics, Travel, University life Tagged: Brittany, conference, France, missing data, Rennes, Roderick Little, TGV
I just arXived my comments about A. Ronald Gallant’s “Reflections on the Probability Space Induced by Moment Conditions with Implications for Bayesian Inference”, capitalising on the three posts I wrote around the discussion talk I gave at the 6th French Econometrics conference last year. Nothing new there, except that I may get a response from Ron Gallant as this is submitted as a discussion of his related paper in Journal of Financial Econometrics. While my conclusion is rather negative, I find the issue of setting prior and model based on a limited amount of information of much interest, with obvious links with ABC, empirical likelihood and other approximation methods.
Filed under: pictures, Statistics, University life Tagged: 6th French Econometrics conference, ABC, empirical likelihood, limited information inference, measure theory, moment prior, Ron Gallant
An arithmetics Le Monde mathematical puzzle:
For which n’s are the averages of the first n squared integers integers? Among those, which ones are perfect squares?
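An easy R code, for instance

    n=10^3
    # squares of the first n integers
    car=as.integer(as.integer(1:n)^2)
    # integer part of the running averages
    sumcar=as.integer((cumsum(car)%/%as.integer(1:n)))
    # remainder of the division of the running sums by 1:n
    diff=as.integer(as.integer(cumsum(car))-as.integer(1:n)*sumcar)
    # values of n for which the average is an integer
    print((1:n)[diff==0])
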
which produces 333 values 1 5 7 11 13 17 19 23 25 29 31 35 37 41 43 47 49 53  55 59 61 65 67 71 73 77 79 83 85 89 91 95 97 101 103 107  109 113 115 119 121 125 127 131 133 137 139 143 145 149 151 155 157 161  163 167 169 173 175 179 181 185 187 191 193 197 199 203 205 209 211 215  217 221 223 227 229 233 235 239 241 245 247 251 253 257 259 263 265 269  271 275 277 281 283 287 289 293 295 299 301 305 307 311 313 317 319 323  325 329 331 335 337 341 343 347 349 353 355 359 361 365 367 371 373 377  379 383 385 389 391 395 397 401 403 407 409 413 415 419 421 425 427 431  433 437 439 443 445 449 451 455 457 461 463 467 469 473 475 479 481 485  487 491 493 497 499 503 505 509 511 515 517 521 523 527 529 533 535 539  541 545 547 551 553 557 559 563 565 569 571 575 577 581 583 587 589 593  595 599 601 605 607 611 613 617 619 623 625 629 631 635 637 641 643 647  649 653 655 659 661 665 667 671 673 677 679 683 685 689 691 695 697 701  703 707 709 713 715 719 721 725 727 731 733 737 739 743 745 749 751 755  757 761 763 767 769 773 775 779 781 785 787 791 793 797 799 803 805 809  811 815 817 821 823 827 829 833 835 839 841 845 847 851 853 857 859 863  865 869 871 875 877 881 883 887 889 893 895 899 901 905 907 911 913 917  919 923 925 929 931 935 937 941 943 947 949 953 955 959 961 965 967 971  973 977 979 983 985 989 991 995 997
which are made of all odd integers that are not multiples of 3. (I could have guessed the exclusion of even numbers since the numerator is always odd. Why are the triplets excluded, now?! Jean-Louis Fouley gave me the answer: the sum of squares is such that
$$\sum_{k=1}^{m} k^2 = \frac{m(m+1)(2m+1)}{6}, \qquad\text{i.e.}\qquad \frac{1}{m}\sum_{k=1}^{m} k^2 = \frac{(m+1)(2m+1)}{6},$$
and hence m must be odd and (m+1)(2m+1) a multiple of 3, which excludes multiples of 3.)
with the final result

    > sum(scar==0)
    [1] 2
    > ((1:n)[diff==0])[scar==0]
    [1]   1 337

since 38025=195² is a perfect square. (I wonder if there is a plain explanation for that result!)
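For completeness, both questions can also be answered in a few lines from the closed form (n+1)(2n+1)/6 of the average. This is a sketch of mine, not the code behind the output above, and all variable names are hypothetical.

```r
# Average of the first n squares via the closed form (n+1)(2n+1)/6
n      <- 1:1000
num    <- (n + 1) * (2 * n + 1)        # six times the average
is_int <- num %% 6 == 0                # is the average an integer?
n_int  <- n[is_int]                    # solutions to the first question
avg    <- num[is_int] %/% 6            # the corresponding integer averages

# perfect-square check: compare each average with its rounded square root, squared
root   <- round(sqrt(avg))
n_sq   <- n_int[root^2 == avg]         # solutions to the second question

length(n_int)   # 333
n_sq            # 1 337
```

The numbers involved stay far below 2^53, so the double-precision arithmetic and the rounded square root are exact here. For much larger n, an integer-only check would be safer.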
Filed under: Books, Kids, Statistics, University life Tagged: arithmetics, Jean-Louis Fouley, Le Monde, mathematical puzzle, perfect square, R