## Bayesian News Feeds

### Dom Juan’s opening

**T**he opening lines of Molière’s Dom Juan, a play with highly subversive undertones about free will and religion. And an ode to tobacco that may get it banned in Australia, if the recent deprogramming of Bizet’s Carmen is setting a trend! *[Personal note to Andrew: neither Molière’s nor my research is or was supported by a tobacco company! Although I am not 100% sure about Molière…]*

*“Quoi que puisse dire Aristote et toute la philosophie,* *il n’est rien d’égal au tabac: c’est la passion des honnêtes gens,* *et qui vit sans tabac n’est pas digne de vivre. Non seulement il* *réjouit et purge les cerveaux humains, mais encore il instruit* *les âmes à la vertu, et l’on apprend avec lui à devenir honnête homme.”*

Dom Juan, Molière, 1665

*[Whatever Aristotle and the whole of philosophy may say, there is nothing equal to tobacco; it is the passion of honest men, and whoever lives without tobacco does not deserve to live. Not only does it rejoice and purge the human brain, but it also instructs souls in virtue, and with it one learns to become a gentleman.]*

Filed under: Books, Kids Tagged: 17th Century theatre, Aristotle, Australia, Bizet, Carmen, Dom Juan, French literature, Molière, opera, tobacco

### a mad afternoon!

**A**n insanely exciting final day and end to the 2015 Six Nations tournament! In the first game of the afternoon, Wales beat Italy in Rome by a sound 20-61, turning them into likely champions. But then, right after, Ireland won against Scotland 10-40 in mythical Murrayfield. A feat that made them winners unless England won over France in Twickenham by at least 26 points. Which did not happen, in a completely demented rugby game, a game of anthology where England dominated but France was much more inspired (if as messy as usual) than in the past games and fought fair and well, managing to lose 35-55 and hence block an English victory of the Six Nations. Which can be considered as a victory of sorts…! Absolutely brilliant ending.

Filed under: pictures, Running, Travel Tagged: England, France, Ireland, Italy, Murrayfield, rugby, Six Nations 2015, Six Nations tournament, Twickenham, Wales

### more gray matters

### Gray matters [not much, truly]

**T**hrough the blog of Andrew Jaffe, Leaves on the Lines, I became aware of John Gray‘s tribune in The Guardian, “What scares the new atheists“. Gray’s central points against “campaigning” or “evangelical” atheists are that their claim to scientific backup is baseless, that they mostly express a fear about the diminishing influence of the liberal West, and that they cannot produce an alternative form of morality. The title already put me off, and the beginning of the tribune just got worse, as it goes on and on about the eugenic tendencies of some 1930s atheists and on how they influenced Nazi ideology. It is never a good sign in a debate when the speaker strives to link the opposite side with National Socialist ideas and deeds. Even less so in a supposedly philosophical tribune! (To add insult to injury, Gray also brings Karl Marx into the picture with a similar blame for ethnocentrism…)

*“What today’s freethinkers want is freedom from doubt.”*

Besides this fairly unpleasant use of demeaning rhetoric, I am bemused by the arguments in the tribune. Especially when considering they come from an academic philosopher. At their core, Gray’s arguments meet earlier ones, namely that atheism has all the characteristics of a religion, in particular when preaching or “proselytising”. Except that it cannot define its own brand of morality. And that western atheism is deeply dependent on Judeo-Christian values. The last point is hardly defensible, as the Greek origins of philosophy can attest. So calling in Nietzsche to the rescue is not exactly necessary. But the remainder of Gray’s discourse is not particularly coherent. If arguments for atheism borrow from the scientific discourse, it is because no rational argument or scientific experiment can contribute to support the existence of a deity. That pro-active atheists argue more visibly against religions is a reaction against the rise and demands of those religions. Similarly, that liberalism (an apparently oversold and illusory philosophy) and atheism seem much more related now than they were in the past can be linked with the growing number of tyrannical regimes based upon religion. Lastly, the morality argument (which is rather convincingly turned upside down by Dawkins) does not sell that well. Societies have run under evolving sets of rules, all called morality, that can be seen as a constituent of human evolution: there is no reason to buy that those rules were and will all be acceptable solely on religious grounds. Morality and immorality are only such in the eye of the beholder (and the guy next door).

Filed under: Books, University life Tagged: atheism, free will, Islington, John Gray, Karl Marx, liberalism, London, Richard Dawkins, The Guardian, United Kingdom

### The synoptic problem and statistics [book review]

**A** book that came to me for review in CHANCE, and that came completely unannounced, is Andris Abakuks’ The Synoptic Problem and Statistics. “Unannounced” in that I had not so far heard of the synoptic problem. This problem is one of ordering and connecting the gospels in the New Testament, more precisely the “synoptic” gospels attributed to Mark, Matthew and Luke, since the fourth canonical gospel, of John, is considered by experts to be posterior to those three. By considering overlaps between those texts, some statistical inference can be conducted, and the book covers (some of?) those statistical analyses for different orderings of ancestry in authorship. My overall reaction after a quick perusal of the book over breakfast (sharing bread and fish, of course!) was to wonder why no mention was made of a more global, if potentially impossible, approach via a phylogenetic tree, considering the three (or more) gospels as current observations and tracing their unknown ancestry back just as in population genetics. Not because ABC could then be brought into the picture. Rather because it sounds to me (with my complete lack of expertise in this field!) more realistic to postulate that those gospels were not written by a single person. Or at a single period in time. But rather that they evolved, like genetic mutations, across copies and transmissions until they acquired a sort of official status.

*“Given the notorious intractability of the synoptic problem and the number of different models that are still being advocated, none of them without its deficiencies in explaining the relationships between the synoptic gospels, it should not be surprising that we are unable to come up with more definitive conclusions.” (p.181)*

The book by Abakuks goes instead through several modelling directions, from logistic regression using variable length Markov chains [to predict agreement between two of the three texts by regressing on earlier agreement] to hidden Markov models [representing, e.g., Matthew’s use of Mark], to various independence tests on contingency tables, sometimes bringing into the model an extra source denoted by Q. Including some R code for hidden Markov models. Once again, from my outsider viewpoint, this fragmented approach to the problem sounds problematic and inconclusive. And rather verbose in its extensive discussions of descriptive statistics. Not that I was expecting a sudden Monty Python-like ray of light and booming voice to disclose the truth! Or that I crave more p-values (some may be found hiding within the book). But I still wonder about the phylogeny… Especially since phylogenies are used in text authentication, as pointed out to me by Robin Ryder for Chaucer’s Canterbury Tales.
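Out of curiosity, here is a minimal Python sketch (entirely my own toy numbers, not the book’s R code) of the kind of two-state hidden Markov model that could represent such agreement patterns, with the forward algorithm returning the log-likelihood of a binary agreement sequence:

```python
import numpy as np

# Hypothetical illustration: a two-state HMM for a binary sequence of
# "agreement" indicators between two gospel texts
# (1 = the texts agree at a given position, 0 = they differ).

A = np.array([[0.9, 0.1],    # state transition matrix
              [0.2, 0.8]])
B = np.array([[0.2, 0.8],    # P(obs=0 | state), P(obs=1 | state)
              [0.7, 0.3]])
pi = np.array([0.5, 0.5])    # initial state distribution

def forward_loglik(obs, A, B, pi):
    """Log-likelihood of an observation sequence via the forward
    algorithm, with per-step normalisation to avoid underflow."""
    alpha = pi * B[:, obs[0]]
    loglik = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        loglik += np.log(alpha.sum())
        alpha /= alpha.sum()
    return loglik

obs = np.array([1, 1, 1, 0, 0, 1, 1, 1, 0, 1])  # made-up agreement indicators
print(forward_loglik(obs, A, B, pi))
```

The two states could then be read as “copying” versus “independent composition”, and likelihoods for competing orderings of the texts compared on this basis.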

Filed under: Books, R, Statistics, University life, Wines Tagged: ABC, Andris Abakuks, author identification, Bible studies, CHANCE, Geoffrey Chaucer, hidden Markov models, linguistics, Monty Python, New Testament, phylogenetic model, synoptic gospel, University of Canterbury

### Significance and artificial intelligence

**A**s my sorry excuse of an Internet provider has been unable to fix my broken connection for several days, I had more time to read and enjoy the latest Significance I received last week. Plenty of interesting entries, once again! Even though, faithful to my idiosyncrasies, I must definitely criticise the cover (but you may also skip to the end of the paragraph!): it shows a pile of exams higher than the page frame on a student table in a classroom, with a vague silhouette sitting behind the exams. I do not know whether or not this is intentional, but the silhouette has definitely been added to the original picture (and presumably the exams as well!), because the seat and blackboard behind this silhouette show through it. If this is intentional, does that mean that the poor soul grading this endless pile of exams has long turned into a wraith?! If not intentional, that’s poor workmanship for a magazine usually adept at making the most of the graphical side. (And then I could go on and on about the clearly independent choice of illustrations by the managing editor rather than by the author(s) of the article…) End of the digression! Or maybe not, because there also was an ugly graph from *Knowledge is Beautiful* about the causes of plane crashes that made pie-charts look great… Not that all the graphs in the book are bad, far from it!

*“The development of full artificial intelligence could spell the end of the human race.” S. Hawking*

The central theme of the magazine is artificial intelligence (and machine learning). A point I wanted to mention in a post following the recent doom-like messages of Gates and Hawking about AIs taking over humanity, à la Blade Runner… or in Turing’s test. As if they had not already impacted our life so much and in so many ways. And not all positive or for the common good. Witness the ultra-fast codes on the stock market. Witness the self-replicating and modifying computer viruses. Witness the increasingly autonomous military drones. Or witness my silly Internet issue, where I cannot get hold of a person who can tell me what the problem is and what the company is doing to solve it (if anything!), but instead have to listen to endless phone automata that tell me to press “1 if…” and “3 else”, and that my incident ticket was last updated three days ago… But at the same time, the tone of The Independent tribune by Hawking, Russell, Tegmark, and Wilczek is somewhat misguided, if I may object to such luminaries!, playing on science-fiction themes that have been repeated so many times that they are now ingrained, rather than on strong scientific arguments. Military robots that could improve themselves to the point of evading their designers are surely frightening, but much less realistic than a nuclear reaction that could not be stopped in a Fukushima plant. Or than the long-term impacts of genetically modified crops and animals. Or than the current proposals of climate engineering. Or than emerging nano-particles.

*“If we build systems that are game-theoretic or utility maximisers, we won’t get what we’re hoping for.” P. Norvig*

The discussion of this scare in Significance does not contribute much, in my opinion. It starts with the concept of a perfect Bayesian agent, supposedly the state of an AI creating paperclips, which (who?) ends up using the entire Earth’s resources to make more paperclips. The other articles in this cover story are more relevant, as for instance on how AI moved from pure logic to statistical or probabilistic intelligence. With Yee Whye Teh discussing Bayesian networks and the example of Google translation (including a perfect translation into French of an English sentence).

Filed under: Books, Kids, pictures, Statistics, University life Tagged: artificial intelligence, bad graph, Bayesian network, Blade Runner, computer virus, cover, exams, Fukushima Daiichi, genetically modified crops, Knowledge is Beautiful, machine learning, nanoparticles, Stephen Hawking, Turing's test

### the vim cheat sheet

### solution manual for Bayesian Essentials with R

The solution manual to our *Bayesian Essentials with R* has just been arXived. If I link this completion with the publication date of the book itself, it sure took an unreasonable time to come out, and sadly with no obvious reason or justification for the delay… Given the large overlap with the solution manual of the previous edition, *Bayesian Core*, this version should have been completed much, much earlier but, paradoxically if in line with the lengthy completion of the book itself, this previous manual is one of the causes for the delay, as we thought the overlap allowed self-study readers to check some of the exercises. Prodded by Hannah Bracken from Springer-Verlag, and unable to hire an assistant for this task, I eventually decided to spend the few days required to clean up this solution manual, with the unintentional help of my sorry excuse for an Internet provider, who accidentally cut my home connection, for a whole week so far…!

In the course of writing solutions, I stumbled upon one inexplicably worded exercise about the Lehmer-Schur algorithm for testing stationarity, an exercise that I had to rewrite from scratch. Apologies to any reader of *Bayesian Essentials with R* getting stuck on that exercise!!!
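For readers wondering what the exercise is about: the Lehmer-Schur (or Schur-Cohn) algorithm tests whether all roots of the lag polynomial of an AR(p) model lie outside the unit circle, which is the stationarity condition. A quick numerical sketch of that condition (finding the roots directly with numpy rather than running the recursion itself):

```python
import numpy as np

# An AR(p) process with lag polynomial
#   phi(z) = 1 - phi_1 z - ... - phi_p z^p
# is stationary iff all roots of phi lie outside the unit circle.

def is_stationary(phis):
    """phis = [phi_1, ..., phi_p] of an AR(p) model."""
    # numpy.roots expects coefficients from highest degree to constant:
    # -phi_p, ..., -phi_1, 1
    coeffs = np.concatenate(([-p for p in phis[::-1]], [1.0]))
    roots = np.roots(coeffs)
    return bool(np.all(np.abs(roots) > 1.0))

print(is_stationary([0.5]))        # AR(1) with |phi| < 1: stationary
print(is_stationary([1.1]))        # AR(1) with |phi| > 1: not stationary
print(is_stationary([0.5, 0.3]))   # an AR(2) example
```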

Filed under: Books, Kids, Statistics, University life Tagged: Bayesian Core, Bayesian Essentials with R, lag polynomial, Lehmer-Schur algorithm, solution manual, Springer-Verlag, stationarity

### Turing’s Bayesian contributions

**F**ollowing The Imitation Game, the recent movie about Alan Turing played by Benedict “Sherlock” Cumberbatch, being aired in French theatres, one of my colleagues in Dauphine asked me about the Bayesian contributions of Turing. I first tried to check in Sharon McGrayne‘s book, but realised it had vanished from my bookshelves, presumably lent to someone a while ago. *(Please return it at your earliest convenience!)* So I told him about the Bayesian principle of updating priors with data and prior probabilities with likelihood evidence in the code-breaking algorithms and ultimately machines at Bletchley Park… I could not go much further than that and hence went checking on the Internet for more fodder.

*“Turing was one of the independent inventors of sequential analysis for which he naturally made use of the logarithm of the Bayes factor.” (p.393)*

I came upon a few interesting entries but the most amazing one was a 1979 note by I.J. Good (assistant of Turing during the War) published in *Biometrika*, retracing the contributions of Alan Mathison Turing during the War. From those few pages, it emerges that Turing’s statistical ideas revolved around the Bayes factor, which Turing used “without the qualification `Bayes’.” (p.393) He also introduced the notion of the ban as a unit for the weight of evidence, in connection with the town of Banbury (UK), where specially formatted sheets of paper were printed “for carrying out an important classified process called Banburismus” (p.394). Which shows that even in 1979, Good did not dare to get into the details of Turing’s work during the War… And explains why he was testing one simple statistical hypothesis against another. Good also credits Turing for the expected weight of evidence, which is another name for the Kullback-Leibler divergence and for Shannon’s information; Turing would visit Shannon in the U.S. after the War. In the final sections of the note, Turing is also associated with Gini’s index, the estimation of the number of species (processed by Good from Turing’s suggestion in a 1953 Biometrika paper, that is, prior to Turing’s suicide. In fact, Good states in this paper that “a very large part of the credit for the present paper should be given to [Turing]”, p.237), and empirical Bayes.
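To make these notions concrete, a small numerical illustration (notation and numbers mine, not Good’s): the weight of evidence is the log-Bayes factor, measured in bans when using base-10 logarithms, and its expectation under the alternative is the Kullback-Leibler divergence:

```python
import numpy as np

# Toy example: a three-valued observation with distributions under
# two simple hypotheses (made-up numbers).
p1 = np.array([0.5, 0.3, 0.2])   # P(x | H1)
p0 = np.array([0.2, 0.3, 0.5])   # P(x | H0)

# Weight of evidence for H1 over H0 per outcome, in bans (base-10 log
# of the Bayes factor; a tenth of a ban is the "deciban" of Bletchley Park).
weight_in_bans = np.log10(p1 / p0)

# Its expectation under H1 is KL(p1 || p0), here expressed in bans.
expected_weight = np.sum(p1 * weight_in_bans)

# The same divergence in nats, for comparison.
kl_nats = np.sum(p1 * np.log(p1 / p0))

print(expected_weight, kl_nats / np.log(10))
```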

Filed under: Books, Kids, pictures, Running, Statistics, University life Tagged: Alan Turing, Banbury, Biometrika, Bletchley Park, Cryptonomicon, England, Enigma code machine, I.J. Good, Kullback-Leibler divergence, missing species problem, Shannon's information, statistical evidence, WW II

### Statistics done wrong [book review]

no starch press (!) sent me the pdf version of this incoming book, *Statistics done wrong*, by Alex Reinhart, towards writing a book review for CHANCE, and I read it over two flights, one from Montpellier to Paris last week, and one from Paris to B’ham this morning. The book is due to appear on March 16. It expands on a still-active website developed by Reinhart. (Discussed a year or so ago on Andrew’s blog, mostly in the comments, witness Andrew’s comment below.) Reinhart is, incidentally or not, a PhD candidate in statistics at Carnegie Mellon University. After apparently a rather substantial undergraduate foray into physics. Quite an unusual level of maturity and perspective for a PhD student..!

*“It’s hard for me to evaluate because I am so close to the material. But on first glance it looks pretty reasonable to me.” A. Gelman*

Overall, I found myself enjoying reading the book, even though I found the overall picture of the infinitely many misuses of statistics rather grim and a recipe for despairing of ever setting things straight..! Somehow, this is an anti-textbook, in that it warns about many ways of applying the right statistical technique in the wrong setting, without ever describing those statistical techniques. Actually without using a single maths equation. Which should be a reason good enough for me to let all hell break loose on that book! But, no, not really, I felt no compunction about agreeing with Reinhart’s warnings, and if you have been reading Andrew’s blog for a while you should feel the same…

*“Then again for a symptom like spontaneous human combustion you might get excited about any improvement.” A. Reinhart (p.13)*

Maybe the limitation in the exercise is that statistics appears so fraught with dangers of over-interpretation and false positives, and everyone (except physicists!) is so bound to make such invalidated leaps in conclusion, willingly or not, that it sounds like the statistical side of Gödel’s impossibility theorem! Further, the book moves from recommendations at the individual level, i.e., on how one should conduct an experiment and separate data for hypothesis building from data for hypothesis testing, to a universal criticism of the poor standards of scientific publishing and the unavailability of most datasets and codes. Hence calling for universal reproducibility protocols that reminded me of the directions explored in this recent book I reviewed on that topic. (The one the rogue bird did not like.) It may be missing on the bright side of things, for instance the wonderful possibility of using statistical models to produce simulated datasets that allow for an evaluation of the performances of a given procedure in an ideal setting. Which would have helped the increasingly depressed reader in finding ways of checking how wrong things could get..! But also on the dark side, as it does not say much about the fact that a statistical model is most presumably wrong. (Maybe a physicist’s idiosyncrasy!) There is a chapter entitled Model Abuse, but all it does is criticise stepwise regression and somehow botch the description of Simpson’s paradox.
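The individual-level recommendation about separating data for hypothesis building from data for hypothesis testing can be sketched as follows (a toy simulation of my own, not an example from the book):

```python
import numpy as np

# Sketch: explore one half of the data to form a hypothesis, then test
# it on the untouched held-out half, so that the final test statistic
# is not contaminated by the exploration.

rng = np.random.default_rng(42)
data = rng.normal(loc=0.2, scale=1.0, size=200)  # simulated measurements
explore, confirm = data[:100], data[100:]

# Exploration phase: the mean looks positive, suggesting a hypothesis...
print("exploratory mean:", explore.mean())

# Confirmation phase: a one-sample z-like statistic on held-out data only.
n = len(confirm)
z = confirm.mean() / (confirm.std(ddof=1) / np.sqrt(n))
print("z on held-out half:", z)
```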

*“You can likely get good advice in exchange for some chocolates or a beer or perhaps coauthorship on your next paper.” A. Reinhart (p.127)*

The final pages are however quite redeeming in that they acknowledge that scientists from other fields cannot afford a solid enough training in statistics and hence should hire statisticians as consultants for the data collection, analysis and interpretation of their experiments. A most reasonable recommendation!

Filed under: Books, Kids, pictures, Statistics, University life Tagged: Andrew Gelman, book reviews, Carnegie Mellon University, CHANCE, consulting, no starch press, p-values, physics, Statistics done wrong, textbook

### at The X

Filed under: pictures, Running, Travel, Wines Tagged: England, Italian wines, Kenilworth, Michelin starred restaurant, Primitivo, Puglia, The Cross, University of Warwick

### Sherlock [#3]

**A**fter watching the first two seasons of the BBC TV series Sherlock while at the hospital, I found myself looking forward to further adventures of Holmes and Watson and eventually “bought” the third season. And watched it over the past weekends. I liked it very much, as this new season moved away from the sheer depiction of Sherlock’s amazing powers to a quite ironic and self-parodic story, well in tune with a third season where the audience is now utterly familiar with the main characters. They all put on weight (mostly figuratively!), from Sherlock’s acknowledgement of his psychological shortcomings, to Mrs. Hudson’s revealing her drug-trafficking past and expressing her dislike of Mycroft, to John Watson’s engagement and acceptance of Sherlock’s idiosyncrasies, making him the central character of the series as a sort of fatherly figure. Some new characters are also terrific, including Mary Morstan and the new archvillain, C.A. Magnussen. Paradoxically, this makes the detective part of the stories secondary, which is all for the best as, in my opinion, the plots are rather weak and the resolutions hardly rely on high intellectual powers, albeit always surprising. More sleuthing in the new season would be most welcome! As an aside, the wedding venue sounded somewhat familiar to me, until I realised it was Goldney Hall, where the recent workshops I attended in Bristol took place.

Filed under: Books Tagged: amazon associates, BBC, Bristol, Conan Doyle, Goldney Hall, Sherlock Holmes, TV series

### Mort de Terry Pratchett (1948-2015)

Filed under: Books, Kids, pictures Tagged: Death, Discworld, fantasy, Josh Kirby, Mort, Noli Timere Messorem, Terry Pratchett, The Reaper

### Hamiltonian ABC

**O**n Monday, Ed Meeds, Robert Leenders, and Max Welling (from Amsterdam) arXived a paper entitled Hamiltonian ABC. Before looking at the paper in any detail, I got puzzled by this association of antagonistic terms, since ABC is intended for complex and mostly intractable likelihoods, while Hamiltonian Monte Carlo requires a lot from the target, in order to compute gradients and Hessians… *[Warning: some graphs on pages 13-14 may be harmful to your printer!]*

Somewhat obviously (ex-post!), the paper suggests using Hamiltonian dynamics on ABC approximations of the likelihood. They compare a Gaussian kernel version

with the synthetic Gaussian likelihood version of Wood (2010)

where both mean and variance are estimated from the simulated data. If ε is taken as an external quantity and driven to zero, the second approach is much more stable. But… ε is never driven to zero in ABC, or fixed at ε=0.37: It is instead considered as a kernel bandwidth and hence estimated from the simulated data. Hence ε is commensurable with σ(θ). And this makes me wonder at the relevance of the conclusion that synthetic is better than kernel for Hamiltonian ABC. More globally, I wonder at the relevance of better simulating from a still approximate target when the true goal is to better approximate the genuine posterior.
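In my own notation, a reconstruction of the standard forms of these two approximations (assuming S pseudo-datasets x₁(θ),…,x_S(θ) simulated from the model, with μ̂(θ) and σ̂²(θ) their empirical mean and variance):

```latex
\hat{\ell}_K(\theta) \;=\; \frac{1}{S}\sum_{s=1}^{S}\mathcal{N}\big(y \,;\, x_s(\theta),\, \varepsilon^{2}\big)
\qquad\text{versus}\qquad
\hat{\ell}_W(\theta) \;=\; \mathcal{N}\big(y \,;\, \hat{\mu}(\theta),\, \hat{\sigma}^{2}(\theta)+\varepsilon^{2}\big)
```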

Some of the paper covers separate issues, like handling gradients by finite differences à la Spall *[if you can afford it!]* and incorporating the random generator as part of the Markov chain. And using S *common* random numbers in computing the gradients for all values of θ. (Although I am not certain all random generators can be represented as a deterministic transform of a parameter θ and of a fixed number of random uniforms. But the authors may consider a random number of random uniforms when they represent their random generators as deterministic transforms of a parameter θ and of the random seed. I am also uncertain about the distinction between common, sticky, and persistent random numbers!)
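The point of common random numbers deserves a tiny illustration (toy simulator and numbers mine, not the paper’s): reusing the same uniforms at θ+h and θ−h makes most of the Monte Carlo noise cancel in the finite difference:

```python
import numpy as np

# Finite-difference gradient of a simulator's expected output, with or
# without common random numbers (CRN). The toy simulator below is a
# deterministic transform of the parameter theta and uniforms u, with
# expectation equal to theta, so the true gradient is 1.

def simulator(theta, u):
    return theta + np.sqrt(theta) * (u - 0.5)

def fd_gradient(theta, h=1e-3, S=1000, common=True):
    rng = np.random.default_rng(0)
    u_plus = rng.uniform(size=S)
    u_minus = u_plus if common else rng.uniform(size=S)
    f_plus = simulator(theta + h, u_plus).mean()
    f_minus = simulator(theta - h, u_minus).mean()
    return (f_plus - f_minus) / (2 * h)

# With CRN the estimate is close to 1; with independent uniforms the
# noise is amplified by the 1/(2h) factor and the estimate is useless.
print(fd_gradient(2.0, common=True))
print(fd_gradient(2.0, common=False))
```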

Filed under: Books, pictures, Statistics, University life Tagged: ABC, Amsterdam, Hamiltonian Monte Carlo, Markov chain, Monte Carlo Statistical Methods, pseudo-random generator, random seed, synthetic likelihood

### eliminating an important obstacle to creative thinking: statistics…

*“We hope and anticipate that banning the NHSTP will have the effect of increasing the quality of submitted manuscripts by liberating authors from the stultified structure of NHSTP thinking thereby eliminating an important obstacle to creative thinking.”*

**A**bout a month ago, David Trafimow and Michael Marks, the current editors of the journal *Basic and Applied Social Psychology* published an editorial banning all null hypothesis significance testing procedures (acronym-ed into the ugly NHSTP which sounds like a particularly nasty venereal disease!) from papers published by the journal. My first reaction was “Great! This will bring more substance to the papers by preventing significance fishing and undisclosed multiple testing! Power to the statisticians!” However, after reading the said editorial, I realised it was inspired by a nihilistic anti-statistical stance, backed by an apparent lack of understanding of the nature of statistical inference, rather than a call for saner and safer statistical practice. The editors most clearly state that inferential statistical procedures are no longer needed to publish in the journal, only “strong descriptive statistics”. Maybe to keep in tune with the “Basic” in the name of the journal!

*“In the NHSTP, the problem is in traversing the distance from the probability of the finding, given the null hypothesis, to the probability of the null hypothesis, given the finding. Regarding confidence intervals, the problem is that, for example, a 95% confidence interval does not indicate that the parameter of interest has a 95% probability of being within the interval.”*

The above quote could be a motivation for a Bayesian approach to the testing problem, a revolutionary stance for journal editors!, but it only illustrates that the editors wish for a procedure that would eliminate the uncertainty inherent to statistical inference, i.e., to decision making under… erm, uncertainty: *“The state of the art remains uncertain.”* To fail to separate significance from certainty is fairly appalling from an epistemological perspective and should be a case for impeachment, were any such thing to exist for a journal board. This means the editors cannot distinguish data from parameter and model from reality! Even more fundamentally, to bar statistical procedures from being used in a scientific study is nothing short of reactionary. While encouraging the inclusion of data is a step forward, restricting the validation or in-validation of hypotheses to gazing at descriptive statistics is many steps backward and completely jeopardizes the academic reputation of the journal, whose editorial may end up being its last quoted paper. Is deconstruction now reaching psychology journals?! To quote from a critic of this approach, “Thus, the general weaknesses of the deconstructive enterprise become self-justifying. With such an approach I am indeed not sympathetic.” (Searle, 1983).
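The gap the editors point at, between the probability of the finding given the null and the probability of the null given the finding, is real enough, as a toy base-rate calculation (numbers mine) shows:

```python
# Toy illustration of the distance between P(significant | H0) and
# P(H0 | significant): even with a 5% test, if true effects are rare
# and power is modest, a large share of "significant" findings come
# from the null. All numbers are hypothetical.

prior_h1 = 0.1            # 10% of tested hypotheses are true effects
alpha = 0.05              # P(significant | H0)
power = 0.5               # P(significant | H1)

p_sig = power * prior_h1 + alpha * (1 - prior_h1)
p_h0_given_sig = alpha * (1 - prior_h1) / p_sig
print(p_h0_given_sig)     # close to one half, far from the nominal 5%
```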

*“The usual problem with Bayesian procedures is that they depend on some sort of Laplacian assumption to generate numbers where none exist (…) With respect to Bayesian procedures, we reserve the right to make case-by-case judgments, and thus Bayesian procedures are neither required nor banned from BASP.”*

The section on Bayesian approaches is trying to be sympathetic to the Bayesian paradigm but again reflects the poor understanding of the authors. By “Laplacian assumption”, they mean Laplace’s Principle of Indifference, i.e., the use of uniform priors, which has not been seriously considered a sound principle since the mid-1930s. Except maybe in recent papers of Trafimow. I also love the notion of “generat[ing] numbers where none exist”, as if the prior distribution had to be grounded in some physical reality! Although it is meaningless, it has some poetic value… (Plus, bringing Popper and Fisher to the rescue sounds like shooting Bayes himself in the foot.) At least, the fact that the editors will consider Bayesian papers on a case-by-case basis indicates they may engage in a subjective Bayesian analysis of each paper, rather than using an automated p-value against the 100% rejection bound!

*[Note: this entry was suggested by Alexandra Schmidt, current ISBA President, towards an incoming column on this decision of Basic and Applied Social Psychology for the ISBA Bulletin.]*

Filed under: Books, Kids, Statistics, University life Tagged: Basic and Applied Social Psychology, Bayesian hypothesis testing, confidence intervals, editor, ISBA, ISBA Bulletin, Karl Popper, NHSTP, null hypothesis, p-values, Pierre Simon de Laplace, Principle of Indifference, Thomas Bayes, xkcd

### Edmond Malinvaud (1923-2015)

**T**he statistician, econometrician, macro- and micro-economist Edmond Malinvaud died on Saturday, March 7. He had been director of my alma mater ENSAE (1962–1966), directeur de la Prévision at the Finance Department (1972–1974), director of INSEE (1974–1987), and Professor at the Collège de France (1988–1993). While primarily an economist, with his theories of disequilibrium and unemployment, reflected in his famous book Théorie macro-économique (1981) that he taught us at ENSAE, he was also instrumental in shaping the French econometrics school, see his equally famous Statistical Methods of Econometrics (1970), and in the reorganisation of INSEE as the post-war State census and economic planning tool. He was also an honorary Fellow of the Royal Statistical Society and the 1981 president of the International Statistical Institute. Edmond Malinvaud studied under Maurice Allais, Nobel Prize in economics in 1988, and was himself considered a potential Nobel for several years. My personal memories of him at ENSAE and CREST are of a very clear teacher and of a kind and considerate man, with the reserve and style of a now-bygone era…

Filed under: Books, Kids, Statistics, University life Tagged: Collège de France, CREST, disequilibrium, econometrics, Edmond Malinvaud, ENSAE, INSEE, macroeconomics, Maurice Allais

### Compound Poisson Processes, Latent Shrinkage Priors and Bayesian Nonconvex Penalization

**Zhihua Zhang**, **Jin Li**.

**Source:** Bayesian Analysis, Volume 10, Number 2, 247–274.

**Abstract:**

In this paper we discuss Bayesian nonconvex penalization for sparse learning problems. We explore a nonparametric formulation for latent shrinkage parameters using subordinators which are one-dimensional Lévy processes. We particularly study a family of continuous compound Poisson subordinators and a family of discrete compound Poisson subordinators. We exemplify four specific subordinators: Gamma, Poisson, negative binomial and squared Bessel subordinators. The Laplace exponents of the subordinators are Bernstein functions, so they can be used as sparsity-inducing nonconvex penalty functions. We exploit these subordinators in regression problems, yielding a hierarchical model with multiple regularization parameters. We devise ECME (Expectation/Conditional Maximization Either) algorithms to simultaneously estimate regression coefficients and regularization parameters. The empirical evaluation of simulated data shows that our approach is feasible and effective in high-dimensional data analysis.

### Dirichlet Process Hidden Markov Multiple Change-point Model

**Stanley I. M. Ko**, **Terence T. L. Chong**, **Pulak Ghosh**.

**Source:** Bayesian Analysis, Volume 10, Number 2, 275–296.

**Abstract:**

This paper proposes a new Bayesian multiple change-point model which is based on the hidden Markov approach. The Dirichlet process hidden Markov model does not require the specification of the number of change-points a priori. Hence our model is robust to model specification in contrast to the fully parametric Bayesian model. We propose a general Markov chain Monte Carlo algorithm which only needs to sample the states around change-points. Simulations for a normal mean-shift model with known and unknown variance demonstrate advantages of our approach. Two applications, namely the coal-mining disaster data and the real United States Gross Domestic Product growth, are provided. We detect a single change-point for both the disaster data and US GDP growth. All the change-point locations and posterior inferences of the two applications are in line with existing methods.

### Two-sample Bayesian Nonparametric Hypothesis Testing

**Chris C. Holmes**, **François Caron**, **Jim E. Griffin**, **David A. Stephens**.

**Source:** Bayesian Analysis, Volume 10, Number 2, 297–320.

**Abstract:**

In this article we describe Bayesian nonparametric procedures for two-sample hypothesis testing. Namely, given two sets of samples $\mathbf{y}^{(1)}\stackrel{\text{iid}}{\sim}F^{(1)}$ and $\mathbf{y}^{(2)}\stackrel{\text{iid}}{\sim}F^{(2)}$, with $F^{(1)},F^{(2)}$ unknown, we wish to evaluate the evidence for the null hypothesis $H_{0}:F^{(1)}\equiv F^{(2)}$ versus the alternative $H_{1}:F^{(1)}\neq F^{(2)}$. Our method is based upon a nonparametric Pólya tree prior centered either subjectively or using an empirical procedure. We show that the Pólya tree prior leads to an analytic expression for the marginal likelihood under the two hypotheses and hence an explicit measure of the probability of the null $\mathrm{Pr}(H_{0}\mid\{\mathbf{y}^{(1)},\mathbf{y}^{(2)}\})$