## Xian's Og

### June 7, 1944

*[I wrote this post a few years ago, but the 70th anniversary of D-Day brought back those memories and I thought it worth re-posting...]*

**T**his is the day I almost got un-born, not that I was born at the time (!), but my mother, then almost seven, came close to dying under the Allied bombs that wiped Saint-Lô (Manche, western France) off the map that night, in conjunction with the D-Day landings on the nearby beaches of Utah and Omaha. (The city was supposed to be taken by the end of June 6, but Allied troops only entered Saint-Lô on July 19.) Most of the town was destroyed under 60,000 pounds of bombs, in an attempt by the Allied forces to cut off German reinforcements coming from Brittany towards the beaches. (Saint-Lô earned the nickname of “capital of the ruins” from Samuel Beckett after this bombing, and it took many years to rebuild.) My grandparents and their three daughters barely got out of their house before it collapsed and had to flee the burning Saint-Lô with a single wheelbarrow to carry two suitcases and the three girls. Several times my grandfather hid them under his leather jacket, as power lines were collapsing around them…

**T**hey eventually (and obviously) made it out of Saint-Lô alive, only to be rounded up with other refugees by German troops who parked them in a field, most likely to be used as hostages. Taking advantage of the night, my grandfather once again managed to get his family away, crawling under the barriers on the darkest side of the field, and they then reached (on foot) a secluded village in the countryside where my great-grandmother was living at the time. Since I was a child, I have heard this story so many times from my mother that it is almost pictured in my brain, as if I had somehow seen the “movie”.

Filed under: Kids Tagged: Allied troops, bombing, capital of the ruins, D Day, Saint-Lô, WW II

### computational methods for statistical mechanics [day #4]

**M**y last day at this ICMS workshop on molecular simulation started [with a double loop of Arthur's Seat, thankfully avoiding the heavy rains of the previous night, and then] with Chris Chipot's masterful introduction to molecular simulation for proteins, with impressive slides and simulation movies, even though I could not follow the details well enough to really understand the simulation challenges therein, just catching a few connections with earlier talks. A typical example of a cross-disciplinary gap, where the other discipline always seems to be stressing the “wrong” aspects. Although this is perfectly unrealistic, it would help immensely to prepare talks in pairs for such interdisciplinary workshops! Then Gersende Fort presented results about convergence and efficiency for the Wang-Landau algorithm. The idea is to find the optimal rate for updating the weights of the elements of the partition towards reaching the flat histogram in minimal time, showing massive gains on toy examples. The next talk went back to molecular biology with Jérôme Hénin's presentation on improved adaptive biased sampling, with an exciting notion of orthogonality aiming at finding the slowest directions in the target and concentrating the computational effort there. He also discussed the tension between long single simulations and short repeated ones, echoing a long-running debate in the MCMC community. (He also had a slide with a picture of my first computer, a 1983 Apple IIe!) Then Antonietta Mira gave a broad perspective on delayed rejection and zero-variance estimates, with impressive variance reductions (although some physicists then asked for reductions of order 10¹⁰!). Johannes Zimmer gave a beautiful maths talk on the connection between particle and diffusion limits (PDEs), Wasserstein geometry, and large deviations. (I did not get most of the talk, but it was nonetheless beautiful!) Bert Kappen concluded the day (and the workshop for me) with a nice introduction to control theory, making the connection between optimal control and optimal importance sampling. Which made me idly think of the following problem: what if control cannot be completely… controlled and hence involves a stochastic part? Presumably of little interest, as the control would then be on the parameters of the distribution of the control.
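For the record, the Wang-Landau scheme Gersende Fort analysed can be caricatured in a few lines: penalise the partition element currently visited until the visit histogram flattens, then decrease the update rate. A minimal sketch on a toy discrete target, with an ad-hoc flatness rule and schedule of my own choosing, not the optimal rate from her talk:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy target on K discrete states: pi(x) proportional to exp(-E(x))
K = 20
energy = 0.1 * (np.arange(K) - 10.0) ** 2

log_w = np.zeros(K)   # running log-weights on the partition elements
visits = np.zeros(K)  # visit histogram since the last rate update
gamma = 1.0           # update rate, halved whenever the histogram is flat
x = 0

for t in range(100_000):
    # Metropolis move targeting the reweighted density pi(x) * exp(-log_w(x))
    y = (x + rng.choice((-1, 1))) % K
    log_alpha = -(energy[y] - energy[x]) - (log_w[y] - log_w[x])
    if np.log(rng.random()) < log_alpha:
        x = y
    log_w[x] += gamma   # penalise the visited state
    visits[x] += 1
    # Ad-hoc flat-histogram criterion: halve the rate and reset the histogram
    if (t + 1) % 5000 == 0 and visits.min() > 0.8 * visits.mean():
        gamma /= 2.0
        visits[:] = 0.0

# At convergence, log_w tracks log pi up to a constant, i.e. -E(x) + c
```

The whole point of the convergence results mentioned above is how fast gamma should decrease; the naive halving rule here is only for illustration.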

*“The alanine dipeptide is the fruit fly of molecular simulation.”*

**T**he example of this alanine dipeptide molecule was so recurrent during the talks that it justified the above quote by Michael Allen. Not that I am any more proficient now about the point of studying this protein or of using it as a benchmark. Or about the specifics of the challenges of molecular dynamics simulation. Not a criticism of the ICMS workshop obviously, but rather of my congenital difficulty with continuous-time processes!!! So I do not return from Edinburgh with a new collaborative research project in molecular dynamics (if with more traditional prospects), albeit with the perception that a minimal effort could bring me through the vocabulary barrier. And maybe lead me to consider ABC ventures in those (new) domains. (Although I fear my talk on ABC did not impact most of the audience!)

Filed under: Mountains, pictures, Running, Statistics, Travel, University life Tagged: ABC, Arthur's Seat, computational physics, control theory, control variate, delayed rejection sampling, Edinburgh, Highlands, ICMS, Langevin diffusion, large deviation, MCMC, molecular simulation, Monte Carlo Statistical Methods, Scotland, Wasserstein distance, zero variance importance sampling

### Edinburgh snapshot (#3)

### computational methods for statistical mechanics [day #3]

**T**he third day [morn] at our ICMS workshop was dedicated to path sampling. And rare events. Much more into [my taste] Monte Carlo territory. The first talk by Rosalind Allen looked at reweighting trajectories that are not at equilibrium or are missing the Boltzmann [normalizing] constant. Although the derivation against a calibration parameter looked like the primary goal rather than a tool for constant estimation. Again papers in *J. Chem. Phys.*! And a potential link with ABC raised by Antonietta Mira… Then Jonathan Weare discussed stratification. With a nice trick of expressing the normalising constants of the different terms in the partition as solution(s) of a Markov system

Because the stochastic matrix **M** is easier (?) to approximate. Valleau’s and Torrie’s umbrella sampling was a constant reference in this morning of talks. Arnaud Guyader’s talk was a continuation of Tony Lelièvre’s introduction, which helped a lot in improving my understanding of the concepts. Rephrasing things in more statistical terms. Like the distinction between equilibrium and paths. Or bias being importance sampling. Frédéric Cérou then gave a sort of second part to Arnaud’s talk, on importance splitting algorithms. Presenting an algorithm for simulating rare events that sounded like nested sampling in reverse, where the goal is to get *down* the target rather than *up*. Pushing particles away from a current level of the target function with probability ½. Michela Ottobre completed the series with an entry into diffusion limits in the Roberts-Gelman-Gilks spirit when the Markov chain is not yet stationary. In the transient phase, that is.
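The importance-splitting idea can be illustrated on the simplest rare event, a Gaussian tail probability. This is a generic fixed-level splitting sketch, where the levels, particle number and Metropolis refreshment kernel are all my own arbitrary choices, not those of the talks:

```python
import numpy as np

rng = np.random.default_rng(1)

def splitting_estimate(levels, n=2000, mcmc_steps=50, sigma=0.3):
    """Fixed-level importance splitting for P(X > levels[-1]), X ~ N(0,1).

    At each level, surviving particles are resampled and refreshed with a
    Metropolis kernel restricted to the region above that level; the rare
    probability is the product of conditional survival fractions.
    """
    x = rng.standard_normal(n)
    prob = 1.0
    for L in levels:
        alive = x[x > L]
        if alive.size == 0:
            return 0.0
        prob *= alive.size / n          # conditional survival fraction
        x = rng.choice(alive, size=n, replace=True)
        for _ in range(mcmc_steps):     # moves targeting N(0,1) on (L, inf)
            prop = x + sigma * rng.standard_normal(n)
            log_a = -(prop**2 - x**2) / 2.0
            accept = (prop > L) & (np.log(rng.random(n)) < log_a)
            x = np.where(accept, prop, x)
    return prob

est = splitting_estimate(levels=[1.0, 2.0, 3.0])
# est should sit near the exact tail probability P(N(0,1) > 3) ≈ 1.35e-3
```

Running nested sampling "in reverse" amounts to the same decomposition of a small probability into moderate conditional ones.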

Filed under: Mountains, pictures, Running, Statistics, Travel, University life Tagged: ABC, Arthur's Seat, computational physics, Edinburgh, extreme value theory, Highlands, ICMS, MCMC, molecular simulation, Monte Carlo Statistical Methods, NIPS 2014, path sampling, rare events, Scotland, stratification

### Edinburgh snapshot (#2)

### computational methods for statistical mechanics [day #2]

**T**he last “tutorial” talk at this ICMS workshop [“at the interface between mathematical statistics and molecular simulation”] was given by Tony Lelièvre, on adaptive bias schemes in Langevin algorithms and on the parallel replica algorithm. This was both very interesting, because of the potential for connections with my “brand” of MCMC techniques, and rather frustrating, as I felt the intuition behind the physical concepts like free energy and metastability was almost within my reach! The most accessible part of Tony’s talk was the illustration of the concepts through a mixture posterior example. An example I need to (re)read further to grasp the general idea. (And maybe the book on Free Energy Computations Tony wrote with Mathias Rousset and Gabriel Stoltz.) A definitely worthwhile talk that I hope will get posted online by ICMS. The other talks of the day were mostly of a free-energy nature, some using an optimised bias in the Langevin diffusion (except for Pierre Jacob, who presented his non-negative unbiased estimation impossibility result).
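To fix intuitions about adaptive bias in a Langevin scheme, here is a metadynamics-flavoured caricature in one dimension: Gaussian hills are deposited along the trajectory, progressively flattening the well the diffusion is trapped in until it crosses the barrier. All constants are arbitrary and this is only a sketch of the general flavour, not the adaptive biasing force method from the talk:

```python
import numpy as np

rng = np.random.default_rng(2)

# Double-well potential (metastable toy landscape) and its gradient
U = lambda x: (x**2 - 1.0) ** 2
dU = lambda x: 4.0 * x * (x**2 - 1.0)

h, w = 0.05, 0.2          # height and width of the deposited Gaussian hills
dt, beta = 1e-3, 5.0      # Euler step and inverse temperature
centers = []              # locations of deposited hills (the adaptive bias)

def bias_grad(x):
    """Gradient of the accumulated bias sum_c h * exp(-(x-c)^2 / (2 w^2))."""
    if not centers:
        return 0.0
    c = np.array(centers)
    return float(np.sum(-h * (x - c) / w**2 * np.exp(-(x - c) ** 2 / (2 * w**2))))

x, crossings = -1.0, 0
for t in range(20_000):
    # Overdamped Langevin step on the biased potential U + bias
    drift = -(dU(x) + bias_grad(x))
    x_new = x + dt * drift + np.sqrt(2 * dt / beta) * rng.standard_normal()
    if x * x_new < 0:     # crossed the barrier at x = 0
        crossings += 1
    x = x_new
    if t % 50 == 0:
        centers.append(x)  # deposit a hill, flattening the visited region
```

Without the bias, escaping a well of depth 5/β is a rare event at this temperature; with it, barrier crossings occur within the run.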

Filed under: Mountains, pictures, Running, Statistics, Travel, University life Tagged: ABC, Arthur's Seat, computational physics, Edinburgh, free energy, ICMS, MCMC, molecular simulation, Monte Carlo Statistical Methods, NIPS 2014, Scotland, unbiasedness

### Edinburgh snapshot (#1)

### computational methods for statistical mechanics [day #1]

**T**he first talks of the day at this ICMS workshop [“at the interface between mathematical statistics and molecular simulation”] were actually lectures introducing molecular simulation to statisticians, by Michael Allen from Warwick, and computational statistics to physicists, by Omiros Papaspiliopoulos. Allen’s lecture was quite pedagogical, even though I had to quiz Wikipedia for physics terms and notions. Like a force being the (negative) gradient of a potential function. He gave a physical meaning to Langevin’s equation. As well as references from the *Journal of Chemical Physics* that were more recent than 1953. He mentioned alternatives to Langevin’s equation too, and I idly wondered at the possibility of using those alternatives as other tools for improved MCMC simulation. Although introducing friction may not be the most promising way to speed things up… He later introduced what statisticians call the Langevin algorithm (MALA) as smart Monte Carlo (Rossky et al., …1978!!!). Recovering Hamiltonian and hybrid Monte Carlo algorithms as a fusion of molecular dynamics, the Verlet algorithm, and a Metropolis acceptance step! As well as reminding us of the physics roots of umbrella sampling and the Wang-Landau algorithm.
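For readers on the statistics side, the Langevin algorithm (MALA) fits in a dozen lines: propose a discretised Langevin step driven by the gradient of the log-target, then correct with a Metropolis acceptance step. A self-contained sketch on a toy Gaussian target (step size and target are my own choices):

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy target: standard bivariate Gaussian
def log_pi(x):
    return -0.5 * np.dot(x, x)

def grad_log_pi(x):
    return -x

def mala(n_iter=5000, tau=0.5, d=2):
    """MALA: x' = x + (tau/2) grad log pi(x) + sqrt(tau) * noise,
    accepted with a Metropolis correction for the asymmetric proposal."""
    x = np.zeros(d)
    chain, accepts = [], 0
    for _ in range(n_iter):
        mean_fwd = x + 0.5 * tau * grad_log_pi(x)
        prop = mean_fwd + np.sqrt(tau) * rng.standard_normal(d)
        mean_bwd = prop + 0.5 * tau * grad_log_pi(prop)
        log_q_fwd = -np.sum((prop - mean_fwd) ** 2) / (2 * tau)
        log_q_bwd = -np.sum((x - mean_bwd) ** 2) / (2 * tau)
        if np.log(rng.random()) < log_pi(prop) - log_pi(x) + log_q_bwd - log_q_fwd:
            x, accepts = prop, accepts + 1
        chain.append(x)
    return np.array(chain), accepts / n_iter

chain, rate = mala()
```

Dropping the accept/reject step and adding momentum leads to the Hamiltonian/hybrid Monte Carlo connection mentioned in the talk.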

**O**miros Papaspiliopoulos also gave a very pedagogical entry to the convergence of MCMC samplers, which focussed on the L² approach to convergence. This reminded me of the very first papers published on the convergence of the Gibbs sampler, like the ~~1990~~ 1992 JCGS paper by Schervish and Carlin. Or the ~~1991~~ 1996 Annals of Statistics paper by Amit. (Funny that I located both papers much earlier than when they actually appeared!) One surprising fact was that the convergence of all reversible ergodic kernels is necessarily geometric. There is no classification of kernels in this topology, the only ranking being through the respective spectral gaps. A good refresher for most of the audience, statisticians included.
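The spectral gap ranking can be made concrete on a finite state space, where reversibility makes the kernel similar to a symmetric matrix: the spectrum is real, the top eigenvalue is 1, and the L² distance to stationarity decays geometrically at rate 1 minus the gap. A small worked example with a Metropolis kernel of my own construction:

```python
import numpy as np

# Reversible Metropolis kernel on {0,...,K-1} targeting pi(x) ∝ exp(-E(x))
K = 10
E = 0.3 * np.arange(K)
pi = np.exp(-E)
pi /= pi.sum()

P = np.zeros((K, K))
for i in range(K):
    for j in (i - 1, i + 1):               # symmetric ±1 proposal
        if 0 <= j < K:
            P[i, j] = 0.5 * min(1.0, pi[j] / pi[i])
    P[i, i] = 1.0 - P[i].sum()             # rejection mass on the diagonal

# Reversibility: D P D^{-1} is symmetric for D = diag(sqrt(pi)),
# so the eigenvalues are real and can be computed stably.
D = np.diag(np.sqrt(pi))
S = D @ P @ np.linalg.inv(D)
eigs = np.sort(np.linalg.eigvalsh(S))[::-1]
gap = 1.0 - eigs[1]
# ||P^n(x, .) - pi||_{L^2} decays like (1 - gap)^n
```

The "no finer classification" point is visible here: any two such kernels are compared only through their respective gaps.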

**T**he following talks of Day 1 were by Christophe Andrieu, who kept with the spirit of a highly pedagogical entry, covering particle filters, SMC, particle Gibbs and pseudo-marginals, and who hit the right tone, I think, given the heterogeneous audience. And by Ben Leimkuhler, about particle simulation for very large molecular structures, closing the day by focussing on Langevin dynamics. What I understood from the talk was an improved entry into the resolution of some SPDEs, gaining two orders when compared with Euler-Maruyama. But I missed the meaning of the friction coefficient γ converging to infinity in the title…

Filed under: Mountains, pictures, Running, Statistics, Travel, University life Tagged: ABC, Arthur's Seat, computational physics, Edinburgh, Hamburg, Highlands, ICMS, MCMC, molecular simulation, Monte Carlo Statistical Methods, munroes, NIPS 2014, Scotland

### Ben Lawers, Perthshire

Filed under: Mountains, pictures, Running, Travel Tagged: An Stuc, Ben Lawers, munroes, Perthshire, Scotland, Highlands

### improved approximate-Bayesian model-choice method for estimating shared evolutionary history [reply from the author]

*[Here is a very kind and detailed reply from Jamie Oakes to the comments I made on his ABC paper a few days ago:]*

First of all, many thanks for your thorough review of my pre-print! It is very helpful and much appreciated. I just wanted to comment on a few things you address in your post.

I am a little confused about how my replacement of continuous uniform probability distributions with gamma distributions for priors on several parameters introduces a potentially crippling number of hyperparameters. Both uniform and gamma distributions have two parameters. So, the new model only has one additional hyperparameter compared to the original msBayes model: the concentration parameter on the Dirichlet process prior on divergence models. Also, the new model offers a uniform prior over divergence models (though I don’t recommend it).
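The Dirichlet process prior over divergence models admits the familiar Chinese restaurant process representation, which makes clear that the concentration parameter is the single extra hyperparameter and that it alone controls how many distinct divergence events are expected a priori. A toy illustration of mine (not the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(4)

def crp_partition(n_pairs, alpha):
    """Sample a partition of taxon pairs into divergence-time classes via the
    Chinese restaurant process; alpha is the concentration hyperparameter."""
    tables = []       # sizes of the divergence classes seated so far
    assignment = []
    for _ in range(n_pairs):
        # join an existing class with prob ∝ its size, open a new one ∝ alpha
        probs = np.array(tables + [alpha], dtype=float)
        probs /= probs.sum()
        k = rng.choice(len(probs), p=probs)
        if k == len(tables):
            tables.append(1)
        else:
            tables[k] += 1
        assignment.append(k)
    return assignment

# Small alpha concentrates on few divergence events, large alpha on many
few = [len(set(crp_partition(10, 0.1))) for _ in range(500)]
many = [len(set(crp_partition(10, 10.0))) for _ in range(500)]
```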

Your comment about there being no new ABC technique is 100% correct. The model is new, the ABC numerical machinery is not. Also, your intuition is correct, I do not use the divergence times to calculate summary statistics. I mention the divergence times in the description of the ABC algorithm with the hope of making it clear that the times are scaled (see Equation (12)) prior to the simulation of the data (from which the summary statistics are calculated). This scaling is simply to go from units proportional to time, to units that are proportional to the expected number of mutations. Clearly, my attempt at clarity only created unnecessary opacity. I’ll have to make some edits.

Regarding the reshuffling of the summary statistics calculated from different alignments of sequences, the statistics are not exchangeable. So, reshuffling them in a manner that is not consistent across all simulations and the observed data is not mathematically valid. Also, if elements are exchangeable, their order will not affect the likelihood (or the posterior, barring sampling error). Thus, if our goal is to approximate the likelihood, I would hope the reshuffling would also have little effect on the approximate posterior (otherwise my approximation is not so good?).

You are correct that my use of “bias” was not well defined in reference to the identity line of my plots of the estimated vs true probability of the one-divergence model. I think we can agree that, ideally (all assumptions being met), the estimated posterior probability of a model should estimate the probability that the model is correct. For large numbers of simulation replicates, the proportion of the replicates for which the one-divergence model is true will approximate the probability that the one-divergence model is correct. Thus, if the method has the desirable (albeit “frequentist”) behavior that the estimated posterior probability of the one-divergence model is an unbiased estimate of the probability that the one-divergence model is correct, the points should fall near the identity line. For example, let us say the method estimates a posterior probability of 0.90 for the one-divergence model for 1000 simulated datasets. If the method is accurately estimating the probability that the one-divergence model is the correct model, then the one-divergence model should be the true model for approximately 900 of the 1000 datasets. Any trend away from the identity line indicates the method is biased in the (frequentist) sense that it is not correctly estimating the probability that the one-divergence model is the correct model. I agree this measure of “bias” is frequentist in nature. However, it seems like a worthwhile goal for Bayesian model-choice methods to have good frequentist properties. If a method strongly deviates from the identity line, it is much more difficult to interpret the posterior probabilities that it estimates. Going back to my example of the posterior probability of 0.90 for 1000 replicates, I would be alarmed if the model was true in only 100 of the replicates.
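This calibration property can be checked by simulation in a toy case where the posterior probability is available in closed form (two Gaussian models with equal prior weights; my own illustration, not the paper's setting). Among replicates where the exact posterior probability of model 1 is near 0.9, model 1 is indeed the true model about 90% of the time:

```python
import numpy as np

rng = np.random.default_rng(5)

def norm_pdf(x, mu):
    return np.exp(-0.5 * (x - mu) ** 2) / np.sqrt(2 * np.pi)

# Generate replicates from the prior-predictive of two models:
# M1: x ~ N(0,1) and M2: x ~ N(1,1), each with prior weight 1/2
n = 200_000
true_is_m1 = rng.random(n) < 0.5
x = rng.standard_normal(n) + np.where(true_is_m1, 0.0, 1.0)

# Exact posterior probability of M1 given x
p1 = norm_pdf(x, 0.0) / (norm_pdf(x, 0.0) + norm_pdf(x, 1.0))

# Calibration check: among replicates where the posterior says ~0.9,
# M1 should be the true model roughly 90% of the time
in_bin = (p1 > 0.85) & (p1 < 0.95)
freq = true_is_m1[in_bin].mean()
```

An ABC approximation of p1 would be biased, in the above frequentist sense, exactly to the extent that the binned frequency drifts away from the identity line.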

My apologies if my citation of your PNAS paper seemed misleading. The citation was intended to be limited to the context of ABC methods that use summary statistics that are insufficient across the models under comparison (like msBayes and the method I present in the paper). I will definitely expand on this sentence to make this clearer in revisions. Thanks!

Lastly, my concluding remarks in the paper about full-likelihood methods in this domain are not as lofty as you might think. The likelihood function of the msBayes model is tractable, and, in fact, has already been derived and implemented via reversible-jump MCMC (albeit, not readily available yet). Also, there are plenty of examples of rich, Kingman-coalescent models implemented in full-likelihood Bayesian frameworks. Too many to list, but a lot of them are implemented in the BEAST software package. One noteworthy example is the work of Bryant et al. (2012, Molecular Biology and Evolution, 29(8), 1917–32) that analytically integrates over all gene trees for biallelic markers under the coalescent.

Filed under: Books, Statistics, University life Tagged: ABC, Bayesian statistics, consistence, Dirichlet process, exchangeability, frequency properties, Kingman's coalescent, Molecular Biology and Evolution, Monte Carlo Statistical Methods, reversible jump, sufficiency, summary statistics, taxon

### 5 Munros, enough for a day…

**T**aking advantage of cheap [early] Sunday morning flights to Edinburgh, I managed to fit a good hiking day (and three new Munros) into my trip to Scotland. I decided on the hike in the plane, picking the Lawers group as one of the closest to Edinburgh… The fair sequence of Munros in the group (5!) made it quite appealing [for a Munro-bagger], until I realised I would have to walk on a narrow road with no sidewalk for 6km to complete the loop. Hence I decided on turning back after the third peak (An Stuc, recently promoted to Munro fame!), which meant re-climbing the first two Munros from the “other” side, with a significant addition to the total elevation gain (+1500m). The weather was traditionally Scottish, with plenty of clouds, gales and gusts, a few patches of blue sky, and a pleasant drizzle for the last hour. It did not seem to bother the numerous walkers I passed on the first part of the trail. As usual, an additional reward of hiking or climbing in Scotland is that one can be back in town (i.e., Edinburgh) in time for the evening curry! Even when leaving from Paris in the morning.

Filed under: Mountains, pictures, Running, University life Tagged: Ben Lawers, curry, Edinburgh, ICMS, munroes, Paris, Scotland