## Bayesian News Feeds

### computational methods for statistical mechanics [day #1]

**T**he first talks of the day at this ICMS workshop ["at the interface between mathematical statistics and molecular simulation"] were actually lectures introducing molecular simulation to statisticians by Michael Allen from Warwick and computational statistics to physicists by Omiros Papaspiliopoulos. Allen’s lecture was quite pedagogical, even though I had to quiz wikipedia for physics terms and notions. Like a force being the gradient of a potential function. He gave a physical meaning to Langevin’s equation. As well as references from the *Journal of Chemical Physics* that were more recent than 1953. He mentioned alternatives to Langevin’s equation too and I idly wondered at the possibility of using those alternatives as other tools for improved MCMC simulation. Although introducing friction may not be the most promising way to speed up the thing… He later introduced what statisticians call Langevin’s algorithm (MALA) as smart Monte Carlo (Rossky et al., …1978!!!). Recovering Hamiltonian and hybrid Monte Carlo algorithms as a fusion of molecular dynamics, Verlet algorithm, and Metropolis acceptance step! As well as reminding us of the physics roots of umbrella sampling and the Wang-Landau algorithm.
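
As a rough illustration of the MALA / smart Monte Carlo step mentioned above, here is a minimal one-dimensional Python sketch: a discretised Langevin move followed by a Metropolis acceptance step. The function name `mala`, the step size, and the standard normal toy target are all assumptions made for the example, not anything taken from the lecture.

```python
import math
import random


def mala(log_density, grad_log_density, x0, step, n_iter, seed=0):
    """One-dimensional MALA: Langevin proposal plus Metropolis correction."""
    rng = random.Random(seed)

    def log_q(to, frm):
        # Log proposal density of moving frm -> to, up to an additive constant.
        mean = frm + 0.5 * step * grad_log_density(frm)
        return -((to - mean) ** 2) / (2.0 * step)

    x, samples, accepted = x0, [], 0
    for _ in range(n_iter):
        # Euler discretisation of the overdamped Langevin diffusion.
        prop = x + 0.5 * step * grad_log_density(x) + math.sqrt(step) * rng.gauss(0.0, 1.0)
        # Metropolis-Hastings log acceptance ratio.
        log_alpha = (log_density(prop) + log_q(x, prop)) - (log_density(x) + log_q(prop, x))
        if log_alpha >= 0.0 or rng.random() < math.exp(log_alpha):
            x, accepted = prop, accepted + 1
        samples.append(x)
    return samples, accepted / n_iter


# Toy target (an assumption for illustration): standard normal, log pi(x) = -x^2/2.
samples, acc_rate = mala(lambda x: -0.5 * x * x, lambda x: -x, x0=3.0, step=1.0, n_iter=20000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"acceptance rate {acc_rate:.2f}, mean {mean:.2f}, variance {var:.2f}")
```

The gradient of the log-density plays the role of the force (the gradient of minus the potential), which is the connection with the physics vocabulary of the lecture.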

**O**miros Papaspiliopoulos also gave a very pedagogical entry to the convergence of MCMC samplers which focussed on the L² approach to convergence. This reminded me of the very first papers published on the convergence of the Gibbs sampler, like the ~~1990~~ 1992 JCGS paper by Schervish and Carlin. Or the ~~1991~~ 1996 Annals of Statistics by Amit. (Funny that I located both papers much earlier than when they actually appeared!) One surprising fact was that the convergence of all reversible ergodic kernels is necessarily geometric. There is no classification of kernels in this topology, the only ranking being through the respective spectral gaps. A good refresher for most of the audience, statisticians included.

**T**he following talks of Day 1 were by Christophe Andrieu, who kept with the spirit of a highly pedagogical entry, covering particle filters, SMC, particle Gibbs and pseudo-marginals, and who hit the right tone I think given the heterogeneous audience. And by Ben Leimkuhler about particle simulation for very large molecular structures. Closing the day by focussing on Langevin dynamics. What I understood from the talk was an improved entry into the resolution of some SPDEs. Gaining two orders when compared with Euler-Maruyama. But I missed the meaning of the friction coefficient γ converging to infinity in the title…

Filed under: Mountains, pictures, Running, Statistics, Travel, University life Tagged: ABC, Arthur's Seat, computational physics, Edinburgh, Hamburg, Highlands, ICMS, MCMC, molecular simulation, Monte Carlo Statistical Methods, munroes, NIPS 2014, Scotland

### Ben Lawers, Perthshire

Filed under: Mountains, pictures, Running, Travel Tagged: An Stuc, Ben Lawers, munroes, Perthshire, Scotland, Highlands

### improved approximate-Bayesian model-choice method for estimating shared evolutionary history [reply from the author]

*[Here is a very kind and detailed reply from Jamie Oaks to the comments I made on his ABC paper a few days ago:]*

First of all, many thanks for your thorough review of my pre-print! It is very helpful and much appreciated. I just wanted to comment on a few things you address in your post.

I am a little confused about how my replacement of continuous uniform probability distributions with gamma distributions for priors on several parameters introduces a potentially crippling number of hyperparameters. Both uniform and gamma distributions have two parameters. So, the new model only has one additional hyperparameter compared to the original msBayes model: the concentration parameter on the Dirichlet process prior on divergence models. Also, the new model offers a uniform prior over divergence models (though I don’t recommend it).

Your comment about there being no new ABC technique is 100% correct. The model is new, the ABC numerical machinery is not. Also, your intuition is correct, I do not use the divergence times to calculate summary statistics. I mention the divergence times in the description of the ABC algorithm with the hope of making it clear that the times are scaled (see Equation (12)) prior to the simulation of the data (from which the summary statistics are calculated). This scaling is simply to go from units proportional to time, to units that are proportional to the expected number of mutations. Clearly, my attempt at clarity only created unnecessary opacity. I’ll have to make some edits.

Regarding the reshuffling of the summary statistics calculated from different alignments of sequences, the statistics are not exchangeable. So, reshuffling them in a manner that is not consistent across all simulations and the observed data is not mathematically valid. Also, if elements are exchangeable, their order will not affect the likelihood (or the posterior, barring sampling error). Thus, if our goal is to approximate the likelihood, I would hope the reshuffling would also have little effect on the approximate posterior (otherwise my approximation is not so good?).

You are correct that my use of “bias” was not well defined in reference to the identity line of my plots of the estimated vs true probability of the one-divergence model. I think we can agree that, ideally (all assumptions are met), the estimated posterior probability of a model should estimate the probability that the model is correct. For large numbers of simulation replicates, the proportion of the replicates for which the one-divergence model is true will approximate the probability that the one-divergence model is correct. Thus, if the method has the desirable (albeit “frequentist”) behavior such that the estimated posterior probability of the one-divergence model is an unbiased estimate of the probability that the one-divergence model is correct, the points should fall near the identity line. For example, let us say the method estimates a posterior probability of 0.90 for the one-divergence model for 1000 simulated datasets. If the method is accurately estimating the probability that the one-divergence model is the correct model, then the one-divergence model should be the true model for approximately 900 of the 1000 datasets. Any trend away from the identity line indicates the method is biased in the (frequentist) sense that it is not correctly estimating the probability that the one-divergence model is the correct model. I agree this measure of “bias” is frequentist in nature. However, it seems like a worthwhile goal for Bayesian model-choice methods to have good frequentist properties. If a method strongly deviates from the identity line, it is much more difficult to interpret the posterior probabilities that it estimates. Going back to my example of the posterior probability of 0.90 for 1000 replicates, I would be alarmed if the model was true in only 100 of the replicates.
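
The identity-line check described in this paragraph can be sketched in a few lines of Python: bin the estimated posterior probabilities and compare each bin's average estimate with the proportion of replicates in which the model was actually true. This is only an illustration, not the code used in the paper; the helper `calibration_table` and the simulated, perfectly calibrated "method" are hypothetical.

```python
import random


def calibration_table(probs, truths, n_bins=10):
    """Bin estimated posterior probabilities and compare with observed frequencies."""
    bins = [[0, 0.0, 0] for _ in range(n_bins)]  # [count, sum of estimates, times true]
    for p, is_true in zip(probs, truths):
        b = min(int(p * n_bins), n_bins - 1)
        bins[b][0] += 1
        bins[b][1] += p
        bins[b][2] += int(is_true)
    # One row per non-empty bin: (mean estimate, observed frequency, count).
    return [(s / c, k / c, c) for c, s, k in bins if c > 0]


# Hypothetical, perfectly calibrated "method": the model is true with exactly
# the probability that the method reports.
rng = random.Random(1)
probs = [rng.random() for _ in range(50000)]
truths = [rng.random() < p for p in probs]
table = calibration_table(probs, truths)
for mean_est, freq, count in table:
    print(f"estimated {mean_est:.2f}  observed {freq:.2f}  (n={count})")
```

For a calibrated method the two columns track each other, i.e. the points sit near the identity line; the 0.90 example of the text corresponds to a bin with mean estimate 0.90 and observed frequency close to 900/1000.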

My apologies if my citation of your PNAS paper seemed misleading. The citation was intended to be limited to the context of ABC methods that use summary statistics that are insufficient across the models under comparison (like msBayes and the method I present in the paper). I will definitely expand on this sentence to make this clearer in revisions. Thanks!

Lastly, my concluding remarks in the paper about full-likelihood methods in this domain are not as lofty as you might think. The likelihood function of the msBayes model is tractable, and, in fact, has already been derived and implemented via reversible-jump MCMC (albeit, not readily available yet). Also, there are plenty of examples of rich, Kingman-coalescent models implemented in full-likelihood Bayesian frameworks. Too many to list, but a lot of them are implemented in the BEAST software package. One noteworthy example is the work of Bryant et al. (2012, Molecular Biology and Evolution, 29(8), 1917–32) that analytically integrates over all gene trees for biallelic markers under the coalescent.

Filed under: Books, Statistics, University life Tagged: ABC, Bayesian statistics, consistence, Dirichlet process, exchangeability, frequency properties, Kingman's coalescent, Molecular Biology and Evolution, Monte Carlo Statistical Methods, reversible jump, sufficiency, summary statistics, taxon

### 5 Munros, enough for a day…

**T**aking advantage of cheap [early] Sunday morning flights to Edinburgh, I managed to bag a good hiking day (and three new Munros) within my trip to Scotland. I decided on the hike on the plane, picking the Lawers group as one of the closest to Edinburgh… The fair sequence of Munros in the group (5!) made it quite appealing [for a Munro-bagger], until I realised I would have to walk on a narrow road with no sidewalk for 6km to complete the loop. Hence I decided on turning back after the third peak (An Stuc, recently promoted to Munro-fame!), which meant re-climbing the first two Munros from the “other” side, with a significant addition to the total differential (+1500m). The weather was traditional Scottish, with plenty of clouds, gales and gusts, a few patches of blue sky, and a pleasant drizzle for the last hour. It did not seem to bother the numerous walkers I passed on the first part of the trail. As usual, an additional reward with hiking or climbing in Scotland is that one can be back in town (i.e., Edinburgh) in time for the evening curry! Even when leaving from Paris in the morning.

Filed under: Mountains, pictures, Running, University life Tagged: Ben Lawers, curry, Edinburgh, ICMS, munroes, Paris, Scotland

### [h]it figures

**J**ust a few figures from wordpress about the ‘Og:

- **2,845** posts;
- **1,009,428** views;
- **5,115** comments;
- **5,095** tags;
- **470,427** spam comments;
- **1,001** spams in the past 24 hours;
- and… only **5** amazon orders in the past month!

Filed under: Books, pictures Tagged: Amazon, blog, comments, New York Subway, spams, tags

### Le Monde puzzle [#868]

**A**nother permutation-based Le Monde mathematical puzzle:

*Given the integers 1,…,n, a “perfect” combination is a pair (i,j) of integers such that no other pair enjoys the same sum. For n=33, what is the maximum number of perfect combinations one can build? And for n=2014?*

**A** rather straightforward problem, or so it seemed: take the pairs (2m,2m+1), their sums all differ, and we get the maximal possible number of sums, ⌊n/2⌋… However, I did not read the question properly (!) and the constraint is on the sum (i+j), namely

*How many mutually exclusive pairs (i,j) can be found with different sums all bounded by n=33? n=2014?*

**I**n which case, the previous and obvious proposal no longer works… The dumb brute-force search leads to a solution of

    > sol
    [1] 12
    > laperm
     [1]  6  9  1 24 13 20  4  7 21 14 17  3 16 11 19 25 23 18 12 26 15  2  5 10 22
    [26]  8
    > unique(apply(matrix(laperm,ncol=2),1,sum))
     [1] 17 28 26 47 31 32 30 22 23 19 27 25 24

which is close to the solution sol=13 proposed in Le Monde… It is obviously hopeless for a sum bounded by 2014. A light attempt at simulated annealing did not help either.
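
Since the search code itself did not make it into the post, here is a hedged Python sketch of what such a check could look like. It re-scores the permutation reported above, adds a simple counting bound on the number of pairs, and includes a naive random search. The function names, the number of tries, and the choice of permuting 1,…,2k are assumptions that only mirror the shape of the printed output.

```python
import random


def count_valid(perm, n):
    """Pair perm[k] with perm[k + half] (as R's matrix(perm, ncol=2) does by row)
    and count the distinct pair sums that do not exceed n."""
    half = len(perm) // 2
    sums = {perm[k] + perm[k + half] for k in range(half)}
    return len({s for s in sums if s <= n})


def upper_bound(n):
    """Counting bound: k disjoint pairs use 2k distinct integers, so their sums
    total at least k(2k+1), while k distinct sums <= n total at most
    kn - k(k-1)/2. Hence 5k <= 2n - 1."""
    return (2 * n - 1) // 5


def random_search(n, tries=2000, seed=0):
    """Naive search over random permutations of 1..2k, with k the counting bound."""
    rng = random.Random(seed)
    perm = list(range(1, 2 * upper_bound(n) + 1))
    best, best_perm = -1, None
    for _ in range(tries):
        rng.shuffle(perm)
        score = count_valid(perm, n)
        if score > best:
            best, best_perm = score, perm[:]
    return best, best_perm


# Permutation reported in the post.
laperm = [6, 9, 1, 24, 13, 20, 4, 7, 21, 14, 17, 3, 16,
          11, 19, 25, 23, 18, 12, 26, 15, 2, 5, 10, 22, 8]
print(count_valid(laperm, 33), upper_bound(33))  # 12 13
```

The bound only says that no more than 13 pairs are possible for n=33 (and no more than 805 for n=2014); it does not by itself produce a pairing that reaches it.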

Filed under: Books, Kids, Statistics Tagged: Le Monde, mathematical puzzle, permutations, simulated annealing

### de.activated!

### Bayesian Analysis, Volume 9, Number 2 (2014)

Contents:

**Zhihua Zhang**, **Dakan Wang**, **Guang Dai**, **Michael I. Jordan**. Matrix-Variate Dirichlet Process Priors with Applications. 259--286.

**Nammam Ali Azadi**, **Paul Fearnhead**, **Gareth Ridall**, **Joleen H. Blok**. Bayesian Sequential Experimental Design for Binary Response Data with Application to Electromyographic Experiments. 287--306.

**Juhee Lee**, **Steven N. MacEachern**, **Yiling Lu**, **Gordon B. Mills**. Local-Mass Preserving Prior Distributions for Nonparametric Bayesian Models. 307--330.

**Ruitao Liu**, **Arijit Chakrabarti**, **Tapas Samanta**, **Jayanta K. Ghosh**, **Malay Ghosh**. On Divergence Measures Leading to Jeffreys and Other Reference Priors. 331--370.

**Xin-Yuan Song**, **Jing-Heng Cai**, **Xiang-Nan Feng**, **Xue-Jun Jiang**. Bayesian Analysis of the Functional-Coefficient Autoregressive Heteroscedastic Model. 371--396.

**Yu Ryan Yue**, **Daniel Simpson**, **Finn Lindgren**, **Håvard Rue**. Bayesian Adaptive Smoothing Splines Using Stochastic Differential Equations. 397--424.

**Jaakko Riihimäki**, **Aki Vehtari**. Laplace Approximation for Logistic Gaussian Process Density Estimation and Regression. 425--448.

**Fei Liu**, **Sounak Chakraborty**, **Fan Li**, **Yan Liu**, **Aurelie C. Lozano**. Bayesian Regularization via Graph Laplacian. 449--474.

**Catia Scricciolo**. Adaptive Bayesian Density Estimation in $L^{p}$-metrics with Pitman-Yor or Normalized Inverse-Gaussian Process Kernel Mixtures. 475--520.

### Matrix-Variate Dirichlet Process Priors with Applications

**Zhihua Zhang**, **Dakan Wang**, **Guang Dai**, **Michael I. Jordan**.

**Source:** Bayesian Analysis, Volume 9, Number 2, 259--286.

**Abstract:**

In this paper we propose a matrix-variate Dirichlet process (MATDP) for modeling the joint prior of a set of random matrices. Our approach is able to share statistical strength among regression coefficient matrices due to the clustering property of the Dirichlet process. Moreover, since the base probability measure is defined as a matrix-variate distribution, the dependence among the elements of each random matrix is described via the matrix-variate distribution. We apply MATDP to multivariate supervised learning problems. In particular, we devise a nonparametric discriminative model and a nonparametric latent factor model. The interest is in considering correlations both across response variables (or covariates) and across response vectors. We derive Markov chain Monte Carlo algorithms for posterior inference and prediction, and illustrate the application of the models to multivariate regression, multi-class classification and multi-label prediction problems.

### Bayesian Sequential Experimental Design for Binary Response Data with Application to Electromyographic Experiments

**Nammam Ali Azadi**, **Paul Fearnhead**, **Gareth Ridall**, **Joleen H. Blok**.

**Source:** Bayesian Analysis, Volume 9, Number 2, 287--306.

**Abstract:**

We develop a sequential Monte Carlo approach for Bayesian analysis of the experimental design for binary response data. Our work is motivated by surface electromyographic (SEMG) experiments, which can be used to provide information about the functionality of subjects’ motor units. These experiments involve a series of stimuli being applied to a motor unit, with whether or not the motor unit fires for each stimulus being recorded. The aim is to learn about how the probability of firing depends on the applied stimulus (the so-called stimulus-response curve). One such excitability parameter is an estimate of the stimulus level for which the motor unit has a 50% chance of firing. Within such an experiment we are able to choose the next stimulus level based on the past observations. We show how sequential Monte Carlo can be used to analyse such data in an online manner. We then use the current estimate of the posterior distribution in order to choose the next stimulus level. The aim is to select a stimulus level that minimises the expected loss of estimating a quantity, or quantities, of interest. We will apply this loss function to the estimates of target quantiles from the stimulus-response curve. Through simulation we show that this approach is more efficient than existing sequential design methods in terms of estimating the quantile(s) of interest. If applied in practice, it could reduce the length of SEMG experiments by a factor of three.

### Local-Mass Preserving Prior Distributions for Nonparametric Bayesian Models

**Juhee Lee**, **Steven N. MacEachern**, **Yiling Lu**, **Gordon B. Mills**.

**Source:** Bayesian Analysis, Volume 9, Number 2, 307--330.

**Abstract:**

We address the problem of prior specification for models involving the two-parameter Poisson-Dirichlet process. These models are sometimes partially subjectively specified and are always partially (or fully) specified by a rule. We develop prior distributions based on local mass preservation. The robustness of posterior inference to an arbitrary choice of overdispersion under the proposed and current priors is investigated. Two examples are provided to demonstrate the properties of the proposed priors. We focus on the three major types of inference: clustering of the parameters of interest, estimation and prediction. The new priors are found to provide more stable inference about clustering than traditional priors while showing few drawbacks. Furthermore, it is shown that more stable clustering results in more stable inference for estimation and prediction. We recommend the local-mass preserving priors as a replacement for the traditional priors.

### On Divergence Measures Leading to Jeffreys and Other Reference Priors

**Ruitao Liu**, **Arijit Chakrabarti**, **Tapas Samanta**, **Jayanta K. Ghosh**, **Malay Ghosh**.

**Source:** Bayesian Analysis, Volume 9, Number 2, 331--370.

**Abstract:**

The paper presents new measures of divergence between prior and posterior which are maximized by the Jeffreys prior. We provide two methods for proving this, one of which provides an easy to verify sufficient condition. We use such divergences to measure information in a prior and also obtain new objective priors outside the class of Bernardo’s reference priors.

### Bayesian Analysis of the Functional-Coefficient Autoregressive Heteroscedastic Model

**Xin-Yuan Song**, **Jing-Heng Cai**, **Xiang-Nan Feng**, **Xue-Jun Jiang**.

**Source:** Bayesian Analysis, Volume 9, Number 2, 371--396.

**Abstract:**

In this paper, we propose a new model called the functional-coefficient autoregressive heteroscedastic (FARCH) model for nonlinear time series. The FARCH model extends the existing functional-coefficient autoregressive models and double-threshold autoregressive heteroscedastic models by providing a flexible framework for the detection of nonlinear features for both the conditional mean and conditional variance. We propose a Bayesian approach, along with the Bayesian P-splines technique and Markov chain Monte Carlo algorithm, to estimate the functional coefficients and unknown parameters of the model. We also conduct model comparison via the Bayes factor. The performance of the proposed methodology is evaluated via a simulation study. A real data set derived from the daily S&P 500 Composite Index is used to illustrate the methodology.