CONFERENCE ABSTRACTS

 
:: Keynote Speaker Abstracts::
 
Bayesian Adjustment for Multiplicity
Jim Berger

Issues of multiplicity in testing are increasingly being encountered in a wide range of disciplines, as the growing complexity of data allows for consideration of a multitude of possible hypotheses (e.g., does gene xyz affect condition abc). Failure to properly adjust for multiplicities is being blamed for the apparently increasing lack of reproducibility in science. The main purpose of this presentation is to review the different types of multiplicities that are encountered, and to discuss the general approaches to dealing with them that are being adopted by Bayesians. Issues that I found surprising will be highlighted, such as the fact that empirical Bayesian approaches to multiplicity adjustment can be seriously flawed.

 
Auxiliary Mixture Sampling - Simple MCMC for Non-Gaussian Models
Sylvia Frühwirth-Schnatter

Applied statisticians often have to model discrete-valued response variables, in particular binary, multinomial or count data, in terms of covariates. These models typically take the form of (dynamic) generalized linear models involving latent variables, like mixed effect models, or state space models. Parameter estimation for these types of models is known to be computationally demanding, and sophisticated numerical techniques such as importance sampling or Metropolis-Hastings algorithms have been applied.

This talk will review a simple method for the MCMC estimation of such models based on auxiliary mixture sampling. Auxiliary mixture sampling allows straightforward estimation for rather general parameter-driven models for discrete-valued data like random effect models, mixture models or state space models.

Furthermore, it will be demonstrated that auxiliary mixture sampling is particularly useful for implementing model space MCMC methods for such models. Applications to variable selection for logistic regression models, covariance selection for non-Gaussian random effects models and model specification for a Poisson state space model will be discussed.

 
A Theory of (Un)congeniality (between Bayesians and Frequentists?)
Xiao-Li Meng

A grand challenge in producing public-use databases is that the models/assumptions used to “clean up” the raw data cannot possibly be compatible with all subsequent models or assumptions adopted by the users of the database. This challenge requires us to rethink the usual “My model” vs. “God’s Model” paradigm, because there is a third model: the one adopted by the “data cleaner.” A concrete example of this arises in the context of using multiple imputation to “fix the holes” in raw data. Multiple imputation, in general, is best done by the data collector via posterior prediction, which properly takes into account the uncertainty in predicting the missing values. Yet many users of the resulting data do not even consider a likelihood, let alone Bayesian modeling, but rather employ a design-based complete-data procedure. This talk first reviews the concept of congeniality (Meng, 1994, Statistical Science) for embedding the user’s procedure into a Bayesian model and hence making it possible to study the incompatibility (i.e., uncongeniality) between the Bayesian imputation model and the frequentist analysis procedure. We then present a newly established theoretical framework for quantifying the impact of uncongeniality on the resulting multiple imputation inference: an offspring of the uncongenial but necessary marriage between Bayesian (via the imputer’s model) and frequentist (via the user’s analysis procedure) machineries. (This is joint work with Xianchao Xie of Harvard University.)

 
Elicitation
Tony O’Hagan

This presentation will cover some recent research in the area of eliciting probability distributions. Although subjective prior distributions lie at the heart of Bayesian statistics, eliciting them has often received far less attention than it deserves. The first part of the talk will address the process of elicitation, and will illustrate a package called SHELF that is designed to help with conducting an elicitation session in a sound and transparent way. SHELF deals only with eliciting distributions for single variables, and the second part of the talk will look at the issues around multivariate elicitation.

 
Bayesian inference for diffusions
Gareth Roberts

This talk will survey recent advances in the area of Bayesian inference for diffusions, focusing mainly on the case of discretely observed data. Methodology relies heavily on MCMC, and one focus will be on the imputation of unobserved diffusion bridges running between consecutive observations. The use of “exact” (i.e., free from discretisation error) methods and “perfect exact” methods will also be described.

 
Theory of MCMC: What is it Good For
Jeffrey Rosenthal

MCMC algorithms are widely used in Bayesian analysis, but the Markov chain theory that underlies their validity and effectiveness is often ignored. Is this absence of theory a good thing? A great tragedy? Somewhere in between?

This talk will discuss ways in which theoretical considerations can genuinely improve the application of MCMC algorithms to statistical inference problems. We will consider such topics as: ergodicity via phi-irreducibility; geometric ergodicity; central limit theorems; efficiency orderings; optimal scaling; and adaptive MCMC. The ideas will be illustrated with simple examples, including live Java simulations. The emphasis throughout will be on ways in which applied users of MCMC might – and might not – benefit from understanding the theory.

 
 
:: Invited Speaker Abstracts::
 
A Bayesian Semi-Parametric Survival Model with Longitudinal Markers
Kim-Anh Do, Peter Mueller and Song Zhang

We consider inference for data from a clinical trial of treatments for metastatic prostate cancer. Patients joined the trial with diverse prior treatment histories. The resulting heterogeneous patient population gives rise to challenging statistical inference problems when trying to predict time to progression on different treatment arms. Inference is further complicated by the need to include a longitudinal marker as a covariate. We develop a semi-parametric model for joint inference on longitudinal data and an event time, with the possibility that some patients are cured. The event time distribution is based on a non-parametric Pólya tree prior. For the longitudinal data we assume a mixed effects model. Incorporating a regression on covariates in a non-parametric event time model in general, and for a Pólya tree model in particular, is a challenging problem. We exploit the fact that the covariate itself is a random variable. We achieve an implementation of the desired regression by factoring the joint model into a marginal model for the event time and a regression of the longitudinal outcomes on the event time, i.e., we implicitly model the desired regression by modeling the reverse conditional distribution.

 
An adaptive Monte Carlo approach for statistical models with intractable normalizing constants
Yves Atchade

This talk will report on a general adaptive MCMC strategy to handle statistical models with intractable normalizing constants. The method can be thought of as an extension of the MCMC-MLE approach (Geyer and Thompson 1992) to handle similar models by maximum likelihood. I will give some applications in image segmentation, social network inference and statistical protein design.

 
Hierarchical Bayesian Functional Data Analysis and its applications
Veera Baladandayuthapani

In many scientific applications, one observes responses measured over a grid of discrete values, termed functional observations. In this talk I will discuss a couple of novel functional data analytic methods motivated by real oncology experiments. In the first case, the responses are inherently functional and are hierarchically correlated. In the second case, it is of interest to model the effect of a functional covariate on a scalar response.

 
Divergence Based Priors for Objective Bayesian Model Selections
M.J. Bayarri, G. García-Donato

One of the main difficulties for objective Bayesian model selection is that the usual objective priors cannot be used, since they are improper. They can be used for parameters occurring in all of the models and having the same meaning across models, but otherwise proper priors have to be used. This is true in particular for the ‘extra’ parameter(s) in the alternative hypothesis.

There are several methods to derive objective but proper priors for model selection. One of the most popular is Jeffreys’ proposal, later generalized by Zellner and Siow, for the Normal Linear models context. Less known is that Jeffreys pointed to some ideas for extending this prior to general scenarios, perhaps because they were not really pursued, and Jeffreys’ first attempt was not very successful.

In this talk we follow Jeffreys’ hint and develop (objective) proper prior distributions for hypothesis testing and model selection based on measures of divergence between the competing models; we call them divergence based (DB) priors. DB priors have simple form and are shown to have desirable properties, like information (finite sample) consistency; often, they are similar to other existing proposals like the intrinsic priors; moreover, in normal linear models scenarios, they exactly reproduce Jeffreys-Zellner-Siow priors. Most importantly, in challenging scenarios such as irregular models and mixture models, the DB priors are well defined and very reasonable, while alternative proposals are not. We derive approximations to the DB priors as well as MCMC and asymptotic expressions for the associated Bayes factors, which also reveal interesting connections with other proposals (like the unit information priors). The paper is available at http://ftp.stat.duke.edu/WorkingPapers/06-23.pdf

 
Analysis of Marked Point Patterns with Spatial and Non-spatial Covariate Information
Bradley P. Carlin, Shengde Liang and Alan Gelfand

The analysis of spatial point process data has historically been plagued by computational difficulties. Likelihoods feature intractable integrals that require approximation. This problem is exacerbated when such models are incorporated in a fully hierarchical framework, since this nests the integrals within a Markov chain Monte Carlo (MCMC) algorithm. We extend customary spatial point pattern analysis in the context of a log-Gaussian Cox process model to accommodate spatially referenced covariates, individual-level risk factors, and individual-level covariates of interest that mark the process. We also use multivariate process realizations to capture dependence among the intensity surfaces across the marks. We illustrate using a collection of breast cancer case locations collected over the mostly rural northern part of the state of Minnesota that are marked by their treatment selection, mastectomy or breast conserving surgery (“lumpectomy”), which is less disfiguring but requires 6 weeks of follow-up radiation therapy. The key substantive covariate (driving distance to the nearest radiation treatment facility) is spatially referenced, but other important covariates (notably age and stage) are not. Our approach facilitates mapping and boundary analysis (“wombling”) of the marginal log-relative intensity surfaces for the two treatment options, and resolves the issue of whether women who face long driving distances are significantly more likely to opt for mastectomy while still accounting for all sources of spatial and nonspatial variability in the data.

 
Bayesian Evaluation of Multiple-regime Nonlinear Volatility Models
Cathy W. S. Chen, Richard H. Gerlach and Ann M.H. Lin

A multiple-regime nonlinear volatility model with a fat-tailed error distribution is discussed. Bayesian estimation and inference is considered for this model, as well as Bayesian posterior model comparison among competing volatility models with different numbers of regimes. An adaptive MCMC sampling scheme is designed whose output achieves these goals. Our modeling framework provides a parsimonious representation of well-known stylized features of financial time series and facilitates statistical inference in the presence of high or explosive persistence and conditional heteroskedasticity. We focus on the three-regime case: the main feature of the model is allowing an explosive volatility regime and capturing mean and volatility asymmetries in financial markets. We illustrate our findings via simulation and an empirical study of eight international oil gas index markets. Most markets strongly support the three-regime model over its competitors.

 
Importance Sampling of Word Patterns in DNA and Protein Sequences
Louis H. Y. Chen

The use of Monte Carlo evaluation to compute p-values of pattern counting test statistics is especially attractive when an asymptotic theory is absent or when the search sequence or the word pattern is too short for an asymptotic formula to be accurate. The drawback of applying Monte Carlo simulation directly is its inefficiency when p-values are small, which is precisely the situation of importance. We provide a general importance sampling algorithm for efficient Monte Carlo evaluation of small p-values of pattern counting test statistics and apply it to word patterns of biological interest, in particular, palindromes and inverted repeats, patterns arising from position specific weight matrices, as well as co-occurrences of pairs of motifs. We also show that our importance sampling technique satisfies a log efficiency criterion. This paper is joint work with Hock Peng Chan and Nancy Ruonan Zhang.
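
To make the inefficiency of direct simulation concrete, here is a minimal sketch on a toy problem that is not the authors' word-pattern algorithm: estimating the tail probability P(Z > 4) for a standard normal Z. The function names, the shifted N(threshold, 1) proposal and the sample sizes are all assumptions chosen for illustration.

```python
import math
import numpy as np

def naive_tail_estimate(threshold, n, rng):
    """Plain Monte Carlo estimate of P(Z > threshold) for Z ~ N(0, 1)."""
    z = rng.standard_normal(n)
    return float(np.mean(z > threshold))

def importance_tail_estimate(threshold, n, rng):
    """Importance sampling estimate using a N(threshold, 1) proposal (toy choice)."""
    x = rng.standard_normal(n) + threshold
    # Log density ratio N(0, 1) / N(threshold, 1) evaluated at x.
    log_w = -threshold * x + 0.5 * threshold ** 2
    return float(np.mean((x > threshold) * np.exp(log_w)))

def exact_tail(threshold):
    """Exact P(Z > threshold) from the complementary error function."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))
```

With a few thousand draws the naive estimator will usually return exactly zero for a threshold of 4, because the event occurs roughly three times in every 100,000 draws, whereas about half of the draws from the shifted proposal land in the tail and are reweighted.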

 
How Bayes is changing environmental science: Application to climate change and the biodiversity paradox
Jim Clark

The combined advantages of graphical modeling and Bayesian inference are transforming environmental science. Ecological processes are highly scale- and setting-dependent and only indirectly related to observations. Ecological data are remarkably heterogeneous, ranging from photosynthetic rate measurements on leaves to remote sensing of landscapes. By modern standards, the problems are not overwhelmingly large; they are frustratingly complex. I illustrate some of the potential to address complex relationships, together with new issues that emerge, when modern tools are applied to these dynamic, highly connected systems. The application concerns a long-standing effort to understand controls on forest diversity (described a half century ago as a paradox) in the context of contemporary rapid climate change. Theoretical models tell us that species must differ in specific ways in order to coexist as stable ecological communities. These differences must involve tradeoffs among species to ensure that the best competitors do not drive all others to extinction. Yet many coexisting species do not appear to possess such differences. The lack of observable tradeoffs presents a paradox when taken in light of the fact that species do indeed coexist in nature. A key challenge involves identifying the important differences among species that allow each to persist in the face of competition from many others. In this talk I discuss why inconsistent assumptions of theoretical and statistical models lead to the paradox. I suggest that coexistence is best understood in terms of population heterogeneity, an old idea, but one that has not been properly interpreted from data or theory. First, species differences occur along many axes and are missed by traditional data modeling approaches. Second, theoretical (stochastic) models contain unrecognized species differences, which are simply hidden from view. I show how a proper treatment of heterogeneity resolves the paradox.
Species differences responsible for coexistence are high-dimensional, and can be captured in data and in models by admitting an appropriate structure for unknowns. Much of the unexplained variation in data results from differences among individuals. It occurs across a large number of axes, most of which will be hard to capture in simple experiments and observational data sets. By providing a consistent treatment of information from many scales and complex, interacting processes, modern Bayes allows a more integrated view. The example involves parameterization of the joint distribution of demographic rates (fecundity, dispersal, growth, mortality) for all of the trees in selected forests as coupled, non-linear state space models that accommodate the interactions among individuals and with their local environments. We are fitting the forest and the trees. Prediction of biodiversity response to climate change involves mixing over all sources of heterogeneity, allowing us to better anticipate vulnerabilities. Specific issues include ways to organize an analysis that can involve updating at several stages, how to weight information deriving from multiple data sources, and efficient computation when conditional relationships are complex and spatio-temporal.

 
Bayesian nonparametric variable selection and hypothesis testing
David Dunson

In a broad variety of applications, interest focuses on assessing the relationship between one or more response variables, Y, and predictors, X. Our interest focuses on Bayesian methods for selecting predictors having any effect on the conditional response distribution of Y without incorporating parametric assumptions, either in the mean or the residual distribution. We first describe a variety of nonparametric priors for the conditional distribution, including dependent Dirichlet processes and kernel stick-breaking processes. These priors are then adapted to allow uncertainty in the subset of the predictors to be included in the model, while also allowing hypothesis testing. To accommodate one-sided hypothesis testing and estimation problems in the nonparametric paradigm, we also describe methods for incorporating stochastic ordering constraints with continuous and/or categorical predictors. Theoretical properties are discussed briefly and efficient MCMC algorithms are developed for posterior computation. The methods are illustrated using epidemiologic applications.

 
Clustering and Variable Selection using Fisher distributions
Yanan Fan

High dimensional data such as those found in data mining and microarray gene expression experiments are often inherently directional. We present a novel approach to model-based clustering of high dimensional data via the use of a mixture of Fisher distributions. The proposed method carries out simultaneous variable selection and clustering. The resulting clustering depends on the amount of correlation between observations given the selected variables. A Bayesian approach is adopted, where the determination of the number of clusters, cluster allocation and variable selection is carried out simultaneously via the use of trans-dimensional Markov chain Monte Carlo.

 
Dynamic Nonlinear Quantile estimation with the Asymmetric Laplace Distribution
Richard Gerlach and Cathy W. S. Chen

Value-at-Risk (VaR) forecasting is required by all financial institutions via the Basel II Capital Accord. The VaR is simply proportional to a quantile in the relevant forecast distribution. Engle and Manganelli (2004) proposed to use quantile regression to model the VaR directly, introducing the Conditional Autoregressive Value at Risk (CAViaR) model. Recent work shows that such dynamic quantile estimation is a special case of maximising the asymmetric or skewed Laplace distribution likelihood. The question then arises as to the feasibility of extending this result to likelihood, and then Bayesian, inference; a question which has been partially answered in cross-sectional regression models by Yu and Moyeed (2001) and Geraci and Bottai (2007). We extend this work by designing an adaptive MCMC sampling scheme, combining random walk and independent Metropolis-Hastings methods, to provide and assess parameter estimates and inference for CAViaR-type models, exploiting the skewed Laplace pseudo-likelihood framework. Further, we extend the CAViaR framework to include more flexible nonlinear models, e.g. to better capture asymmetry in financial markets. Simulations show promising results. Finally we apply our methods to a range of international stock market indices and provide a thorough comparison with modern symmetric and nonlinear GARCH-type models, as well as the popular RiskMetrics method, in terms of forecasting the VaR, and quantiles in general, dynamically.
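
The link between quantile estimation and the asymmetric Laplace likelihood can be sketched as follows. This is an illustrative static example for a single constant quantile, not the dynamic CAViaR model or the authors' MCMC scheme; the function names, the unit scale `sigma` and the grid search are assumptions made for the sketch.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile-regression check function rho_tau(u) = u * (tau - 1[u < 0])."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def asymmetric_laplace_loglik(y, q, tau, sigma=1.0):
    """Log-likelihood of y under an asymmetric Laplace density located at q.

    Density assumed here: tau * (1 - tau) / sigma * exp(-rho_tau((y - q) / sigma)).
    """
    y = np.asarray(y, dtype=float)
    const = y.size * np.log(tau * (1.0 - tau) / sigma)
    return const - check_loss((y - q) / sigma, tau).sum()

def best_constant_quantile(y, tau, grid):
    """Grid point that maximises the asymmetric Laplace log-likelihood."""
    logliks = [asymmetric_laplace_loglik(y, q, tau) for q in grid]
    return float(grid[int(np.argmax(logliks))])
```

Because the log-likelihood is a constant minus the summed check loss, maximising it over `q` is the same as minimising the quantile-regression criterion, so the maximiser is a sample quantile at level `tau`.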

 
Models and Inference for Musical Structure Analysis
Simon Godsill, Taylan Cemgill, Paul Peeling and Nick Whitely

In this paper we describe recent advances in computer understanding of musical audio signals. The objectives of the work are to extract high-level and meaningful information from musical audio recordings in the form of such things as musical pitch, timbre, timing and instrument identity. These tasks are of use in themselves, but they also feed into other related tasks such as automated remixing, source separation and score-based alignment. This is a highly complex class of problems, and one which can currently only be performed accurately by trained musicians. In our research we propose Bayesian hierarchical models which represent (at the highest level) the musical score and at the lowest level generative models for the measurement of raw audio data, as captured and digitised through one or more microphones. The models attempt to capture the dynamics of the musical score in a generic and musically meaningful fashion, favouring both likely transitions over time and likely groupings of notes at a given time. These models are connected to lower level signal models, based on either expensive time-domain oscillator models or cheaper point process and Gaussian models in the time-frequency plane. Inference is of course a complex task and we discuss adaptations of MCMC, variational methods and sequential Monte Carlo work. For links to our recent work in this area see www-sigproc.eng.cam.ac.uk/~sjg and www-sigproc.eng.cam.ac.uk/~atc27.

 
Explorations in and with sparse latent factor analysis
Peter J. Green and Sylvia Richardson

Classical latent factor analysis seeks to discover patterns of dependence in multivariate data that allow dimension reduction through the representation of the observed variables as linear combinations of a smaller number of unobserved 'factors'. We are interested in finding sparse representations, in which there are many zero coefficients among the linear coefficients, in the interests of parsimony, interpretability, and statistical stability; we use a Bayesian hierarchical modelling approach.

Specifically, we examine the situation where there are two or more groups of variables, neither low in dimension, and the main interest is in discovering sparse representations of the dependence between them. We develop a strategy that structures the patterns of dependence, and explores the model space allowing the numbers of common and specific factors to vary.

We are motivated by a study relating profiling of metabolites with transcript and enzyme activity, and illustrate the statistical and computational performance of our methodology, and its sensitivity to prior assumptions, on both these data and a variety of simulated set-ups.

 
Multilevel adaptive sampling for inverse problems
Dave Higdon, Dave Moulton and Todd Graves

Over the past few decades, efficient and robust multilevel solvers have been developed for a variety of applications which range from medical tomography to flow in porous media. Recent success of these multilevel solvers is due to the development of general multiscale concepts such as operator-induced variational coarsening. This approach implicitly treats the multiscale aspects of the fine-scale model in its generation of successively coarser representations.

Clearly such solvers can be used as a “black box” (within an MCMC scheme, for example) for inferring unknown parameters or initial conditions in inverse problems. While computational efficiency has been the primary motivation for the development of such multilevel solvers, it is hard to resist the temptation of prying into these solvers so that the coarser representations can be used to help guide the posterior sampling. In this talk we explore sequential and Markov chain Monte Carlo methods for exploiting the implicit coarsened representations within a multilevel solver to speed up posterior sampling.

 
Pooling information across matrix decompositions
Peter Hoff

One approach to summarizing relational and other matrix-valued data is with low-rank matrix approximations. For example, the variation among the entries of a symmetric $n \times n$ data matrix $Y$ is often expressed with the eigenvalue decomposition model $Y \sim U \Lambda U^T + E$, where $U$ is an $n \times k$ orthonormal matrix and $\Lambda$ is a diagonal matrix. In this work we consider pooling information across multiple such data matrices $Y^{(1)}, \ldots, Y^{(p)}$ for situations in which the common cells across matrices $\{y^{(1)}_{i,j}, \ldots, y^{(p)}_{i,j}\}$ represent repeated or multivariate measurements under a common set of conditions $\{i, j\}$. This is accomplished by estimating the parameters in a model for the variability among the orthonormal eigenvector matrices $U^{(1)}, \ldots, U^{(p)}$ of the $p$ data matrices. The model is based on a variation of the matrix Langevin distribution, for which estimation is accomplished primarily with Gibbs sampling. The methodology is applied to the analysis of multivariate relational data where $\{y^{(1)}_{i,j}, \ldots, y^{(p)}_{i,j}\}$ represent multiple dyadic relations between nodes $i$ and $j$.
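
As a rough illustration of the single-matrix building block only (not the pooled model or the Gibbs sampler described above), a rank-k eigenvalue approximation of a symmetric matrix can be sketched with NumPy. The function names are hypothetical, and keeping the k eigenvalues of largest magnitude is an assumption of this sketch.

```python
import numpy as np

def low_rank_eigen_approx(Y, k):
    """Return (U, lam), with U of shape (n, k), so that U diag(lam) U^T approximates Y."""
    Y = np.asarray(Y, dtype=float)
    vals, vecs = np.linalg.eigh(Y)              # Y is assumed symmetric
    keep = np.argsort(np.abs(vals))[::-1][:k]   # k eigenvalues largest in magnitude
    return vecs[:, keep], vals[keep]

def residual(Y, U, lam):
    """Residual matrix E = Y - U diag(lam) U^T."""
    return np.asarray(Y, dtype=float) - (U * lam) @ U.T
```

The columns of `U` are orthonormal, so `U.T @ U` is the k-by-k identity, and the entries of `lam` may be negative, which is relevant for relational data where the symmetric matrix need not be positive semidefinite.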

 
Inferring regions of copy number variation (CNVs) in human DNA from SNP genotyping data using objective Bayesian signal processing methods
Chris Holmes and Chris Yau

Recent discoveries suggest that regions of copy number variation (CNVs) in the human genome are much more widespread than previously thought. A CNV is defined as a segment of DNA > 1 kb that is present at a variable copy number in comparison to a reference genome. It is believed that up to 10% of the human genome may be copy number variable (contributing to around 10% of genetical transcription variation) and copy number polymorphisms have been linked to a number of diseases. In recent work we have developed an objective Bayesian hidden Markov model to detect regions of copy number variation from genome-wide single nucleotide polymorphism (SNP) genotyping data (of around 500,000 SNPs). In our model the hidden states refer to unobserved copy number variants at a locus (SNP) and the transitions between states capture the persistence within CNV states across chromosomal regions. In certain samples, such as from tumour biopsies, tissue heterogeneity introduces additional complications requiring a mixture deconvolution. Predictions from the model have been experimentally validated on a number of samples. We report the findings from a number of large studies including 1500 samples from the 1958 UK birth cohort and a genome-wide association study of childhood malaria risk in an African population. This is joint work with the Wellcome Trust Centre for Human Genetics.

 
Flexible Multivariate Regression Density Estimation
Robert Kohn

We develop flexible multivariate regression density estimators and use them for modeling multivariate continuous and discrete data.

 
A Flexible Approach to Parametric Inference in Nonlinear Time Series Models
Gary Koop and Simon Potter

Many structural break and regime-switching models have been used with macroeconomic and financial data. In this paper, we develop an extremely flexible parametric model which can accommodate virtually any of these specifications - and does so in a simple way which allows for straightforward Bayesian inference. The basic idea underlying our model is that it adds two simple concepts to a standard state space framework. These ideas are ordering and distance. By ordering the data in various ways, we can accommodate a wide variety of nonlinear time series models, including those with regime-switching and structural breaks. By allowing the state equation variances to depend on the distance between observations, the parameters can evolve in a wide variety of ways, allowing for everything from models exhibiting abrupt change (e.g. threshold autoregressive models or standard structural break models) to those which allow for a gradual evolution of parameters (e.g. smooth transition autoregressive models or time varying parameter models). We show how our model will (approximately) nest virtually every popular model in the regime-switching and structural break literatures. Bayesian econometric methods for inference in this model are developed. Because we stay within a state space framework, these methods are relatively straightforward, drawing on the existing literature. We use artificial data to show the advantages of our approach, before providing two empirical illustrations involving the modeling of real GDP growth.

 
Treed Gaussian Processes and Pattern Search Optimization
Herbie Lee and Matt Taddy

Asynchronous Parallel Pattern Search is a derivative-free numerical optimization algorithm. We show how to combine pattern search with Treed Gaussian Processes (TGP) to produce a more robust method for maximization or minimization by incorporating information about our uncertainty. The TGP emulator can also provide sensitivity analysis and a probabilistic analysis of convergence. Our approach is particularly useful for physical experiments or complex computer modeling problems, where each new datapoint or function evaluation may be expensive to obtain. We demonstrate results on both synthetic and real data.

 
NonParametric Measurement of A Time-Varying Volatility Risk Premium: A Bayesian Particle Filtering Approach
Nan Qu, Gael M. Martin and Catherine S. Forbes

Non-parametric measures of spot and option-implied volatility are used to extract real-time estimates of a time-varying volatility risk premium. The inferential method is Bayesian, with particle filtering used to cater for the non-linearities in the state space specification.

 
Fully Non-parametric Bayesian Ensemble Modelling
Hugh A. Chipman, Edward I. George and Robert E. McCulloch

Suppose we would like to learn the relationship between y and a high dimensional vector x based on a limited number of observations. In "BART: Bayesian Additive Regression Trees" (2006), Chipman, George and McCulloch develop a fully Bayesian approach for discovering and drawing inference about an unknown function ƒ based only on assuming y = ƒ(x)+ε with iid normal errors. In the spirit of "ensemble models", BART approximates ƒ by a sum of many simple regression tree models, each of which is kept small with a strong regularization prior. In terms of out-of-sample prediction, BART’s performance compares favorably with competing methods. Posterior evaluation by a well-mixing MCMC algorithm allows for the natural Bayesian quantification of uncertainty about ƒ. Further, the modular nature of BART facilitates its embedding within larger hierarchical models (for example, see Zhang, Shih and Mueller 2006).

In this work, we further extend the flexibility of the BART approach by relaxing the simple iid normal error specification and replacing it with a Dirichlet process model for the errors. Various specification and prior choices are explored. The costs as well as the benefits of this more flexible approach are illustrated.

 
Adaptive Multiple Importance Sampling: AMIS Algorithm
Antonietta Mira

The strength of AMIS resides in its completely adaptive and multi-purpose nature: no tuning parameter is needed and the same algorithm is shown to perform well on very diverse high-dimensional target distributions (from banana-shaped targets to mixtures with very well separated modes and tunnels).

The algorithm has both a temporal, T, and a population, N, dimension and consists of 3 steps: initialization, adaptation and clustering. The AMIS estimator is obtained by recycling the N × T particles generated in all 3 steps, with the corresponding importance weights. What drives AMIS, and ensures unbiasedness, is importance-sampling-type reasoning.

Variance reduction is achieved by Rao-Blackwellisation and by a novel way of combining different importance distributions by a deterministic mixture and an actualization process performed on the weights. As a byproduct of these processes, all particles are on the same “weighting scale” and can be easily and efficiently combined to get the final AMIS estimator.

In the first step of the algorithm a good scale parameter for the initial importance distribution is found using the effective sample size of the importance weights. Global adaptation of the mean and variance of the importance distribution to the corresponding target parameters (2nd step) is combined with local adaptation achieved via a Rao-Blackwellised clustering algorithm (3rd step).
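
The two weighting ideas mentioned above, the effective sample size of a set of importance weights and the weighting of particles against a deterministic mixture of several importance distributions, can be sketched in a few lines of Python. This is only a minimal illustration and not the AMIS algorithm itself: the one-dimensional normal proposals, the equal number of particles per proposal, and all function names are assumptions introduced here.

```python
import math

def effective_sample_size(log_weights):
    """Effective sample size (sum w)^2 / sum w^2, computed from log-weights."""
    m = max(log_weights)
    w = [math.exp(lw - m) for lw in log_weights]  # rescale to avoid overflow
    return sum(w) ** 2 / sum(wi * wi for wi in w)

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def deterministic_mixture_weight(x, target_pdf, proposals):
    """Importance weight of particle x against the equal-weight mixture of
    all proposals.

    `proposals` is a list of (mean, sd) pairs for hypothetical normal
    importance distributions. Every particle is weighted against the same
    mixture, whichever proposal generated it, which is what puts all
    particles on a common weighting scale.
    """
    mixture = sum(normal_pdf(x, m, s) for m, s in proposals) / len(proposals)
    return target_pdf(x) / mixture
```

With equal weights the effective sample size equals the number of particles, and it drops towards 1 as a single weight dominates, which is why it is a natural yardstick for choosing the scale of an initial importance distribution.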

In the talk the AMIS algorithm will be presented, together with examples of its application and performance evaluations/comparisons.

This is joint work with Christian Robert and Jean-Michel Marin.

 
Old and new auxiliary variable methods for Metropolis-Hastings algorithms for distributions with intractable normalizing constants
Jesper Møller and Robert Reeves

Suppose that we want to simulate from the posterior density for the parameter θ given data y, with prior p(θ) and likelihood fθ(y) = hθ(y)/cθ, where the normalizing constant cθ is intractable. Thus the posterior density

p(θ|y) ∝ p(θ)hθ(y)/cθ

is not computable. In an ordinary Metropolis-Hastings algorithm for drawing samples from the posterior distribution, the acceptance probability depends on the “unknown” ratio of normalizing constants cθ/cθ'. Most methods to date have used various approximations to estimate or eliminate such ratios of normalizing constants. In Møller et al. (Biometrika, 2006, pages 451-458) we present a new Metropolis-Hastings algorithm for drawing samples from the posterior distribution without approximation. It is called the auxiliary variable method, since we extend the posterior distribution by introducing a certain auxiliary variable so that the acceptance probability can be computed. The auxiliary variable method is a nice example of an application of perfect simulation algorithms, and it has been used, for example, for Bayesian analysis of Gibbs models (Markov random fields and Markov point processes). Moreover, the auxiliary variable method has more recently been modified and extended to more efficient MCMC algorithms.
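
As a hedged sketch of why the extension makes the acceptance probability computable, write x for the auxiliary variable, taken on the same space as y, with a user-chosen conditional density f(x | θ, y), and let q(θ' | θ) be the proposal density for θ; both symbols are notation introduced here for illustration. The extended target is p(θ, x | y) ∝ f(x | θ, y) p(θ) hθ(y)/cθ. If the proposed auxiliary variable x' is drawn exactly from hθ'(·)/cθ', for instance by perfect simulation, the Hastings ratio becomes

```latex
H(\theta', x' \mid \theta, x)
  = \frac{f(x' \mid \theta', y)\, p(\theta')\, h_{\theta'}(y)\, q(\theta \mid \theta')\, h_{\theta}(x)}
         {f(x \mid \theta, y)\, p(\theta)\, h_{\theta}(y)\, q(\theta' \mid \theta)\, h_{\theta'}(x')}
```

Both the numerator and the denominator carry the factor 1/(cθ cθ'), once from the target and once from the proposal for the auxiliary variable, so the intractable constants cancel and only the computable functions hθ remain.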

 
Bayesian Inference for High Dimensional Functional and Image Data using Functional Mixed Models
Jeffrey S. Morris

High-dimensional, irregular functional data are increasingly encountered in scientific research. For example, MALDI-MS yields proteomics data consisting of one-dimensional spectra with many peaks, 2D gel electrophoresis and LC-MS yield two-dimensional images with spots that correspond to peptides present in the sample, and array CGH or SNP chip arrays yield one-dimensional functions of copy number information along the genome. In this talk, I will discuss how to identify candidate biomarkers for various types of proteomic and genomic data using Bayesian wavelet-based functional mixed models. This approach models the functions in their entirety, and so avoids reliance on peak or spot detection methods. The flexibility of this framework in modeling nonparametric fixed and random effect functions enables it to model the effects of multiple factors simultaneously, allowing one to perform inference on multiple factors of interest using the same model fit, while adjusting for clinical or experimental covariates that may affect both the intensities and locations of the peaks and spots in the data. I will demonstrate how to identify regions of the functions that are differentially expressed across experimental conditions, in a way that takes both statistical and clinical significance into account and controls the Bayesian false discovery rate at a pre-specified level. Time allowing, I will also demonstrate how to use this framework as the basis for classifying future samples based on their proteomic and genomic profiles in a way that can also combine information across multiple sources of data, including proteomic, genomic, and clinical. These methods will be applied to a series of proteomic and genomic data sets from cancer-related studies.

 
Sensitivity of inference in Bayesian networks to assumptions about founders in forensic genetics
Peter Green and Julia Mortera

Bayesian networks, with inferences computed by probability propagation methods, offer an appealing practical modelling framework for structured systems involving discrete variables in numerous domains, including forensic genetics. However, when standard assumptions are violated (for example, when allele frequencies are unknown, there is identity by descent, or the population is heterogeneous), dependence is generated among founding genes, which makes exact calculation of conditional probabilities by propagation methods less straightforward. Here we illustrate different methodologies for dealing with these problems by assessing sensitivity to assumptions about founders in forensic genetics problems. These methods comprise constrained steepest descent, linear fractional programming and representing dependence by structure. We illustrate these methodologies on several real case-work forensic genetics examples comprising criminal identification, simple and complex disputed paternity and DNA mixtures.

 
A Bayesian approach to model a multi-state Markov model from interval-censored data
Paul J Mostert and Chris J.B. Muller

In studies of disease stages and their relation to survival, data are usually obtained at infrequent time points during follow-up. At these points, the clinical status of a patient can be assessed and, as a consequence, be distinctly categorised using other covariates and, in many cases, a subjective clinical classification by a medical researcher. In their simplest form these categories can be dead or alive, or can be extended to, for example, stage I, II, III or IV of HIV/AIDS by using clinical markers. Actual changes of clinical stage normally occur between two successive follow-up times. Not taking this censoring into consideration in model building may lead to severe over- or underestimation of the actual time spent in the different stages. The time patients stay in the different stages of a disease can be an indication of the effectiveness of a drug in stemming the spread of the disease. A Markov model is assumed to assess the rate at which patients move from one stage to another, given a set of covariates during follow-ups. A Bayesian approach is followed to model the transition states between actual time points, using a Dirichlet process. In this approach, the probability element corresponding to each patient’s contribution to the likelihood is altered according to a Dirichlet process prior. Different approaches to altering the probability contributions are proposed and compared by means of posterior analyses of the transition rates in a non-parametric setting. A paediatric HIV dataset obtained from a large academic hospital in South Africa is used to illustrate the results.

 
Bayesian Synthesis
Q. Yu, S.N. MacEachern and M. Peruggia

Bayesian model averaging enables one to combine the disparate predictions of a number of models in a coherent fashion, leading to superior predictive performance. The improvement in performance arises from averaging models that make different predictions. In this work, we tap into perhaps the biggest driver of different predictions—different analysts—in order to gain the full benefits of model averaging. In a standard implementation of our method, several data analysts work independently on portions of a data set, eliciting separate models which are eventually updated and combined through Bayesian synthesis. The methodology helps to alleviate concerns about the sizeable gap between the foundational underpinnings of the Bayesian paradigm and the practice of Bayesian statistics.

We provide theoretical results that characterize general conditions under which data-splitting results in improved estimation which, in turn, carries over to improved prediction. These results suggest general principles of good modeling practice. In experimental work we show that the method has predictive performance superior to that of many automatic modeling techniques, including AIC, BIC, Smoothing Splines, CART, Bagged CART, Bayes CART, BMA, BART and LARS. Compared to competing modeling methods, the data-splitting approach 1) exhibits superior predictive performance for real data sets and simulations; 2) makes more efficient use of human knowledge; 3) selects sparser models with better explanatory ability; and 4) avoids multiple uses of the data in the Bayesian framework.

 
Bayesian Inference of the surviving number of motor neurons for Motor Neuron Disease patients
A.N. Pettitt, P.G. Ridall and Clare McGrory

This talk describes the challenges of inference for the remaining number of motor neurons for sufferers of neurological diseases such as Motor Neuron Disease. In Ridall et al (2006, 2007) we describe a stochastic model for the firing of motor neurons in a muscle of the leg or arm when the muscle nerve is subject to an electrical stimulus. For a series of increasing electrical stimuli to the nerve, the neuro-muscular response is measured as a series of electrical currents, giving the so-called response curve, where the amplitude of the response current is the summation of the output currents of units which are firing as a result of the stimulus. Units can fire probabilistically, or always or never for a given input stimulus. The consequent response can be modelled (in a simplified form) by a so-called mixture of mixtures given by the distribution of Z1X1 + Z2X2 + . . . + ZNXN, where the Z are independent Bernoulli random variables with means depending on the applied stimulus and the X are independent normal random variables with differing means not dependent on the stimulus. The main focus is on inference for N, the unknown number of motor units or number of components in the mixture of mixtures. In Ridall et al (2007) we used RJMCMC to make inferences for N. This talk will consider approaches to improve the RJMCMC algorithm, how the RJMCMC output from a sequence of studies on a patient can be used to make inferences about the nature of the underlying mechanism of neuron death and score between mechanisms, and how sequential Monte Carlo for static problems can be utilised to estimate the value of the unknown N.
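
The simplified response model above, Z1X1 + . . . + ZNXN with Bernoulli Z and normal X, is easy to simulate forward, which may help fix ideas before turning to inference for N. The sketch below is an illustration only: the function name and the firing probabilities, means and standard deviations passed to it are hypothetical, and it says nothing about how the firing probabilities depend on the stimulus.

```python
import random

def simulate_response(fire_probs, unit_means, unit_sds, rng):
    """Draw one response Z1*X1 + ... + ZN*XN for a single stimulus level.

    fire_probs[i] is the probability that unit i fires at this stimulus
    (the mean of Z_i); unit_means[i] and unit_sds[i] describe the normal
    output current X_i of unit i.
    """
    total = 0.0
    for p, mu, sd in zip(fire_probs, unit_means, unit_sds):
        fires = rng.random() < p            # Z_i ~ Bernoulli(p)
        if fires:
            total += rng.gauss(mu, sd)      # X_i ~ Normal(mu, sd^2)
    return total

# Hypothetical example: three units, the first always firing.
rng = random.Random(0)
draws = [simulate_response([1.0, 0.5, 0.1], [2.0, 3.0, 4.0], [0.2, 0.2, 0.2], rng)
         for _ in range(1000)]
```

A histogram of `draws` would show one mode per subset of firing units, which is the sense in which the response distribution is a mixture whose number of components grows with N.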

 
Particle Filtering and Learning
Nick Polson

Particle filtering and learning algorithms for general state space models are developed. Our approach exactly samples from a particle approximation to the joint posterior distribution of both parameters and latent states. We illustrate the efficiency of our approach in a number of models: robust filtering including quantile models, t-errors, Cauchy and Meridian errors; stochastic volatility jump diffusions. Robustness in both the observation and state equation is easily accommodated together with parameter learning. An application to stochastic volatility jump models for stock index return data is described.

 
Structured AR multi-processes for detecting cognitive fatigue from multiple brain signals
Raquel Prado

Mental fatigue is one of the main causes of human performance failures, leading to accidents in vehicle operation, air traffic control and space missions. Therefore, automatic detection of early signs of mental fatigue is key for increasing safety and human performance in many scenarios.

Electroencephalograms (EEGs) are considered the most informative signals for monitoring mental fatigue among several other physiological and behavioral measures available. We analyze multiple EEG signals recorded in subjects who performed continuous mental arithmetic for a long period of time, which led to severe cognitive fatigue. In particular, we analyze the signals using a multi-process approach in which each of the processes is an autoregression. We impose structured prior distributions that take into account the latent components underlying each autoregressive process. These priors allow us to incorporate relevant information about the components that may characterize various mental states of alertness. We discuss issues related to on-line filtering and automatic detection of fatigue from multi-channel data.

 
A Bayesian nonparametric approach for analysing and testing clustering structures
Antonio Lijoi, Ramsés H. Mena, Igor Prünster and Stephen G. Walker

Many applications require a deep understanding of the clustering mechanism that generates the observed data. The two parameter Poisson–Dirichlet process and more general Gibbs–type priors are natural candidates for modelling data arising from discrete distributions. Here we make use of such priors and analyze their posterior behaviour in some detail. In particular, we propose methods for prediction and testing in order to assess the clustering structure featured by the data. The methodology is then applied to Expressed Sequence Tags (ESTs) data in genomics. Indeed, when studying EST data one is typically interested in evaluating the redundancy of the corresponding cDNA library and in comparing different libraries on the basis of their ability to generate new distinct genes. Our proposal has appealing properties over frequentist nonparametric methods, which become unstable when prediction is required for large future samples.
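
For readers unfamiliar with the two-parameter Poisson–Dirichlet process, its clustering behaviour can be simulated from the standard sequential predictive rule: after i observations falling into k clusters, the next observation starts a new cluster with probability (θ + σk)/(θ + i) and otherwise joins cluster j with probability proportional to (n_j − σ). The sketch below illustrates only this prior predictive scheme, not the authors' posterior methodology; the parameter names theta and sigma and the function name are choices made here.

```python
import random

def sample_cluster_sizes(n, theta, sigma, rng):
    """Simulate cluster sizes for n observations under the two-parameter
    Poisson-Dirichlet predictive rule (requires 0 <= sigma < 1 and
    theta > -sigma)."""
    sizes = []
    for i in range(n):                      # i observations seen so far
        k = len(sizes)
        # The first observation always opens a cluster.
        p_new = 1.0 if i == 0 else (theta + sigma * k) / (theta + i)
        if rng.random() < p_new:
            sizes.append(1)
        else:
            # Join cluster j with probability proportional to (n_j - sigma).
            weights = [nj - sigma for nj in sizes]
            j = rng.choices(range(k), weights=weights)[0]
            sizes[j] += 1
    return sizes

rng = random.Random(0)
sizes = sample_cluster_sizes(500, theta=1.0, sigma=0.5, rng=rng)
```

In the EST setting the number of clusters `len(sizes)` plays the role of the number of distinct genes observed, and larger sigma makes new clusters keep appearing at a faster rate as the sample grows.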

 
Bayesian analysis of high dimensional data
Sylvia Richardson, Leonardo Bottolo, Peter Green

In parallel with fast-evolving technologies that give rise to high-dimensional data in many set-ups, there is a lot of interest in searching for sparse structure in such high-dimensional data sets. In the first part of the talk, I shall discuss models and algorithms based on parallel tempering and Evolutionary Monte Carlo for Bayesian variable selection in the large p, small n paradigm. In the second part, I shall focus attention on dimension reduction through sparse latent factor models. Models and methods will be illustrated by examples from the field of genomics.

 
Non-parametric dynamic modelling of biological time series
Fabio Rigat

This talk illustrates the theory and application of a novel sequential non-parametric method for estimating the dynamics of time series models. This method provides a robust alternative to Bayesian state-space models which does not involve any parametric assumption on the form of the evolution of a model’s parameters. Their dynamics are assessed within a hypothesis testing framework as a change-point problem. The Kullback-Leibler divergence between the posterior distributions of different sets of data under the same model is proposed as a test statistic. Posterior simulation is used to approximate the value of the KL divergence and its critical region under the null hypothesis of no change.
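
To make the idea of approximating a Kullback-Leibler divergence by posterior simulation concrete, the sketch below compares a Monte Carlo estimate with the closed form in the one case where both are easy: two normal distributions standing in for the posteriors from two sets of data. The normal stand-ins and the function names are assumptions made for illustration; the talk's models and the calibration of the critical region are not reproduced here.

```python
import math
import random

def normal_logpdf(x, mean, sd):
    """Log-density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)

def kl_normal(m1, s1, m2, s2):
    """Closed-form KL( N(m1, s1^2) || N(m2, s2^2) )."""
    return math.log(s2 / s1) + (s1 ** 2 + (m1 - m2) ** 2) / (2.0 * s2 ** 2) - 0.5

def kl_monte_carlo(m1, s1, m2, s2, n_draws, rng):
    """Estimate the same divergence by averaging log p1(x) - log p2(x)
    over draws x from p1."""
    total = 0.0
    for _ in range(n_draws):
        x = rng.gauss(m1, s1)
        total += normal_logpdf(x, m1, s1) - normal_logpdf(x, m2, s2)
    return total / n_draws
```

When the two posteriors coincide the divergence is zero, so large values of the statistic are evidence of a change between the two sets of data.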

The main motivation of this work is to estimate the molecular and functional dynamics of biological systems using high-throughput technologies such as microarrays and multi-electrode arrays. In this context, robust dynamic stochastic models are fundamental tools because little is known about the mechanisms regulating the evolution of many biological processes. This talk focuses mainly on the estimation of neuronal functional dynamics using multiple spike trains recorded in vitro and in vivo. The neuronal dynamics are shown to explain some aspects of a simple decision process and the onset of movements.

 
Multivariate emulation of high-dimensional model output
Jonathan Rougier

“Emulation” is the statistical modelling of a complex deterministic function, usually a computer code. When the code simulates a physical process, the outputs are often high-dimensional, taking the form of collections of values of the same type (e.g., sea-surface temperatures indexed by space and time). Typically, a given collection is smooth, and its components ought to be modelled jointly. However, assimilating the large amount of data (the product of the number of model evaluations and the number of outputs) is computationally challenging. A new approach, the “outer-product emulator”, solves this problem. I describe the outer-product emulator, and illustrate its use with a climate model.

 
Combining expert opinions in prior elicitation
Judith Rousseau

We consider the problem of combining expert opinions when eliciting a prior distribution. The idea is to use a hierarchical parametric model: each expert has his or her own hyperparameter, and the hyperparameters are linked through a parametric model constructed from extra knowledge about the experts and their opinions. Two examples are considered.

 
Discrete multivariate mixture distributions for spatial models
Alexandra M. Schmidt and Jennifer A. Hoeting

In this talk we propose models for multivariate count data which are spatially correlated and exhibit over-dispersion. In other words, the model must capture the covariance structure within and among locations. There are different ways, in the literature, of defining a multivariate Poisson distribution. We discuss these different approaches and consider two situations: a multivariate Poisson hierarchical model with spatial random effects, and a multivariate negative binomial distribution, which can also be defined in different ways, based on the multivariate Poisson. Inference is performed under the Bayesian paradigm. As the posterior distribution does not have a closed form, MCMC techniques are used to obtain a sample from the posterior. We discuss the properties of our proposed models on artificial data sets and also on a real application.

 
Think locally, act globally: Combining the best features of particle methods and MCMC
Michael Johannes, Nicholas Polson, and Steven L. Scott

This talk describes an algorithm for simulating from an arbitrary probability distribution π by combining features from Markov chain Monte Carlo (MCMC) and particle-based methods. MCMC explores the sample space of π by a series of local moves. Particle methods simulate from π by first sampling many particles from a proposal distribution, then resampling the particles with appropriately chosen weights. Particle methods are limited by the need to choose an appropriate proposal distribution, but they have attractive global search properties that MCMC lacks (they have no difficulty locating multiple modes, for example).
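
The particle step described above (draw from a proposal, then resample with weights) corresponds, in its most basic form, to sampling importance resampling with weights proportional to target over proposal. The sketch below shows only that generic building block under this assumption; it is not the Particle Posterior Sampling algorithm of the talk, and the function and argument names are hypothetical.

```python
import math
import random

def sir_sample(log_target, proposal_draw, proposal_logpdf, n_particles, rng):
    """Sampling importance resampling.

    Draws particles from the proposal, weights each one by
    target / proposal (the target may be unnormalised), and resamples
    with replacement in proportion to those weights.
    """
    particles = [proposal_draw(rng) for _ in range(n_particles)]
    log_w = [log_target(x) - proposal_logpdf(x) for x in particles]
    m = max(log_w)
    weights = [math.exp(lw - m) for lw in log_w]   # stabilised weights
    return rng.choices(particles, weights=weights, k=n_particles)
```

The dependence on the proposal is visible directly in the weights: a proposal that rarely visits the high-probability region of the target leaves a few particles carrying nearly all the weight.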

Our algorithm, which we call Particle Posterior Sampling, runs n Markov chains in parallel. The state of each Markov chain at time t can be thought of as a particle, and the collection of particles defines an empirical distribution which can be resampled. The result is an algorithm where particles move quickly to high probability regions, then use MCMC’s local search capabilities to quickly explore those regions. The resulting trade-off is that if n is large then t can be kept small (e.g. single digits). The worst-case behavior of the algorithm is that of n parallel Markov chains with a common starting point near the mode.

We illustrate the algorithm on several canonical problems where MCMC is known to struggle. These include carefully chosen examples from probit regression and Gaussian linear models with “spike and slab” prior distributions, as well as finite mixture models with an unknown number of components.

 
Improving the efficiency of likelihood-free computation
Scott A. Sisson

In recent years there has been considerable interest in Bayesian applications where the likelihood function is computationally intractable. Most of these applications, and therefore the methods development, have occurred outside of mainstream Statistics publications, primarily in population genetics and epidemiology.

In this presentation we outline the basic idea behind “likelihood-free” Bayesian computation. Following this setup we demonstrate that while certain method specifications are arbitrary in theory (in that the correct model is realised asymptotically regardless of their specification), in practice they can have an overwhelming influence on the efficiency of the computation. We propose simple methods to automate model setup and improve the efficiency of its implementation, and illustrate them through real analyses.
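
The basic idea behind likelihood-free computation is often introduced through a rejection sampler: draw a parameter from the prior, simulate data from the model, and keep the parameter if a summary of the simulated data is close to the summary of the observed data. The sketch below assumes that simple rejection form; the function name, the scalar summary statistic and the tolerance are illustrative choices, and they are precisely the kind of specification whose choice affects efficiency in practice.

```python
import random

def abc_rejection(observed, prior_draw, simulate, summary, tolerance, n_sims, rng):
    """Likelihood-free rejection sampler.

    Keeps each prior draw theta whose simulated data set has a summary
    within `tolerance` of the observed summary. No likelihood is evaluated.
    """
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_sims):
        theta = prior_draw(rng)
        fake = simulate(theta, len(observed), rng)
        if abs(summary(fake) - s_obs) <= tolerance:
            accepted.append(theta)
    return accepted
```

Shrinking the tolerance brings the accepted draws closer to the exact posterior given the summary, but at the cost of a lower acceptance rate, which is one concrete form of the efficiency trade-off discussed above.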

 
Multivariate GARCH Models with Correlation Clustering
Mike K.P. So and Iris W.H. Yip

This paper proposes a clustered correlation multivariate GARCH model (CC-MGARCH) which allows the conditional correlations to form clusters where each cluster follows the same dynamic structure. One main feature of our model is to form a natural grouping of the correlations among the series while generalizing the time-varying correlation structure proposed by Tse and Tsui (2002). To estimate our proposed model, we adopt Markov Chain Monte Carlo methods. Forecasts of volatility and value at risk can be generated from the predictive distributions. The proposed methodology is illustrated using simulated and financial market data.

 
Bayesian Computation, Non-Linear Dynamic Models & Cellular Networks
Mike West, Jarad Niemi, Lingchong You and Chee-Meng Tan

The development of effective methods of Bayesian computation for inference in non-linear dynamic models remains a challenge, and one of growing importance in areas such as finance and systems biology. This talk will discuss advances in developing and applying Metropolis methods in such problems, where inference involves high-dimensional latent states as well as hyper-parameters. Research in single cell systems biology, where non-linear dynamic models of cellular networks arise and are copied over thousands of cells, provides motivating examples and context.