In hierarchical Bayesian models, there's often a high degree of correlation between parameters at different levels. For example, if I have the following hierarchical model:

\[
y_{ij} \mid \eta_j \sim \mathrm{N}(\eta_j,\, \sigma^2_y), \qquad
\eta_j \mid \beta \sim \mathrm{N}(x_j \beta,\, \sigma^2_\eta),
\]

then \(\eta_j\) and \(\beta\) might be strongly correlated in the posterior. The correlation, of course, makes MCMC sampling inefficient. This parameterization is often called 'centered' because the distribution of \(\eta_j\) is said to be 'centered' over \(x_j \beta\). A common solution is to 'un-center' the model. A typical reparameterization would be:

\[
\tilde\eta_j \sim \mathrm{N}(0,\, \sigma^2_\eta), \qquad
y_{ij} \mid \tilde\eta_j, \beta \sim \mathrm{N}(x_j \beta + \tilde\eta_j,\, \sigma^2_y),
\]

so that \(\eta_j = x_j \beta + \tilde\eta_j\).

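To make the reparameterization concrete, here's a tiny sketch (the values of \(x_j\), \(\beta\), and \(\sigma_\eta\) are made up for illustration) showing that the two parameterizations generate \(\eta_j\) the same way:

```python
import numpy as np

rng = np.random.default_rng(1)
x_j, beta, sigma_eta = 2.0, 0.5, 1.0  # illustrative values only

# Centered: eta_j is drawn directly around x_j * beta.
eta_centered = rng.normal(x_j * beta, sigma_eta)

# Un-centered: draw the zero-mean offset, then shift it.
eta_tilde = rng.normal(0.0, sigma_eta)
eta_uncentered = x_j * beta + eta_tilde
```
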
Sometimes the un-centered version of the model samples better than the centered one, but that's not always the case. A nice paper by Yaming Yu and Xiao-Li Meng ("To Center or Not to Center: That Is Not the Question", JCGS, 2011) addresses this problem. Their method builds on the intuition that we might do well to sample from both parameterizations of the model; that is, to alternately draw \(\eta_j\) from the centered and un-centered models within each MCMC iteration:

1. Draw \(\eta_j \mid \beta, y\) from the centered model.
2. Draw \(\beta \mid \eta\).
3. Draw \(\tilde\eta_j \mid \beta, y\) from the un-centered model.
4. Draw \(\beta \mid \tilde\eta\), and set \(\eta_j = x_j \beta + \tilde\eta_j\).

Yu and Meng's method combines the second and third steps above, so that we have, at each iteration, the following procedure:

1. Draw \(\eta_j \mid \beta, y\) (the centered step).
2. Draw \(\beta \mid \eta\), then draw \(\tilde\eta_j \mid \eta_j, \beta\) (the combined step).
3. Draw \(\beta \mid \tilde\eta, y\), and set \(\eta_j = x_j \beta + \tilde\eta_j\) (the un-centered step).

What's nice (i.e., convenient) is that the distribution \(\tilde\eta_j \mid \eta_j, \beta\) is usually degenerate, so the combined step costs essentially nothing. In this example, \(\tilde\eta_j = \eta_j - x_j \beta\) with probability one.
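
Here's a minimal sketch of one interweaved iteration for the normal-normal model above. It assumes the variances \(\sigma^2_y\) and \(\sigma^2_\eta\) are known and that \(\beta\) has a flat prior, so every conditional is a conjugate normal draw; the function and variable names (`interweaved_iteration`, `ybar`, and so on) are mine, not Yu and Meng's.

```python
import numpy as np

def interweaved_iteration(beta, x, ybar, n, sigma2_y, sigma2_eta, rng):
    """One Gibbs iteration interweaving the centered and un-centered
    parameterizations. x, ybar, n are length-J arrays holding the
    covariate, sample mean, and sample size for each group j."""
    # 1. Centered step: draw eta_j | beta, y (conjugate normal update,
    #    combining the N(x_j * beta, sigma2_eta) prior with ybar_j).
    prec = n / sigma2_y + 1.0 / sigma2_eta
    mean = (n * ybar / sigma2_y + x * beta / sigma2_eta) / prec
    eta = rng.normal(mean, np.sqrt(1.0 / prec))

    # 2. Combined step: draw beta | eta (a regression of eta on x), then
    #    the degenerate draw eta_tilde | eta, beta, i.e. eta - x * beta.
    sxx = np.sum(x**2)
    beta = rng.normal(np.sum(x * eta) / sxx, np.sqrt(sigma2_eta / sxx))
    eta_tilde = eta - x * beta

    # 3. Un-centered step: draw beta | eta_tilde, y (a regression of
    #    ybar_j - eta_tilde_j on x_j, weighted by n_j), then map back.
    w = np.sum(n * x**2)
    beta = rng.normal(np.sum(n * x * (ybar - eta_tilde)) / w,
                      np.sqrt(sigma2_y / w))
    eta = eta_tilde + x * beta
    return beta, eta
```

Running this for a few thousand iterations and comparing the autocorrelation of the \(\beta\) draws against a purely centered (or purely un-centered) Gibbs sampler is an easy way to see the efficiency gain.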

I've implemented their method and it seems to do a good job breaking some of the correlation between parameters. Their paper includes a number of theoretical results showing how awesome this procedure is (e.g., it's never worse than the less efficient of the two parameterizations, and it can converge in cases where neither parameterization by itself will).

Update: I continue to be amazed at this technique. I plan to use this as my default sampling scheme for hierarchical models.
