class: center, middle, inverse, title-slide

# STA 360/602L: Module 7.4

## Metropolis within Gibbs

### Dr. Olanrewaju Michael Akande

---

## Combining Metropolis and Gibbs

- It is often the case that full conditionals are available for some parameters but not all.

--

- A very useful trick is to combine Gibbs and Metropolis.

--

- We will illustrate this by analyzing time series data on global warming.

---

## Carbon dioxide and temperature

- Data are from an analysis of ice cores from East Antarctica.

--

- Temperature (recorded as the difference from the current temperature, in degrees `\(C\)`) and `\(\text{CO}_2\)` (measured in ppm by volume) are standardized to have mean `\(0\)` and variance `\(1\)`.

--

- `\(200\)` values, each roughly `\(2000\)` years apart.

--

- `\(\text{CO}_2\)` values are matched with temperature values roughly `\(1000\)` years later.

---

## Data

<img src="7-4-metropolis-within-gibbs_files/figure-html/unnamed-chunk-2-1.png" style="display: block; margin: auto;" />

`\(\text{CO}_2\)` and temperature follow similar patterns over time.

---

## Inference

- Interest lies in predicting temperature as a function of `\(\text{CO}_2\)`.

--

- In these data, the error terms are temporally correlated, so a reasonable model for temperature is
.block[
`$$\boldsymbol{Y} \sim \mathcal{N}_n(\boldsymbol{X}\boldsymbol{\beta}, \Sigma),$$`
]

--

where `\(\boldsymbol{X}\)` contains a column for the intercept plus a column for `\(\text{CO}_2\)`, and `\(\Sigma\)` has a first-order autoregressive structure, so that
.small[
$$
\Sigma = \sigma^2 \boldsymbol{C}_\rho = \sigma^2
`\begin{bmatrix}
1 & \rho & \rho^2 & \ldots & \rho^{n-1} \\
\rho & 1 & \rho & \ldots & \rho^{n-2} \\
\rho^2 & \rho & 1 & \ldots & \rho^{n-3} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\rho^{n-1} & \rho^{n-2} & \rho^{n-3} & \ldots & 1 \\
\end{bmatrix}`
$$
]

--

- The covariance model assumes constant variance but a correlation that decreases as the time between temperature measurements increases.

---

## Posterior inference

- We need to specify prior distributions for `\(\boldsymbol{\beta}\)`, `\(\sigma^2\)` and `\(\rho\)`.

--

- If we assume
.block[
`$$\pi(\boldsymbol{\beta}) = \mathcal{N}_p(\boldsymbol{\mu}_0, \Lambda_0),$$`
]

--

then
.block[
$$
`\begin{split}
\pi(\boldsymbol{\beta} | \boldsymbol{y}, \boldsymbol{X}, \sigma^2, \rho) & = \ \mathcal{N}_p(\boldsymbol{\mu}_n, \Lambda_n),\\
\end{split}`
$$
]

--

where
.block[
$$
`\begin{split}
\Lambda_n & = \left[\Lambda_0^{-1} + \frac{1}{\sigma^2} \boldsymbol{X}^T \boldsymbol{C}_\rho^{-1}\boldsymbol{X} \right]^{-1}\\
\\
\boldsymbol{\mu}_n & = \Lambda_n \left[\Lambda_0^{-1}\boldsymbol{\mu}_0 + \frac{1}{\sigma^2} \boldsymbol{X}^T \boldsymbol{C}_\rho^{-1}\boldsymbol{y} \right].
\end{split}`
$$
]

---

## Posterior inference

- If we assume
.block[
`$$\pi(\sigma^2) = \mathcal{IG} \left(\frac{\nu_0}{2}, \frac{\nu_0\sigma_0^2}{2}\right),$$`
]

--

then
.block[
$$
`\begin{split}
\pi(\sigma^2 | \boldsymbol{y}, \boldsymbol{X}, \boldsymbol{\beta}, \rho) & = \mathcal{IG} \left(\dfrac{\nu_n}{2}, \dfrac{\nu_n\sigma_n^2}{2}\right), \\
\end{split}`
$$
]

--

where
.block[
$$
`\begin{split}
\nu_n & = \nu_0 + n\\
\\
\sigma_n^2 & = \dfrac{1}{\nu_n} \left[ \nu_0 \sigma_0^2 + (\boldsymbol{y} - \boldsymbol{X}\boldsymbol{\beta})^T \boldsymbol{C}_\rho^{-1} (\boldsymbol{y} - \boldsymbol{X}\boldsymbol{\beta}) \right] = \dfrac{1}{\nu_n} \left[ \nu_0 \sigma_0^2 + \text{SSR}(\boldsymbol{\beta},\rho) \right].
\end{split}`
$$
]

--

- Therefore, given `\(\rho\)`, we can use Gibbs sampling to cycle through the full conditionals for `\(\boldsymbol{\beta}\)` and `\(\sigma^2\)`.

---

## Posterior inference

- Next, we need a prior for the correlation `\(\rho\)`. There is no semi-conjugate option here.

--

- Since we expect `\(\rho\)` to be positive, we could use `\(\pi(\rho) = \text{Unif}(0,1)\)`.

--

- Unfortunately, this does not lead to a standard full conditional.

--

- However, we can use Metropolis-Hastings for the resulting full conditional for `\(\rho\)`. In fact, if we can come up with a symmetric proposal for `\(\rho\)`, we can just use the Metropolis algorithm.

--

- So, technically, we have a Gibbs sampler, since we will cycle through full conditionals. However, the sampling step for `\(\rho\)` will rely on Metropolis.

--

- Therefore, we have a .hlight[Metropolis within Gibbs] sampler.

---

## Posterior inference

- Update for `\(\rho\)` (Metropolis) at iteration `\((s+1)\)`:

  1. Generate a _candidate_ value `\(\rho^\star \sim \text{Unif}(\rho^{(s)} - \delta, \rho^{(s)} + \delta)\)`. If `\(\rho^\star < 0\)`, reassign it as `\(|\rho^\star|\)`. If `\(\rho^\star > 1\)`, reassign it as `\(2-\rho^\star\)`. *I leave the proof that this "reflecting random walk" is symmetric to you.*

--

  2. Compute the acceptance ratio
.block[
.small[
$$
`\begin{split}
r & = \frac{p(\boldsymbol{y} | \boldsymbol{X}, \boldsymbol{\beta}^{(s+1)}, \sigma^{2(s+1)}, \rho^\star) \cdot \pi(\rho^\star)}{p(\boldsymbol{y} | \boldsymbol{X}, \boldsymbol{\beta}^{(s+1)}, \sigma^{2(s+1)}, \rho^{(s)}) \cdot \pi(\rho^{(s)})}.
\end{split}`
$$
]
]

--

  3. Sample `\(u \sim \text{Unif}(0,1)\)` independently and set
.block[
.small[
`\begin{eqnarray*}
\rho^{(s+1)} = \left\{
\begin{array}{ll}
\rho^\star & \quad \text{if} \quad u < r \\
\rho^{(s)} & \quad \text{otherwise} \\
\end{array}
\right. .
\end{eqnarray*}`
]
]

--

- So, for each iteration, we first sample from the full conditionals for `\(\boldsymbol{\beta}\)` and `\(\sigma^2\)`, and then use this step to update `\(\rho\)`; a sketch of one full sampler follows on the next slide.
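---

## A minimal code sketch

Below is one way the sampler *could* be written in R; it is a minimal sketch, not the course's implementation (that is in `IceCore.R`, linked on the next slide). The data `y` and `x` here are simulated stand-ins for the standardized temperature and `\(\text{CO}_2\)` series, and the prior settings and step size `delta` are illustrative choices. `mvtnorm::dmvnorm` evaluates the multivariate normal likelihood.

```r
## Minimal sketch only; the course implementation is in IceCore.R.
## y and x are simulated stand-ins for the standardized data.
set.seed(360)
n <- 200
x <- rnorm(n)                       # stand-in for standardized CO2
X <- cbind(1, x); p <- ncol(X)      # intercept + CO2 columns
y <- 0.5 * x + rnorm(n)             # stand-in for standardized temperature

mu0 <- rep(0, p); iLambda0 <- solve(diag(1000, p))  # prior on beta
nu0 <- 1; s20 <- 1                                  # prior on sigma^2

## AR(1) correlation matrix C_rho: (i,j) entry is rho^|i-j|
C.rho <- function(rho, n) rho^abs(outer(1:n, 1:n, "-"))

S <- 5000; delta <- 0.1             # iterations; Metropolis step size
beta <- rep(0, p); s2 <- 1; rho <- 0.5
draws <- matrix(NA, S, p + 2)

for (s in 1:S) {
  iC <- solve(C.rho(rho, n))

  ## 1. beta | y, X, sigma^2, rho  ~  N(mu_n, Lambda_n)
  Lambda.n <- solve(iLambda0 + t(X) %*% iC %*% X / s2)
  mu.n <- Lambda.n %*% (iLambda0 %*% mu0 + t(X) %*% iC %*% y / s2)
  beta <- as.vector(mu.n + t(chol(Lambda.n)) %*% rnorm(p))

  ## 2. sigma^2 | y, X, beta, rho  ~  IG(nu_n/2, (nu0*s20 + SSR)/2)
  res <- y - X %*% beta
  SSR <- as.numeric(t(res) %*% iC %*% res)
  s2 <- 1 / rgamma(1, (nu0 + n) / 2, (nu0 * s20 + SSR) / 2)

  ## 3. rho: reflecting random-walk Metropolis on (0, 1)
  rho.star <- abs(runif(1, rho - delta, rho + delta))
  if (rho.star > 1) rho.star <- 2 - rho.star
  ## log acceptance ratio; the Unif(0,1) prior terms cancel
  log.r <- mvtnorm::dmvnorm(y, as.vector(X %*% beta),
                            s2 * C.rho(rho.star, n), log = TRUE) -
           mvtnorm::dmvnorm(y, as.vector(X %*% beta),
                            s2 * C.rho(rho, n), log = TRUE)
  if (log(runif(1)) < log.r) rho <- rho.star

  draws[s, ] <- c(beta, s2, rho)  # posterior draws of (beta, sigma^2, rho)
}
```

Working with `\(\log r\)` avoids numerical underflow in the `\(n\)`-dimensional normal density, and since the `\(\text{Unif}(0,1)\)` prior terms cancel in the ratio, only the likelihoods appear.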
---

class: center, middle

# Move to the R script [here](https://sta-360-602l-su20.github.io/Course-Website/slides/IceCore.R).

---

class: center, middle

# What's next?

### Move on to the readings for the next module!