# Conditional Autoregressive Models

Tags :: Spatial Statistics

Consider a random process \[ \textbf{Y} = (Y(s_1), \dots, Y(s_n)) \] with joint probability distribution \(p(Y(s_1), \dots, Y(s_n))\), whose spatial association is described by the covariance matrix \(\text{var}(\textbf{Y})\).

The spatial dependence can also be specified **conditionally**, i.e. the conditional distribution is specified at each site \(i\) given *all* other sites:
\[
p(Y(s_i)|\{Y(s_j):j\neq i\}), \quad i = 1, \dots, n
\]
Letting \(\textbf{Y}_{-i} \equiv \{Y(s_j):j\neq i\}\), we rewrite the conditional distribution as
\[
p(Y(s_i)|\textbf{Y}_{-i})
\]

When we only want to condition on the subset \(\mathcal{N}_i\) containing the neighbors of site \(i\), we write the conditional distribution as \[ p(Y(s_i)|\{Y(s_j):j \in \mathcal{N}_i \}), \quad i = 1, \dots, n \]

The notion that, when conditioned on all other locations, a site depends only on its *neighbors* is a **Markovian assumption**. Models making this assumption are generally referred to as **Markov Random Field (MRF)** models. The conditional, neighborhood-specific distributions must be specified consistently so that the joint distribution is well-defined, which is not easy. The Hammersley-Clifford theorem addresses this specification and gives the form that the joint probability function of an MRF must take.

The **Conditional Autoregressive (CAR) model** typically refers to the Gaussian case, which satisfies the theorem. In the Gaussian case, the conditional mean and variance are given by
\[
\text{E}(Y(s_i)|\textbf{Y}_{-i}) = x'(s_i)\beta + \sum_{j=1}^n c_{ij}(Y(s_j) - x'(s_j)\beta)
\]
\[
\text{var}(Y(s_i)|\textbf{Y}_{-i}) = \sigma_i^2, \quad i = 1, \dots, n
\]
where \(c_{ij}\) are the spatial dependence parameters and \(\sigma_i^2\) are the conditional variances. Further, \(c_{ij}\) must satisfy the following requirements: \(c_{ij} = 0\) if \(s_j \not\in \mathcal{N}_i\), \(c_{ii} = 0\), and \(c_{ij}\sigma_j^2 = c_{ji}\sigma_i^2\).

In plain English, the conditional mean reads as follows: the expected value of the process at site \(i\), \(Y(s_i)\), conditioned on all other sites, \(\textbf{Y}_{-i}\), equals the covariates at site \(i\) times the change in mean response \(\beta\), plus a correction computed from the spatial dependence on the other sites, namely the \(c_{ij}\)-weighted deviations of those sites from their own means.
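As a rough numerical illustration of the conditional mean, the sketch below evaluates it for three sites arranged on a line. The function name `conditional_mean` and all values in `C`, `mu`, and `y` are assumptions invented for this example; `mu[j]` stands in for \(x'(s_j)\beta\) and is taken to be constant across sites.

```python
def conditional_mean(i, y, mu, C):
    """E[Y(s_i) | Y_{-i}] = mu_i + sum_j c_ij * (y_j - mu_j)."""
    return mu[i] + sum(
        C[i][j] * (y[j] - mu[j]) for j in range(len(y)) if j != i
    )


# Three sites on a line, 0 - 1 - 2, so site 1 neighbors both ends.
# c_ij = 0 for non-neighbors and on the diagonal (hypothetical values).
C = [
    [0.0, 0.4, 0.0],
    [0.4, 0.0, 0.4],
    [0.0, 0.4, 0.0],
]
mu = [10.0, 10.0, 10.0]  # assumed trend x'(s_j) beta at each site
y = [12.0, 9.0, 11.0]    # assumed observed values

# Site 1 is pulled above its trend because both neighbors sit above theirs.
m1 = conditional_mean(1, y, mu, C)
# Site 0 is pulled below its trend because its only neighbor sits below.
m0 = conditional_mean(0, y, mu, C)
print(m1, m0)
```

With equal conditional variances, the symmetric `C` used here satisfies \(c_{ij}\sigma_j^2 = c_{ji}\sigma_i^2\).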

Under these conditions on \(c_{ij}\), the implied joint distribution of \(\textbf{Y}\) is \[ \textbf{Y} \sim N(\textbf{X}\beta, (\textbf{I} - \textbf{C})^{-1}\Sigma) \] where \(\textbf{C} = (c_{ij})\) is an \(n \times n\) matrix and \[ \Sigma = \begin{pmatrix} \sigma_1^2 & & 0\\ & \ddots & \\ 0 & & \sigma_n^2 \end{pmatrix} \] For this to be a valid distribution, \(\textbf{I} - \textbf{C}\) must be invertible and \((\textbf{I} - \textbf{C})^{-1}\Sigma\) must be symmetric and positive definite; the constraint \(c_{ij}\sigma_j^2 = c_{ji}\sigma_i^2\) guarantees the symmetry.
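One way to check a candidate \(\textbf{C}\) numerically is to work with the inverse of the joint covariance, \(\Sigma^{-1}(\textbf{I} - \textbf{C})\), which needs no matrix inversion. The sketch below is a minimal pure-Python version under that assumption: the helper names (`precision`, `is_symmetric`, `is_positive_definite`) and all numbers are hypothetical, and positive definiteness is tested by attempting a Cholesky factorization.

```python
import math


def precision(C, sigma2):
    """Inverse covariance Sigma^{-1} (I - C), with Sigma = diag(sigma2)."""
    n = len(C)
    return [
        [((1.0 if i == j else 0.0) - C[i][j]) / sigma2[i] for j in range(n)]
        for i in range(n)
    ]


def is_symmetric(Q, tol=1e-12):
    """True if Q equals its transpose up to tol."""
    return all(
        abs(Q[i][j] - Q[j][i]) <= tol
        for i in range(len(Q)) for j in range(i)
    )


def is_positive_definite(Q):
    """Attempt a Cholesky factorization of a symmetric Q; fail on a bad pivot."""
    n = len(Q)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = Q[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True


# Three sites on a line with unequal conditional variances (assumed values).
sigma2 = [1.0, 0.5, 1.0]
# Chosen so that c_ij * sigma_j^2 == c_ji * sigma_i^2 for every pair.
C = [
    [0.0, 0.6, 0.0],
    [0.3, 0.0, 0.3],
    [0.0, 0.6, 0.0],
]
Q = precision(C, sigma2)

# Same C with equal variances: the pairwise constraint no longer holds.
Q_mismatch = precision(C, [1.0, 1.0, 1.0])

# Symmetric but overly strong dependence: symmetry holds, definiteness fails.
C_strong = [
    [0.0, 0.8, 0.0],
    [0.8, 0.0, 0.8],
    [0.0, 0.8, 0.0],
]
Q_strong = precision(C_strong, [1.0, 1.0, 1.0])

print(is_symmetric(Q), is_positive_definite(Q))
print(is_symmetric(Q_mismatch))
print(is_symmetric(Q_strong), is_positive_definite(Q_strong))
```

The third case suggests why the size of the dependence parameters matters in addition to the pairwise constraint: the matrix is symmetric, yet it is not a valid inverse covariance.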

Since \(\textbf{C}\) often contains too many parameters to estimate, the model is usually parameterized in terms of just a few, e.g. one parameter per direction. More often, though, \(\textbf{C}\) is constructed from a *single* dependence parameter that scales a user-defined neighborhood (proximity) matrix \(\textbf{W} = (w_{ij})\), where \(w_{ij}\) is defined as
\[
w_{ij} = \begin{cases}
1 & \text{if } s_i \text{ shares a common border with } s_j \\
0 & i = j \\
0 & \text{otherwise}
\end{cases}
\]
thus \(\textbf{C} = p_c \textbf{W}\), where \(p_c\) is the single dependence parameter, and perhaps \(\Sigma = \sigma^2 \textbf{I}\).
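To make the construction concrete, the sketch below builds \(\textbf{W}\) for sites laid out on a small regular grid, where "shares a common border" is taken to mean horizontally or vertically adjacent cells, and then forms \(\textbf{C} = p_c \textbf{W}\) and \(\Sigma = \sigma^2 \textbf{I}\). The grid layout, the function names `rook_adjacency` and `car_matrices`, and the parameter values are assumptions for illustration only.

```python
def rook_adjacency(n_rows, n_cols):
    """Binary neighborhood matrix W for grid cells numbered row by row."""
    n = n_rows * n_cols
    W = [[0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            if c + 1 < n_cols:       # cell to the right shares a border
                W[i][i + 1] = W[i + 1][i] = 1
            if r + 1 < n_rows:       # cell below shares a border
                W[i][i + n_cols] = W[i + n_cols][i] = 1
    return W


def car_matrices(W, p_c, sigma2):
    """Return C = p_c * W and Sigma = sigma2 * I as nested lists."""
    n = len(W)
    C = [[p_c * W[i][j] for j in range(n)] for i in range(n)]
    Sigma = [[sigma2 if i == j else 0.0 for j in range(n)] for i in range(n)]
    return C, Sigma


W = rook_adjacency(2, 3)                 # 2 x 3 grid, six sites
C, Sigma = car_matrices(W, 0.2, 1.5)     # hypothetical p_c and sigma^2
neighbor_counts = [sum(row) for row in W]
print(neighbor_counts)
```

Because \(\textbf{W}\) is symmetric with a zero diagonal and the conditional variances are equal, this \(\textbf{C}\) meets all three requirements on \(c_{ij}\) by construction.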

Maximum likelihood estimation is commonly used to obtain parameter estimates for CAR models when a frequentist approach is taken. However, a key advantage of the conditional formulation is that it is highly conducive to Bayesian inference. This is especially the case for non-Gaussian models, where non-Gaussian spatial data can be conditioned on a latent CAR process: the conditional nature of the CAR process allows it to be accommodated easily within Bayesian hierarchical models.
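For the frequentist route, the Gaussian log-likelihood under \(\textbf{C} = p_c \textbf{W}\) and \(\Sigma = \sigma^2 \textbf{I}\) can be written down directly from the joint distribution. The sketch below is one hedged way to evaluate it in pure Python, followed by a crude grid search over \(p_c\) with the mean and \(\sigma^2\) held fixed. It is not a full estimation procedure; the function name `car_loglik` and all data values are assumptions.

```python
import math


def car_loglik(y, mu, W, p_c, sigma2):
    """Log-density of y under N(mu, sigma2 * (I - p_c W)^{-1}).

    Returns None when the implied inverse covariance is not positive definite.
    """
    n = len(y)
    # Inverse covariance Q = (I - p_c W) / sigma2.
    Q = [
        [((1.0 if i == j else 0.0) - p_c * W[i][j]) / sigma2 for j in range(n)]
        for i in range(n)
    ]
    # Cholesky factor Q = L L', so log|Q| = 2 * sum(log L_ii).
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = Q[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return None
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    log_det_Q = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    r = [y[i] - mu[i] for i in range(n)]
    quad = sum(r[i] * Q[i][j] * r[j] for i in range(n) for j in range(n))
    return -0.5 * n * math.log(2.0 * math.pi) + 0.5 * log_det_Q - 0.5 * quad


# Three sites on a line with assumed data and a known constant mean.
W = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
y = [12.0, 9.0, 11.0]
mu = [10.0, 10.0, 10.0]

# Grid of p_c values kept inside the range where Q stays positive definite.
grid = [k / 20 for k in range(-13, 14)]
scores = {p: car_loglik(y, mu, W, p, 1.0) for p in grid}
best = max(grid, key=lambda p: scores[p])
print(best, scores[best])
```

In practice the regression coefficients and \(\sigma^2\) would be estimated jointly with \(p_c\), and a numerical optimizer would replace the grid; the fixed values here only keep the example short.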

## References

[1] Cressie, Noel and Lele, Subhash, *New models for Markov random fields*, Cambridge University Press (CUP), 1992.

[2] Lu, Haolan and Reilly, Cavan S. and Banerjee, Sudipto and Carlin, Bradley P., *Bayesian areal wombling via adjacency modeling*, Springer Science and Business Media LLC, 2007.

[3] Schmidt, Alexandra M. and Nobre, Widemberg S., *Conditional Autoregressive (CAR) Model*, Wiley, 2018.

- ST 433/533 Applied Spatial Statistics, NCSU