Conditional Autoregressive Models

Tags :: Spatial Statistics

Consider a random process \[ \textbf{Y} = (Y(s_1), \dots, Y(s_n)) \] with joint probability distribution \(p(Y(s_1), \dots, Y(s_n))\), whose spatial association is described by the covariance matrix \(\text{var}(\textbf{Y})\).

The spatial dependence can be specified conditionally: at each site \(i\), we specify the distribution of \(Y(s_i)\) given all other sites, \[ p(Y(s_i)|\{Y(s_j):j\neq i\}), \; i = 1, \dots, n. \] Letting \(\textbf{Y}_{-i} \equiv \{Y(s_j):j\neq i\}\), we rewrite the conditional distribution as \[ p(Y(s_i)|\textbf{Y}_{-i}). \]

When we only want to condition on the subset \(\mathcal{N}_i\) containing the neighbors of site \(i\), we write the conditional distribution as \[ p(Y(s_i)|\{Y(s_j):j \in \mathcal{N}_i \}), \; i = 1, \dots, n. \]

The assumption that, conditional on all other locations, \(Y(s_i)\) depends only on its neighbors is a Markovian assumption. Models making this assumption are generally referred to as Markov random field (MRF) models. The conditional, neighborhood-specific distributions must be specified consistently so that the joint distribution is well defined, which is not easy to guarantee. The Hammersley-Clifford theorem addresses this specification and gives the form that the joint probability function of an MRF must take.

The conditional autoregressive (CAR) model typically refers to the Gaussian case, which satisfies the theorem. In the Gaussian case, the conditional mean and variance are given by \[ \text{E}(Y(s_i)|\textbf{Y}_{-i}) = x'(s_i)\beta + \sum_{j=1}^n c_{ij}(Y(s_j) - x'(s_j)\beta) \] \[ \text{var}(Y(s_i)|\textbf{Y}_{-i}) = \sigma_i^2, \; i = 1, \dots, n, \] where \(x(s_i)\) is the vector of covariates at site \(i\), \(\beta\) is the vector of regression coefficients, \(c_{ij}\) are the spatial dependence parameters, and \(\sigma_i^2\) are the conditional variances. Further, the \(c_{ij}\) must satisfy the following requirements: \(c_{ij} = 0\) if \(s_j \not\in \mathcal{N}_i\), \(c_{ii} = 0\), and \(c_{ij}\sigma_j^2 = c_{ji}\sigma_i^2\).

In plain English, the conditional mean of the process at site \(i\), \(Y(s_i)\), given all other sites \(\textbf{Y}_{-i}\), equals the regression mean at site \(i\) (the covariates \(x(s_i)\) times the coefficients \(\beta\)) plus a spatial adjustment: a weighted sum, with weights \(c_{ij}\), of how far each neighboring site deviates from its own regression mean.
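As a minimal sketch of the conditional mean formula, the Python snippet below evaluates it on a made-up example: four sites on a line, an intercept-only mean, and one common weight for every pair of neighbors. The site layout, the values of `c_neighbor`, `beta`, and `y`, and the helper names are all assumptions chosen for illustration.

```python
# Hypothetical toy example: 4 sites on a line (0-1-2-3), intercept-only mean.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
c_neighbor = 0.4               # assumed weight c_ij for every neighbouring pair
beta = 10.0                    # intercept-only model, so x'(s_i) beta = beta everywhere
y = [12.0, 9.0, 11.0, 10.5]    # made-up observations


def c(i, j):
    """Spatial dependence weight c_ij: zero unless j is a neighbour of i."""
    return c_neighbor if j in neighbors[i] else 0.0


def conditional_mean(i):
    """E(Y(s_i) | Y_{-i}) = x'(s_i) beta + sum_j c_ij (Y(s_j) - x'(s_j) beta)."""
    adjustment = sum(c(i, j) * (y[j] - beta) for j in range(len(y)) if j != i)
    return beta + adjustment


means = [conditional_mean(i) for i in range(len(y))]
print(means)
```

Site 1, for instance, has one neighbor above the regression mean by 2 and another above it by 1, so its conditional mean is pulled up from 10 to roughly 11.2.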

Given these conditions on \(c_{ij}\), the implied joint distribution of \(\textbf{Y}\) is \[ \textbf{Y} \sim N(\textbf{X}\beta, (\textbf{I} - \textbf{C})^{-1}\Sigma) \] where \(\textbf{C} = (c_{ij})\) is an \(n \times n\) matrix and \[ \Sigma = \begin{pmatrix} \sigma_1^2 & & 0\\ & \ddots & \\ 0 & & \sigma_n^2 \end{pmatrix}. \] The condition \(c_{ij}\sigma_j^2 = c_{ji}\sigma_i^2\) makes \((\textbf{I} - \textbf{C})^{-1}\Sigma\) symmetric; for it to be a valid covariance matrix, \(\textbf{I} - \textbf{C}\) must also be invertible and \((\textbf{I} - \textbf{C})^{-1}\Sigma\) positive definite.
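The link between the conditional specification and the joint covariance can be checked numerically. The sketch below, which assumes NumPy and a made-up four-site example with equal conditional variances, builds \((\textbf{I} - \textbf{C})^{-1}\Sigma\), checks that it is a valid covariance matrix, and then recovers the conditional variances and weights from the joint precision matrix. The matrices and the value 0.4 are illustrative assumptions.

```python
import numpy as np

# Hypothetical toy example: 4 sites on a line, each adjacent to the next.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
C = 0.4 * W                       # assumed dependence weights c_ij
Sigma = np.diag([1.0] * 4)        # assumed conditional variances sigma_i^2 = 1

# Symmetry requirement c_ij * sigma_j^2 == c_ji * sigma_i^2, i.e. C Sigma symmetric.
assert np.allclose(C @ Sigma, (C @ Sigma).T)

# Joint covariance implied by the conditional specification: (I - C)^{-1} Sigma.
cov = np.linalg.solve(np.eye(4) - C, Sigma)

is_symmetric = np.allclose(cov, cov.T)
is_positive_definite = bool(np.all(np.linalg.eigvalsh(cov) > 0))

# Going back: for a Gaussian with precision matrix Q, var(Y_i | rest) = 1 / Q_ii
# and the conditional-mean weight on site j is -Q_ij / Q_ii.
Q = np.linalg.inv(cov)
cond_var = 1.0 / np.diag(Q)
cond_weights = -Q / np.diag(Q)[:, None]
np.fill_diagonal(cond_weights, 0.0)

print(is_symmetric, is_positive_definite)
```

In this example `cond_var` reproduces the diagonal of `Sigma` and `cond_weights` reproduces `C`, which is the sense in which the joint distribution is "implied" by the conditionals.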

Since there are often too many parameters in \(\textbf{C}\) to estimate, the model is usually parameterized in terms of just a few parameters, e.g. a single parameter for each of several directions. More often, though, \(\textbf{C}\) is constructed from a single dependence parameter \(p_c\) that scales a user-defined neighborhood (proximity) matrix \(\textbf{W} = (w_{ij})\), where \[ w_{ij} = \begin{cases} 1 & \text{if } s_i \text{ shares a common border with } s_j \\ 0 & \text{if } i = j \\ 0 & \text{otherwise} \end{cases} \] Thus \(\textbf{C} = p_c \textbf{W}\) and perhaps \(\Sigma = \sigma^2 \textbf{I}\).
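A minimal sketch of this construction, assuming NumPy and a hypothetical set of four regions laid out as a 2 x 2 grid, is given below. It builds \(\textbf{W}\) from a list of shared borders and forms \(\textbf{C} = p_c \textbf{W}\). It also computes the interval of \(p_c\) values for which the covariance is positive definite when \(\Sigma = \sigma^2 \textbf{I}\): since the eigenvalues of \(\textbf{I} - p_c\textbf{W}\) are \(1 - p_c\lambda_k\), with \(\lambda_k\) the eigenvalues of \(\textbf{W}\), all are positive exactly when \(1/\lambda_{\min} < p_c < 1/\lambda_{\max}\). The region names and the values of `p_c` and `sigma2` are assumptions.

```python
import numpy as np

# Hypothetical regions and shared borders (a 2 x 2 grid of areas: A B / C D).
regions = ["A", "B", "C", "D"]
borders = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]

index = {name: k for k, name in enumerate(regions)}
n = len(regions)
W = np.zeros((n, n))
for a, b in borders:
    W[index[a], index[b]] = 1.0   # w_ij = 1 when s_i and s_j share a border
    W[index[b], index[a]] = 1.0   # adjacency is symmetric; the diagonal stays 0

# Valid range for p_c when Sigma = sigma^2 I: 1/lambda_min < p_c < 1/lambda_max.
eigenvalues = np.linalg.eigvalsh(W)          # returned in ascending order
lower, upper = 1.0 / eigenvalues[0], 1.0 / eigenvalues[-1]

p_c = 0.3        # assumed dependence parameter, chosen inside the valid range
sigma2 = 2.0     # assumed common conditional variance
C = p_c * W
cov = sigma2 * np.linalg.inv(np.eye(n) - C)  # (I - C)^{-1} Sigma

print(f"valid range for p_c: ({lower:.3f}, {upper:.3f})")
```

Choosing \(p_c\) outside this interval would leave the conditional distributions well defined individually but without a proper joint Gaussian distribution behind them.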

Maximum likelihood estimation is commonly used to obtain parameter estimates for CAR models when a frequentist approach is taken. However, a key advantage of the conditional formulation is that it is highly conducive to Bayesian inference. This is especially the case for non-Gaussian models, where non-Gaussian spatial data can be conditioned on a latent CAR process: the conditional nature of the CAR process allows it to be easily accommodated within Bayesian hierarchical models.
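For the frequentist route, one possible sketch (not the only way to fit a CAR model) is a profile likelihood over \(p_c\), assuming \(\textbf{C} = p_c\textbf{W}\) and \(\Sigma = \sigma^2\textbf{I}\). For a fixed \(p_c\), the maximizing \(\beta\) is the generalized least squares estimate and the maximizing \(\sigma^2\) is the weighted mean squared residual, so only \(p_c\) needs a search, done here on a simple grid. The lattice helper, the simulated data, and all "true" parameter values are hypothetical.

```python
import numpy as np


def lattice_adjacency(rows, cols):
    """Border-sharing adjacency matrix W for a rows x cols grid (hypothetical helper)."""
    n = rows * cols
    W = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            k = r * cols + c
            if c + 1 < cols:                       # neighbour to the right
                W[k, k + 1] = W[k + 1, k] = 1.0
            if r + 1 < rows:                       # neighbour below
                W[k, k + cols] = W[k + cols, k] = 1.0
    return W


def profile_log_likelihood(p_c, y, X, W):
    """Gaussian log-likelihood at p_c with beta and sigma^2 set to their ML values."""
    n = len(y)
    A = np.eye(n) - p_c * W                        # sigma^2 times the precision matrix
    try:
        L = np.linalg.cholesky(A)                  # raises unless A is positive definite
    except np.linalg.LinAlgError:
        return -np.inf
    beta_hat = np.linalg.solve(X.T @ A @ X, X.T @ A @ y)   # generalized least squares
    resid = y - X @ beta_hat
    sigma2_hat = resid @ A @ resid / n
    log_det_A = 2.0 * np.sum(np.log(np.diag(L)))
    return 0.5 * log_det_A - 0.5 * n * (np.log(2 * np.pi * sigma2_hat) + 1.0)


# Simulate made-up data from a CAR model on a 6 x 6 lattice.
rng = np.random.default_rng(0)
W = lattice_adjacency(6, 6)
n = W.shape[0]
X = np.column_stack([np.ones(n), rng.normal(size=n)])     # intercept + one covariate
true_beta, true_p_c, true_sigma2 = np.array([1.0, 2.0]), 0.2, 1.0
cov = true_sigma2 * np.linalg.inv(np.eye(n) - true_p_c * W)
y = X @ true_beta + np.linalg.cholesky(cov) @ rng.normal(size=n)

# Grid search over the open interval of valid p_c values.
eigenvalues = np.linalg.eigvalsh(W)
grid = np.linspace(1 / eigenvalues[0], 1 / eigenvalues[-1], 201)[1:-1]
p_c_hat = max(grid, key=lambda p: profile_log_likelihood(p, y, X, W))
print(f"estimated p_c: {p_c_hat:.3f}")
```

With only 36 sites the estimate of \(p_c\) can be far from the simulated value; the sketch is meant to show the structure of the computation rather than its statistical accuracy.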

References

[1] Cressie, Noel and Lele, Subhash, New models for Markov random fields, Cambridge University Press (CUP), 1992.

[2] Lu, Haolan and Reilly, Cavan S. and Banerjee, Sudipto and Carlin, Bradley P., Bayesian areal wombling via adjacency modeling, Springer Science and Business Media LLC, 2007.

[3] Schmidt, Alexandra M. and Nobre, Widemberg S., Conditional Autoregressive (CAR) Model, Wiley, 2018.

  • ST 433/533 Applied Spatial Statistics, NCSU
