Beside the basic latent models and likelihoods, INLA provide several ``tools'' to manipulate models and likelihoods. This page describe these. Copying a modelVery often we encounter the situation where an element of a model is needed more than once for each observation. One example is wherey = a + b*w + ... (*) for fixed weights 'w' and where (a, b) are bivariate Normals, where (a_i, b_i) is bivariate Normal and all 2-vectors are independent. Using the model f(idx, model="iid2d", n=2*m, ...) provide a random vector `v', say, with length 2*m stored as v = (a_1, a_2, ..., a_m, b_1, b_2, ...., b_m). To implement the model (*), we simply create an indentical copy of v, v*, where v == v* (nearly). Using the copy-feature, we can spesify the model (*) as idx = 1:m idx.copy = m + 1:m formula = y ~ f(idx, model="iid2d", n=2*m) + f(idx.copy, w, copy="idx") + .... recalling that the first `m' elements is `a' and the last `m' elements are `b', and where 'w' are the weights. The second `f()' terms define itself as a copy of `f(idx, ...)', and it inherit some of its features:
n=1000 i=1:n j = i z = rnorm(n) w = runif(n) y = z + 2*z*w + rnorm(n) formula = y ~ f(i, model="iid",initial=0, fixed=T) + f(j, w, copy="i", fixed=FALSE) r = inla(formula, data = data.frame(i,j,w,y)) If the scaling parameter is within given range, then option `range = c(low, high)', can be given. In this case `beta = low + (high-low)*exp(beta.local)/(1+exp(beta.local))', and the prior is defined on beta.local. If low=high or range = NULL, then the identity mapping is used. If high=Inf and low!=Inf, then the mapping `low + exp(beta.local)' is used. The case low=Inf and high!=Inf is not yet implemented. A model or a copied model can be copied several times. The degree of closeness of 'v' and 'v*' is specified by the argument precision, as the precision of the noise added to 'v' to get 'v*'. Replicate a modelIndependent replications of a model with the same hyperparmeters can be defined using the argument 'replicate',f(idx, model = .., replicate = r) Here, 'r' is a vector of the same length as 'idx'. In this case, we use a two-dimensional index to index this (sub-)model: (idx, r), so (1,2) identify the first value of the second replication of the model. Number of replications are defined as max(replicate), unless it is defined by the argument 'nrep'. One example is the model `iid': i = 1:n formula = y ~ f(i, model = "iid") + ... which has an alternative equivalent formulation as `n' replications of an iid-model with length 1 i = rep(1,n) r = 1:n formula = y ~ f(i, model="iid", replicate = r) + ... In the following example, we estimate the parameters in three AR(1) time-series with the same hyperparameters (ie its replicated) but with separate means: n = 100 y1 = arima.sim(n=n, model=list(ar=c(0.9)))+10 y2 = arima.sim(n=n, model=list(ar=c(0.9)))+20 y3 = arima.sim(n=n, model=list(ar=c(0.9)))+30 formula = y ~ mean -1 + f(i, model="ar1", replicate=r) y = c(y1,y2,y3) i = rep(1:n, 3) r = rep(1:3, each=n) mean = as.factor(r) result = inla(formula, family = "gaussian", data = data.frame(y, i, mean), control.data = list(initial = 12, fixed=TRUE)) All other arguments is interpreted for the basic model and also replicated. Like argument constr=TRUE, is interpreted as each replication sums to zero, and additional constraints are also replicated. Models with more than one type of likelihoodThere is no constraint in INLA that the type of likelihood must be the same for all observations. In fact, every observation could have its own likelihood. Extentions include more than one familily, like the Normal, Poisson, etc, but also having in the model groups of observations with separate hyperparameters within each group, where the family, for example, can be the same.In the formula y ~ a + 1 we allow 'y' to be a matrix. In this case each column of `y' define one likelihood where
The argument 'family' is in the case where 'y' is a matrix, a list of families. The argument 'control.data' is then a list of lists; one for each family. OOPS: It is difficult to check that all arguments are internally consistent in this case, therefore strange 'parsing errors' can occur if these arguments are not internally consistent. To illustrate the ideas, let us do some examples. There are 'real' examples in the case-studies section. (Note that for survival models then the responce 'y' must be a list instead of a matrix, see this example for details.) The first example, is a simple linear regression, where the first half of the data is observed with unknown precision tau.1 (with a 'default' noninformative prior) and the second half of the data is observed with unknown precision tau.2. Otherwise, the two models have the same form for the linear predictor. ## Simple linear regression with observations with two different ## variances. n = 100 N = 2*n y = numeric(N) x = rnorm(N) y[1:n] = 1 + x[1:n] + rnorm(n, sd = 1/sqrt(1)) y[1:n + n] = 1 + x[1:n + n] + rnorm(n, sd = 1/sqrt(2)) Y = matrix(NA, N, 2) Y[1:n, 1] = y[1:n] Y[1:n + n, 2] = y[1:n + n] formula = Y ~ x + 1 result = inla( formula, data = data.frame(Y, x), family = c("gaussian", "gaussian"), control.data = list(list(prior = "flat", param = numeric()), list())) The second artificial example shows how to use information from two sources to estimate the effect of the covariate 'x'. ## Simple example with two types of likelihoods n = 10 N = 2*n ## common covariates x = rnorm(n) ## Poisson, depends on x E1 = runif(n) y1 = rpois(n, lambda = E1*exp(x)) ## Binomial, depends on x size = sample(1:10, size=n, replace=TRUE) prob = exp(x)/(1+exp(x)) y2 = rbinom(n, size= size, prob = prob) ## Join them together Y = matrix(NA, N, 2) Y[1:n, 1] = y1 Y[1:n + n, 2] = y2 ## The E for the Poisson E = numeric(N) E[1:n] = E1 E[1:n + n] = NA ## Ntrials for the Binomial Ntrials = numeric(N) Ntrials[1:n] = NA Ntrials[1:n + n] = size ## Duplicate the covariate which is shared X = numeric(N) X[1:n] = x X[1:n + n] = x ## Formula involving Y as a matrix formula = Y ~ X - 1 ## `family' is now result = inla(formula, family = c("poisson", "binomial"), data = data.frame(Y, X), E = E, Ntrials = Ntrials) If the covariate 'x' is different for the two families, 'x' and 'xx', say, then we only need to make the following changes X = numeric(N) X[1:n] = x X[1:n + n] = NA XX = numeric(N) XX[1:n] = NA XX[1:n + n] = xx formula = Y ~ X + XX -1 and add 'XX' into the data.frame. Again this example is artificial as there is no gain of doing a simultanous analysis. However, it shows how the model has to be modified into a 'union' of models with the use of NA's to remove terms. In the third example, we use also the 'replicate' feature to estimate three replicated AR(1) models with the same hyperparamters, each observed differently. ## An example with three independent AR(1)'s with separate means, but ## with the same hyperparameters. These are observed with three ## different likelihoods. n = 100 x1 = arima.sim(n=n, model=list(ar=c(0.9))) + 0 x2 = arima.sim(n=n, model=list(ar=c(0.9))) + 1 x3 = arima.sim(n=n, model=list(ar=c(0.9))) + 2 ## Binomial observations Nt = 10 + rpois(n,lambda=1) y1 = rbinom(n, size=Nt, prob = exp(x1)/(1+exp(x1))) ## Poisson observations Ep = runif(n, min=1, max=10) y2 = rpois(n, lambda = Ep*exp(x2)) ## Gaussian observations y3 = rnorm(n, mean=x3, sd=0.1) ## stack these in a 3-column matrix with NA's where not observed y = matrix(NA, 3*n, 3) y[1:n, 1] = y1 y[n + 1:n, 2] = y2 y[2*n + 1:n, 3] = y3 ## define the model r = c(rep(1,n), rep(2,n), rep(3,n)) rf = as.factor(r) i = rep(1:n, 3) formula = y ~ f(i, model="ar1", replicate=r, constr=TRUE) + rf -1 data = data.frame(y, i, r, rf) ## parameters for the binomial and the poisson Ntrial = rep(NA, 3*n) Ntrial[1:n] = Nt E = rep(NA, 3*n) E[1:n + n] = Ep result = inla(formula, family = c("binomial", "poisson", "normal"), data = data, Ntrial = Ntrial, E = E, control.data = list( list(), list(), list(initial=0))) Models where the response/data depends on linear combinations of the 'linear predictor'In some cases, the data/response might depend on a linear combination of the 'linear predictor' defined in the formula, likey ~ 1 + z then this implies that 'y[1]' depends on 'intercept + beta*z[1]'. Suppose if 'y[1]' depends on '2*intercept + beta*z[1] + beta*z[2]' ? Although it is possible to express this, using the tools we already have, it is more convenient to add another layer into the model. Let 'A' be a 'm x n' matrix, which defines new linear predictors, 'eta~' from 'eta', likeeta~ = A eta Here, 'eta' is the 'ordinary' linear predictors defined using the formula, and the data depends on the linear predictors 'eta~'. So we might express this asy ~ 1 + z, A meaning in short, thaty ~ eta~ -1 eta~ = A eta eta = 1 + beta * z The following example should provide more insight. You may change 'n' and 'm', such that 'm < n', 'm=n' or 'm > n'. Note that since the response has dimension 'm' and the covariates dimension 'n', we need to use 'list(y=y, z=z)' and not a data.frame(). ## 'n' is the dimension of the linear predictor defined in the formula ## below. n = 30 ## 'm' is the number of observations of eta~, where eta~ = A eta, and ## A is a fixed m x n matrix. m = 100 ## an offset, just for testing. OOPS: the offset are defined for ## 'eta~' and NOT 'eta'!!! myoffset = 10+runif(m) ## the covariate z = runif(n) ## the linear predictor eta eta = 1 + z ## define it either as matrix, or using sparseMatrix() (in package ## Matrix) A = matrix(runif(n*m), m, n); A = as(A, "dgTMatrix") ## just to test... Eta = A %*% eta + myoffset s = 0.0001 ## The observations Y can be a sparseMatrix, but there is no need ## to. However, Y will be in this case, as it enherits its properties ## from Eta. Y = Eta + rnorm(m, sd=s) r = inla(Y ~ 1+z, ## The A-matrix defined here control.predictor = list(A=A, compute=TRUE, precision = 1e6), ## OOPS: we need to use a list() as the different lengths of Y ## and z data = list(Y=Y, z=z), ## offsets better defined here as in the formula gives the ## wrong impression: A formula like y ~ 1 + offset(off) ## suggests that the offset is in 'eta', but its in 'eta~'. Be ## aware. offset = myoffset, control.data = list(initial = log(1/s^2), fixed=TRUE)) ## Check that the linear predictors are OK, this should be a straight ## line. par(mfrow=c(2, 1)) plot(cbind(as.numeric(Eta), r$summary.linear.predictor[1:m,"mean"])) plot(cbind(as.numeric(eta), r$summary.linear.predictor[m+1:n,"mean"])) n = 100L s = c(-1, 0, 1) nS = length(s) j = sample(1L:nS, n, replace=TRUE) k = j k[j == 1L] = 2 k[j == 2L] = 3 k[k == 3L] = 1 noise = rnorm(n, sd=0.0001) y = 1 + s[j] + 0.5*s[k] + noise ## build the formula such that the linear predictor is the intercept ## (index 1) and the 's' term (index 2:(n+1)). then kind of ## 'construct' the model using the A-matrix. formula = y ~ -1 + intercept + f(idx) A = matrix(0, n, nS+1L) for(i in 1L:n) { A[i, 1L] = 1 A[i, 1L + j[i]] = 1 A[i, 1L + k[i]] = 0.5 } data = list(intercept = c(1, rep(NA, nS)), idx = c(NA, 1L:nS)) result = inla(formula, data=data, control.predictor=list(A=A)) ## should be a straight line plot(result$summary.random$idx$mean, s) |