FAQ
You can simply run browseVignettes(package="INLA") in R to get the FAQs.
1. I have used R-INLA in scientific work to be published; how should I cite it?
Please cite the paper describing the methodology:
H. Rue, S. Martino, and N. Chopin. Approximate Bayesian inference for latent Gaussian models using integrated nested Laplace approximations (with discussion). Journal of the Royal Statistical Society, Series B, 71(2):319–392, 2009.
and also mention the website where the R-INLA package is located, www.r-inla.org. The new features in the package, plus some developments since the JRSSB paper, are reported here:
T. G. Martins, D. Simpson, F. Lindgren, and H. Rue. Bayesian computing with INLA: new features. To appear in Computational Statistics and Data Analysis, 2012.
If you use the spatial SPDE-models, you should also cite
F. Lindgren, H. Rue, and J. Lindstrom. An explicit link between Gaussian fields and Gaussian Markov random fields: the SPDE approach (with discussion). Journal of the Royal Statistical Society, Series B, 73(4):423–498, 2011.
and optionally these:
D. Simpson, F. Lindgren, and H. Rue. In order to make spatial statistics computationally feasible, we need to forget about the covariance function. Environmetrics, 23:65–74, 2012.
D. Simpson, F. Lindgren, and H. Rue. Think continuous: Markovian Gaussian models in spatial statistics. Spatial Statistics, 1:16–29, 2012.
Here are a couple of pictures from the RSS meetings, in 2008 (from left to right: L. Held, S. Richardson, H. Rue, S. Martino and N. Chopin) and 2011 (from left to right: J. Lindstrom, F. Lindgren and H. Rue).
2. I have compared the INLA-results with those obtained using my own MCMC-code, but they do not match so well. What is wrong?
Although you may have run into one of those cases where INLA is known to fail, it is far more likely to be a 'the devil is in the details' effect. It is surprisingly difficult to get everything right so that the model in INLA and in your MCMC code is exactly the same; if there is one tiny difference, it will show up in the results somewhere. This difference can be not using the same parameters in the prior, using a conditional instead of a marginal 'initial condition' for the AR(1) model, etc. This is especially awkward for intrinsic models with sum-to-zero constraints.
If you run into these problems, please contact us so that the implied model in INLA can be specified in detail; in this way, you are comparing results obtained from the same model. For smaller models, INLA provides its own MCMC sampler which you can use; see demo("Tokyo-compare").
3. I want to use a prior for my hyperparameter that does not seem to be implemented; what should I do?
If you want to use a prior for the hyperparameter that is not yet implemented, there are two choices. If you think that your prior should be on the list and that others might use it too, please send us the expressions and we will add it. Alternatively, you can define your own prior using prior = "expression: ...." (see this help-page), or by specifying a table of x and y values which define the prior distribution.
Thus, in total there are three ways to specify prior distributions for hyperparameters in INLA:
1. Use an available prior distribution, see Priors
2. Define your own prior distribution function using R-like (not equal) syntax as expression.
3. Assign a table of x and corresponding y values which represent your prior distribution.
In the following we provide more details regarding options 2 and 3. Finally, we present an example illustrating (and comparing) the three different possibilities by means of the log-gamma distribution for the precision parameter.
A user can specify any (univariate) prior distribution for the hyperparameter theta by defining an expression for the log-density log p(theta), as a function of the corresponding theta. It is important to be aware that theta is defined on the internal scale. The format is
expression: <statement>; <statement>; ...; return(<value>)
where "<statement>" is any regular statement (more below) and "<value>" is the value of the log-density of the prior, evaluated at the current value of theta. Here is an example defining the log-gamma distribution:
prior.expression = "expression:
a = 1;
b = 0.1;
precision = exp(log_precision);
logdens = log(b^a) - lgamma(a)
+ (a-1)*log_precision - b*precision;
log_jacobian = log_precision;
return(logdens + log_jacobian);"
Some syntax specific notes:
1. return (x) (with a space before the parenthesis) is NOT allowed; it must be return(x).
2. A “;” is needed to terminate each expression.
3. You can use "_" in variable names, like log_precision = <whatever>; see the example above.
Known functions that can be used within the expression statement are:
- common math functions, such as exp(x), sin(x), ...
- gamma(x) denotes the gamma-function and lgamma(x) is its log (see ?gamma in R).
- x^y is expressed as either x^y or pow(x, y)
Instead of defining a prior distribution function, it is possible to provide a table of suitable support values x (internal scale) and the corresponding log-density values y. INLA fits a spline through the provided points and uses it in the succeeding computations. Note that no transformation into a functional form is performed or required. The input format for the table is a string which starts with table: and is then followed by a block of x-values and a block of the corresponding y-values, which represent the values of the log-density evaluated at x. Thus
table: x_1 ... x_n y_1 ... y_n
We illustrate all three different ways of defining a prior distribution for the residual precision of a normal likelihood. To show that the three definitions lead to the same result, we inspect the log marginal likelihood.
## the loggamma-prior
prior.function = function(log_precision) {
a = 1;
b = 0.1;
precision = exp(log_precision);
logdens = log(b^a) - lgamma(a) + (a-1)*log_precision - b*precision;
log_jacobian = log_precision;
return(logdens + log_jacobian)
}
## implementing the loggamma-prior using "expression:"
prior.expression = "expression:
a = 1;
b = 0.1;
precision = exp(log_precision);
logdens = log(b^a) - lgamma(a)
+ (a-1)*log_precision - b*precision;
log_jacobian = log_precision;
return(logdens + log_jacobian);"
## use suitable support points x
lprec = seq(-10, 10, len=100)
## link the x and corresponding y values into a string which begins with "table:"
prior.table = inla.paste(c("table:", cbind(lprec, prior.function(lprec))))
## simulate some data
n = 50
y = rnorm(n)
## 1. use the built-in loggamma prior
r1 = inla(y ~ 1, data = data.frame(y),
control.family = list(hyper = list(prec = list(prior = "loggamma", param = c(1, 0.1)))))
## 2. use the definition using expression
r2 = inla(y~1, data = data.frame(y), control.family = list(hyper = list(prec = list(prior = prior.expression))))
## 3. use a table of x and y values representing the loggamma prior
r3 = inla(y~1, data = data.frame(y), control.family = list(hyper = list(prec = list(prior = prior.table))))
## verify that we get the same result using the log-marginal likelihood
c(r1$mlik[1], r2$mlik[1], r3$mlik[1])
#[1] -77.09861 -77.09861 -77.09896
4. Does INLA support the use of different link-functions?
Yes, the type of link function is given in the 'control.family' statement using link="<link>", and the link functions implemented are listed in the documentation for each likelihood. The default option link="default" is a macro for the second entry on the list. Here is an example using the binomial.
n = 100
z = runif(n)
eta = 1 + 0.1*z
N = 20
## logit
p = exp(eta)/(1+exp(eta))
y = rbinom(n, size = N, prob = p)
r = inla(y ~ 1 + z, data = data.frame(y, z), family = "binomial", Ntrials = rep(N, n),
control.family = list(link = "logit"), control.predictor = list(compute=TRUE))
## probit
p = pnorm(eta)
y = rbinom(n, size = N, prob = p)
rr = inla(y ~ 1 + z, data = data.frame(y, z), family = "binomial", Ntrials = rep(N, n),
control.family = list(link = "probit"), control.predictor = list(compute=TRUE))
## cloglog
p = 1-exp(-exp(eta))
y = rbinom(n, size = N, prob = p)
rrr = inla(y ~ 1 + z, data = data.frame(y, z), family = "binomial", Ntrials = rep(N, n),
control.family = list(link = "cloglog"), control.predictor = list(compute=TRUE))
If you miss other link functions, or want to enable more of them for likelihoods where they are not yet enabled, please let us know.
5. How can I do predictions using INLA?
In INLA there is no function 'predict' as for glm/lm in R. Predictions must be done as part of the model fitting itself. As prediction is the same as fitting a model with some missing data, we can simply set y[i] = NA for those 'locations' we want to predict. Here is a simple example
n = 100
n.pred = 10
y = arima.sim(n=n, model=list(ar=0.9))
N = n + n.pred
yy = c(y, rep(NA, n.pred))
i = 1:N
formula = yy ~ f(i, model="ar1")
r = inla(formula, data = data.frame(i,yy),
control.family = list(initial = 10, fixed=TRUE)) ## no observational noise
which gives predictions
> r$summary.random$i[(n+1):N, c("mean", "sd") ]
mean sd
101 -2.6021569048 1.397605790
102 -2.2801830807 1.639556612
103 -2.0008859575 1.807935005
104 -1.7635334748 1.932743951
105 -1.5569410367 2.025845322
106 -1.3790793264 2.097800368
107 -1.2261449461 2.153969553
108 -1.0913806143 2.197121102
109 -0.9767910767 2.232332803
110 -0.8749748925 2.259614345
Quantities like r$summary.fitted.values and r$marginals.fitted.values, if computed, use the identity link for y[i] = NA by default. If you want the fitted values computed with a different link function, there are two ways to do it.
Automatically. If you want to use the link function from the likelihood already in use (most often the case), there is the argument 'link' in 'control.predictor'. If the response y[idx] = NA, then set link[idx] = 1 to indicate that this fitted value should be computed using the link function from 'family[1]'. With several likelihoods, set link[idx] to the correct family index, i.e. the column number in the response. The following example shows the usage:
## simple poisson regression
n = 100
x = runif(n)
eta = 1 + x
lambda = exp(eta)
y = rpois(n, lambda = lambda)
## missing values:
y[1:3] = NA
y[(n-2):n] = NA
## link = 1 is a shortcut for rep(1, <n>) where <n> is the appropriate
## length. here '1' is a reference to the first 'family', ie
## 'family[1]'
r = inla(y ~ 1 + x, family = "poisson",
data = data.frame(y, x),
control.predictor = list(link = 1))
plot(r$summary.fitted.values$mean)
## this is the formally correct way, defining 'link' only where there
## are missing values. entries in 'link' for which the observation is
## not NA, is currently ignored.
link = rep(NA, n)
link[which(is.na(y))] = 1
r = inla(y ~ 1 + x, family = "poisson",
data = data.frame(y, x),
control.predictor = list(link = link))
plot(r$summary.fitted.values$mean)
## for more than one likelihood, use '2' to refer to the second
## likelihood. here we just split the data in two, and assign the
## second half the nbinomial distribution.
n2 = n %/% 2L
Y = matrix(NA, n, 2)
Y[1:n2, 1] = y[1:n2]
Y[1:n2 + n2, 2] = y[1:n2 + n2]
link = rep(NA, n)
link[which(is.na(y[1:n2]))] = 1
link[n2 + which(is.na(y[1:n2 + n2]))] = 2
r = inla(Y ~ 1 + x, family = c("poisson", "nbinomial"),
data = list(Y=Y, x=x),
control.predictor = list(link = link))
plot(r$summary.fitted.values$mean)
Note that if y[idx] is not NA, then link[idx] is NOT used (but must still be a valid value or NA).
Manually. You can transform the marginals manually using the function inla.marginal.transform(), as in the following example (see demo(Tokyo)).
## Load the data
data(Tokyo)
summary(Tokyo)
Tokyo$y[300:366] <- NA
## Define the model
formula = y ~ f(time, model="rw2", cyclic=TRUE, param=c(1,0.0001)) - 1
## The call to inla
result = inla(formula, family="binomial", Ntrials=n, data=Tokyo,
control.predictor=list(compute=T))
## need to recompute the fitted values for those with data[i] = NA,
## as the identity link is used.
n = 366
fitted.values.mean = numeric(n)
for(i in 1:366) {
if (is.na(Tokyo$y[i])) {
if (FALSE) {
## either like this, which is slower...
marg = inla.marginal.transform(
function(x) exp(x)/(1+exp(x)),
result$marginals.fitted.values[[i]] )
fitted.values.mean[i] = inla.expectation(
function(x) x, marg)
} else {
## or like this, which is much faster...
fitted.values.mean[i] = inla.expectation(
function(x) exp(x)/(1 +exp(x)),
result$marginals.fitted.values[[i]])
}
} else {
fitted.values.mean[i] = result$summary.fitted.values[i,"mean"]
}
}
plot(fitted.values.mean)
6. Some of the models need a graph; how do I specify it?
PS: This issue has been rewritten due to new additions in the code starting from March 4th, 2012, and only applies to versions after this date. The changes are backward-compatible. The argument 'graph.file' has been replaced with a more general argument 'graph'.
Some of the models in INLA need the user to specify a graph, saying which nodes are neighbours of each other. A 'graph' can be specified in three different ways.
As an ASCII or binary file with a graph specification, or the same contents given as a (possibly a list of) mix of character and numeric arguments
As a symmetric (dense or sparse) matrix, where the non-zero pattern of the matrix defines the graph.
As an inla.graph-object
Defining the graph as an ASCII or binary file with a graph specification, or the same contents given as a (possibly a list of) mix of character and numeric arguments
The graph can be defined in an ASCII file with the following format. The first entry is the number of nodes in the graph, `n'. The nodes in the graph are labelled 1, 2, ..., n. The next entries specify the number of neighbours and the neighbours for each node, like `node number.of.neighbours neighbour.1 neighbour.2 ...'. A simple example is the following file
4
1 2 3 4
2 0
3 1 1
4 1 1
This defines a graph with four nodes, where node 1 has two neighbours, 3 and 4; node 2 has no neighbours; node 3 has one neighbour, 1; and node 4 has one neighbour, 1.
Note that we need to specify the implied symmetry as well: in this example node 1 lists 4 as a neighbour, so we also need to specify that 1 is a neighbour of 4.
Instead of storing the graph specification on a file, it can also be specified as a character string with the same contents as a file, like
"4 1 2 3 4 2 0 3 1 1 4 1 1"
or any other variant like
4, "1 2 3", "4 2 0 3 1 1 4 1", 1
PS: Be aware that there may be limitations on the length of a string/line, so in practice this way of specifying the graph seems more useful for teaching or demonstration purposes than for practical analysis.
Within R, this would look like
formula = y ~ f(idx, model = "besag", graph = "graph.dat")
or
formula = y ~ f(idx, model = "besag", graph = c(4, "1 2 3", "4 2 0 3 1 1 4 1", 1))
As a symmetric (dense or sparse) matrix, where the non-zero pattern of the matrix defines the graph
A neighbour matrix is often used to define which nodes are neighbours, with the convention that if Q[i,j] != 0 then i and j are neighbours (and i != j).
For example, the (dense) matrix C
[,1] [,2] [,3] [,4]
[1,] 1 0 1 1
[2,] 0 1 0 0
[3,] 1 0 1 0
[4,] 1 0 0 1
defines the same graph as shown above. Since graphs tend to be large, we can also specify the graph as a sparse matrix (see ?sparseMatrix), as
4 x 4 sparse Matrix of class "dgTMatrix"
[1,] 1 . 1 1
[2,] . 1 . .
[3,] 1 . 1 .
[4,] 1 . . 1
The value we set on the diagonal does not matter, as it is not used; INLA sets the diagonal to 1 when such matrices are created.
If C is one of these matrices, then from within R, this would look like
formula = y ~ f(idx, model = "besag", graph = C)
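As a minimal sketch (this construction is just an illustration; any symmetric matrix with the correct non-zero pattern will do), the sparse matrix shown above can be built directly with the Matrix package:
library(Matrix)
## non-zero pattern for the 4-node graph above: 1~3 and 1~4,
## plus the symmetric entries and the (unused) unit diagonal
C = sparseMatrix(i = c(1, 1, 3, 4, 1, 2, 3, 4),
                 j = c(3, 4, 1, 1, 1, 2, 3, 4),
                 x = 1, dims = c(4, 4))
formula = y ~ f(idx, model = "besag", graph = C)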
Defining the graph as an inla.graph-object
Internally in INLA, there is an 'inla.graph' object which is used to represent a graph, like what we get doing
g = inla.read.graph("4 1 2 3 4 2 0 3 1 1 4 1 1")
which reads the graph from the given specification, and returns an inla.graph-object. We can then use this object with R as
formula = y ~ f(idx, model = "besag", graph = g)
as you might have guessed.
If you want to specify the inla.graph-object yourself, you need to know the internals of that object:
> str(g)
List of 5
$ n : int 4
$ nnbs : num [1:4] 2 0 1 1
$ nbs :List of 4
..$ : int [1:2] 3 4
..$ : num(0)
..$ : int 1
..$ : int 1
$ cc :List of 3
..$ id : int [1:4] 1 2 1 1
..$ n : int 2
..$ nodes:List of 2
.. ..$ : int [1:3] 1 3 4
.. ..$ : int 2
- attr(*, "class")= chr "inla.graph"
Here, 'n' is the size of the graph, 'nnbs' is the number of neighbours of each node, 'nbs' lists the neighbours of each node, and the class is 'inla.graph'.
The 'cc'-list is for internal use only and specifies the connected components in the graph. The way to add connected-components information to your graph is to use
g = inla.add.graph.cc(g)
where the argument is an inla.graph-object, possibly without the connected components information.
Working with graphs
INLA has some functions to work with graphs, and here is a short summary.
inla.read.graph() and inla.write.graph(), read and write graphs using any of the graph specifications above.
You can plot an inla.graph-object using plot() and get a summary using summary(). (The plotting requires the Rgraphviz library from the Bioconductor project; see http://www.bioconductor.org.)
From a graph specification, you can generate the neighbour matrix, using inla.graph2matrix().
You can plot a graph specification as a neighbour matrix, using inla.spy()
If you have 'errors' in your graph, you may read it using inla.debug.graph(). This is only available for a graph specification in an ascii-file.
See also the examples in ?inla.read.graph and ?inla.graph2matrix, and demo(Graph).
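A minimal sketch of these functions (the filename, and the filename= argument, are just for illustration; see ?inla.write.graph for the exact interface):
g = inla.read.graph("4 1 2 3 4 2 0 3 1 1 4 1 1")
summary(g)
Q = inla.graph2matrix(g)    ## the neighbour matrix
inla.spy(g)                 ## plot the non-zero pattern
inla.write.graph(g, filename = "my-graph.dat")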
7. How does inla() deal with "NA"s in the data-argument?
For a formula like
formula = y ~ x + f(k, model= <some model>)
NA's in y, x and k are treated differently.
NA's in the response y
If y[i] = NA, this means that y[i] is not observed, hence gives no contribution to the likelihood.
NA's in fixed effect x
If x[i] = NA this means that x[i] is not part of the linear predictor for y[i]. For fixed effects, this is equivalent to x[i]=0, hence internally we make this change: x[is.na(x)] = 0
NA's in random effect k
If k[i] = NA, this means that the random effect does not contribute to the linear predictor for y[i].
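A minimal sketch illustrating the three cases above, using made-up data:
n = 10
y = rnorm(n); y[1] = NA    ## y[1] is not observed: no likelihood contribution
x = rnorm(n); x[2] = NA    ## treated internally as x[2] = 0
k = 1:n;      k[3] = NA    ## no contribution from f() to the linear predictor of y[3]
r = inla(y ~ x + f(k, model = "iid"), data = data.frame(y, x, k))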
NA's in a factor x
NA's in a factor x are not allowed unless NA is a level in itself, or 'control.fixed = list(expand.factor.strategy = "inla")' is set. With this option set, NA is interpreted similarly to a fixed effect: an NA means no contribution from x.
The effect of expand.factor.strategy="inla" is best explained with an example. By default, INLA uses 'model.matrix' to construct the matrix for the fixed effects, which does
> inla(y ~ 1 + x, data = data.frame(y=1:3,x=factor(c("a","b","c"))))$model.matrix
(Intercept) xb xc
1 1 0 0
2 1 1 0
3 1 0 1
for the default value of the argument 'contrasts'. The effect of 'xa' is removed to make the corresponding matrix non-singular. If we want to expand 'x' into all three effects, we can do
> inla(y ~ 1 + x, data = data.frame(y=1:3,x=factor(c("a","b","c"))), control.fixed = list(expand.factor.strategy="inla"))$model.matrix
(Intercept) xa xb xc
1 1 1 0 0
2 1 0 1 0
3 1 0 0 1
As we see, each level of the factor is now treated symmetrically. Although the corresponding frequentist matrix is singular, as we have confounding with the intercept, the Bayesian posterior is still proper with proper priors. With an NA in 'x', we get
> inla(y ~ 1 + x, data = data.frame(y=1:3,x=factor(c("a","b",NA))), control.fixed = list(expand.factor.strategy="inla"))$model.matrix
(Intercept) xa xb
1 1 1 0
2 1 0 1
3 1 0 0
so that the 3rd element of the linear predictor has no contribution from 'x', as it should.
8. Can INLA deal with missing covariates?
No, INLA has no way to 'impute' or integrate out missing covariates. You have to adjust your model to account for missing covariates. Sometimes you can formulate a joint model for the data and the covariates, but this is case-specific.
9. How can I compute cross-validation or predictive measures of fit?
inla() provides two types of 'leave-one-out' predictive measures of fit: the CPO value, which is Prob(y_i | y_{-i}), and the PIT value, which is Prob(y_i^new <= y_i | y_{-i}). To enable the computation of these quantities, you will need to add the argument
control.compute=list(cpo=TRUE)
to `inla()'.
Since the CPO/PIT quantities are computed keeping the integration points fixed, you may want to use an improved version of the grid integration; for example, you may additionally add the argument
control.inla=list(int.strategy = "grid", diff.logdens = 4)
In some examples you may also want to increase the accuracy of the tails of the marginals, as the tail behaviour is sometimes important for the CPO/PIT calculations (see the variable `result$cpo$failure' below), like
control.inla = list(strategy = "laplace", npoints = 21)
which adds more evaluation points (npoints=21) instead of the default npoints=9.
If the resulting object is `result', then you will find the predictive quantities as `result$cpo$cpo' and `result$cpo$pit'.
Another issue is the implicit assumptions made in the inla-program. There are internal checks of these assumptions, and the results appear as `result$cpo$failure'. In short, if `result$cpo$failure[i] > 0' then some assumption is violated; the higher the value (maximum 1), the more serious the violation. If `result$cpo$failure[i] == 0' then the assumptions should be OK.
It might be a good idea to recompute those CPO/PIT values, if any, for which `result$cpo$failure > 0'. However, this must be done 'manually' by removing `y[i]' from the dataset, fitting the model and predicting `y[i]'. To provide a more efficient implementation of this, we have provided
improved.result = inla.cpo( result )
which takes an inla-object (the output from inla()), recomputes (in an efficient way) the cpo/pit values for which result$cpo$failure > 0, and returns 'result' with the improved estimates of cpo/pit. See ?inla.cpo for details.
WARNING: We recommend caution when using cpo/pit for validation, since they may depend on tail behaviour which is difficult to estimate. It might be a good idea to verify that the smallest cpo-values are estimated correctly, by setting res$cpo$failure[smallest cpo indices] = 1 and re-estimating them with inla.cpo().
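A minimal sketch of this workflow, using simulated data:
n = 100
x = rnorm(n)
y = rnorm(n, mean = 1 + x)
r = inla(y ~ 1 + x, data = data.frame(y, x),
         control.compute = list(cpo = TRUE))
## how many of the internal assumptions are flagged?
sum(r$cpo$failure > 0)
## as recommended above, flag the smallest cpo-values as well...
r$cpo$failure[order(r$cpo$cpo)[1:5]] = 1
## ...and re-estimate the flagged ones
r = inla.cpo(r)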
10. Even though inla() is fast, is it possible to make it run faster?
The R-INLA library contains precompiled binaries of the inla-program. To make the program run on all processors, it is compiled using generic compiler options and as 32-bit, for Windows, MacOSX and Linux. It is certainly possible to compile the inla-program specifically for your computer/CPU; on my laptop, I easily get a speedup of about 25%. There are 'build' files available here, which require you to install mercurial, which is available in the package system for all major Linux distros; for MacOSX, prebuilt binaries are available. You may want to play around with ATLAS/GOTO as well; ATLAS is bundled in the build files. For MacOSX you'll need a newer compiler than the one bundled with XCode, for example the ones here.
To use a natively compiled inla-program, download the appropriate version and install it as <path>/inla, for example /usr/local/bin/inla. The easiest way to override the built-in inla-program in R-INLA is to do
library(INLA)
inla.setOption("inla.call", "/usr/local/bin/inla")
and then proceed as normal. Make sure to update these natively compiled programs, using `make update', when R-INLA is upgraded.
11. I have access to a remote Linux server; is it possible to run the computations remotely while running R locally?
Yes! This option allows INLA to use a remote server to do the computations. In order to use this feature, you need to do some setup, which differs between (various) Linux distributions, Mac and Windows. In short:
Install the inla-program on a remote server, for example foo.bar.org, using for example one of the precompiled binaries
Install your public ssh-key on foo.bar.org
In R, run `inla.remote()' to initialise the init-file `~/.inlarc'
Quit R, edit the init-file `~/.inlarc', and add your username and the address of the remote server
Before starting a new R-session, you need to sign out your key from some wallet/keychain or do some setup manually (see below). Then, starting a new R-session, you can use remote computing by setting the option "inla.call" to "remote", either globally
inla.setOption("inla.call", "remote")
or per inla-call
inla(...., inla.call="remote")
You can also submit a job on the remote server, so you do not need to sit and wait for it to finish, but you can collect the results later. Basically, you do
r = inla(..., inla.call = "submit")
which will start the job on the server. You can start many jobs, and list them using
inla.qstat()
and you can fetch the results (for the job above) using
r = inla.qget(r)
You can also delete jobs and fetch the jobs from another machine; see ?inla.q for further details.
HOWTO set up ssh-keys: For the inexperienced user this is somewhat tricky; sorry about that. Any Google search on "How to setup ssh-keys" will reveal that more than one person has found this a bit tricky. On some Linux distributions there are now tools which do this automatically (in Ubuntu at least), but a manual setup is not hard, as ssh-agent is used by default when you log into your desktop in Linux. (Make sure you have installed the `ssh' packages; do a search and install all of them if you're not sure what the minimal requirement is.) For MacOSX, the ease depends on your version of the OS: for MacOSX versions prior to 10.6 this involves installing the program SSHKeychain (but make sure to consult this page to get some hints on how to activate the ssh-agent at login for older MacOSX's, or alternatively you may use the same approach as for Windows using a fixed socket). For the newer MacOSX version 10.6, this feature is built in.
Here is how to set up everything manually. If you have already generated your key, or use a separate key for various machines, please make appropriate changes. I assume the remote Linux/MacOSX host which shall do the computations is called `mc.math.ntnu.no' and the remote username is `hrue'; change these accordingly.
LINUX/MacOSX:
First time only (all the following commands are run in a terminal window):
Linux: Open ``seahorse'' (In Fedora, this is in the menu `Accessories->Passwords and Encryption keys'). Open `New' and `SSH Key' and follow the instructions...
Do this manually:
Generate your ssh-key by running the command `ssh-keygen'. Use the defaults and enter your password twice (or leave the password empty for a password-free approach).
You now need to copy the public part of the key to the remote host using the script `ssh-copy-id'. Try to run `ssh-copy-id'; if it fails, start R and run the command `inla.ssh.copy.id()', which gives the path to a copy of that script in the INLA-package which you have to use instead. Then do `ssh-copy-id hrue@mc.math.ntnu.no'; you'll be asked for your login password on the remote machine. You may have to add the option `-i ~/.ssh/id_rsa.pub', i.e. `ssh-copy-id -i ~/.ssh/id_rsa.pub hrue@mc.math.ntnu.no'.
MacOSX 10.6: Log out and in again. If you have protected your key with a password, open a terminal and run `ssh-add'; a window pops up, and select the option to save the password to the keychain. At the next login the key is already in the keychain and you don't have to do anything.
Every time you start your desktop:
Modern Linux distros have keychains activated, so this is probably already OK. The same holds for MacOSX 10.6.
Older Linux/MacOSX use some kind of wallet/keychain/SSHKeychain, and this step depends on your setup. The easiest is to contact your local expert, as it should be easy...
Windows:
First time only:
Install CYGWIN. It is easiest to use the default location C:/cygwin. Make sure to also install the `ssh' packages; just search for them and add them all.
You should now have a Cygwin-icon on the Desktop, or find the `Cygwin -> Cygwin Bash shell' entry under 'All Programs'. Start it, and a terminal will show up. In that terminal, run `ssh-keygen'; use the defaults and enter your password twice.
Start R and run the command `inla.ssh.copy.id()'. Take that path to the ssh-copy-id script and run it, adding the username and address for the remote host (to copy the public part of your key), like
/cygdrive/c/PROGRA~1/R/R-29~1.1/library/INLA/bin/remote/ssh-copy-id hrue@inla.math.ntnu.no
Every time you start your desktop:
Start the Cygwin shell using the Cygwin-icon on your desktop. Then run the command
ssh-agent -a/tmp/ssh-auth-sock-hrue bash
replacing `hrue' with your username. Then run the command `ssh-add' and give your password.
Your R-session should now be able to connect to the remote server.
If you're not using default choices...: (replace `hrue' with your username)
If you do not install Cygwin at `C:/cygwin', then set the path using `inla.setOption("cygwin", NEWPATH)'
If your $HOME in Cygwin is not `/home/hrue' (check by doing `echo $HOME' in the Cygwin shell), set the home directory using `inla.setOption("cygwin.home", NEWHOME)'
If your ssh-agent bind-address is not `/tmp/ssh-auth-sock-hrue', then set it using `inla.setOption("ssh.auth.sock", NEWSSHAUTHSOCK)'.
12. I want to compare the results from inla() with those obtained with MCMC; is there any way to do this using inla()?
Even though inla() uses integrated nested Laplace approximations to compute the marginals, it also has a built-in MCMC sampler. The sampler is the 'one-block MCMC sampler' described in chapter 4 of the book by Rue and Held (2005). There is only one type of sampler, and if the model is either too large or too far away from a conditional Gaussian one, the acceptance rate can be too low to get reliable results out. In this case you may want to try to reduce the scaling of the proposal (option -S <scale>), but this does not always help, and you may have to find other ways to sample from your model.
In the demo-directory of INLA, there is an example how to compare the inla-results for the Tokyo-example using the builtin sampler, and you can run it using
demo("Tokyo-compare")
and the R-code is found as the output of the following command
system.file("demo/Tokyo-compare.R", package="INLA")
When running the example, you'll enter a menu displaying
Chose directory to view
[1] hyperparameter-random.effect00000001-parameter
[2] hyperparameter-random.effect00000001-parameter-user-scale
[3] random.effect00000001
(<0: histogram, >0: density: q:quit) :
These are the (sets of) nodes that are available; entering for example `1' displays the density estimates from MCMC and inla(), and `-1' displays the density estimate from inla() together with the histogram of the MCMC trace.
13. I want to fit survival models using INLA, how do I do this?
The easiest way is first to check the given examples, or follow these steps:
Use "inla.surv" function to get a data frame appropriate for your model. A description of it is found either as ?inla.surv in R or in this description.
Define "formula", a symbolic description of the model to be fit.
Finally, call "inla()" function, specifying the likelihood family and additional parameters.
For more survival examples, see the appropriate reports in the paper-section.
14. I have a complicated ``state-space model'', can this be fitted using INLA?
Good question! Although there is no `state-space' model in the list of latent models, this does not mean that you cannot define your own latent model in state-space form. It will be more complicated, though, at least until some of these models are properly implemented and included in INLA. An example should be sufficient; from it you can figure out the rest (if you're able to do it)...
## This is an example how to implement the following state-space model
## without using the builtin model for {x_t}
##
## Observational equation:
## y_t = x_t + eps_t, t=1..n
##
## State equation:
## x_t = 2*x_{t-1} - x_{t-2} + v_t, t=3...n
##
## The 'trick' is to rewrite this as
## y_t = x_t + eps_t, t=1..n
## 0 = x_t - 2*x_{t-1} + x_{t-2} + v_t, t=3...n
## and then we can implement this using an augmented model with two
## different likelihoods. The first n datapoints are Gaussian with
## unknown precision, whereas the last `m' datapoints, which are faked
## to be 0, are observed with a high and fixed precision. Note that for the
## model for x_t, we just use model="iid" with a fixed low precision.
n = 100
m = n-2
y = sin((1:n)*0.2) + rnorm(n, sd=0.1)
formula = Y ~ f(i, model="iid", initial=-10, fixed=T) +
f(j, w, copy="i") +
f(k, copy="i") +
f(l, model ="iid") -1
Y = matrix(NA, n+m, 2)
Y[1:n, 1] = y
Y[1:m + n, 2] = 0
i = c(1:n, 3:n) # x_t
j = c(rep(NA,n), 3:n -1) # x_t-1
w = c(rep(NA,n), rep(-2,m)) # weights for j
k = c(rep(NA,n), 3:n -2) # x_t-2
l = c(rep(NA,n), 1:m) # v_t
r = inla(formula, data = data.frame(i,j,w,k,l,Y),
family = rep("gaussian", 2),
control.family = list(list(), list(initial=10, fixed=T)))
15. I have multinomial data, but I cannot find the multinomial likelihood; isn't it available?
Multinomial data implies that one datapoint depends on the latent field through a non-linear combination of some of its elements, and this situation is not yet supported in INLA. The classical way around it can, however, be used: it is possible to rewrite multinomial data as several Poisson data, using a data-indexed intercept. A classical reference is found here, and a Bayesian discussion is found in section 9.7 of the WinBugs manual. Here is the Alligator data example of multinomial analysis taken from the WinBugs manual vol. I. Here is another, somewhat easier, example using simulated data:
## A simple example of the multinomial --> poisson trick
##
## I know: this is not the way to write efficient code in R, but it
## makes it much more clear of what is going on.
n = 2000
m = 3
size = 5
sum.to.zero = TRUE
y = c()
idx = c()
jdx = c()
x = c()
## regression coefs
beta = rnorm(m)
## sum to zero?
if (sum.to.zero)
beta = beta - mean(beta)
for(i in 1:n) {
xx = rnorm(m, sd = 1)
eta = beta * xx
x = c(x, xx)
## intercept
idx = c(idx, rep(i, m))
## beta
jdx = c(jdx, 1:m)
p = exp(eta)
prob = p/sum(p)
yy = rmultinom(1, size = size, prob = prob)
y = c(y, yy)
}
## now we want to predict an outcome. ignore the intercept
idx = c(idx, rep(NA, m))
jdx = c(jdx, 1:m)
xx = rnorm(m, sd = 1)
x = c(x, xx)
eta.pred = beta * xx
prob.pred = exp(eta.pred)/sum(exp(eta.pred))
y = c(y, rep(NA, m))
formula = y ~ -1 +
## one intercept pr multinom observation
f(idx, model="iid",
hyper = list(
prec = list(
initial = log(0.000001),
fixed = TRUE))) +
## the covariates: add them as the 'weights' in the second
## argument
f(jdx, x, model="iid",
## beta's sum to zero?
constr = sum.to.zero,
hyper = list(
prec = list(
initial = log(0.001),
fixed = TRUE)))
r = inla(formula,
family = "poisson",
data = data.frame(y, idx, jdx, x),
## need this for the prediction
control.compute = list(config=TRUE))
print(cbind(estimate = r$summary.random$jdx$mean, truth = beta))
## do the predictons. The linear predictors we want are index n*m+1:m
## (yes, this is somewhat tricky...)
nsample = 100
xx = inla.posterior.sample(nsample, r)
## (or use target = n*m+1:m)
target = c("Predictor.6001", "Predictor.6002", "Predictor.6003")
prob = matrix(NA, nsample, 3)
for(i in 1:nsample) {
eta = xx[[i]]$latent[target, 1]
prob[i, ] = exp(eta) / sum(exp(eta))
}
prob.est = apply(prob,2,mean)
print(rbind(predicted = prob.est,
truth = prob.pred,
abs.error = abs(prob.est-prob.pred)))
16. I have some linear combinations of the nodes in the latent field that I want to compute the posterior marginal of, is that possible?
Yes! These are called 'linear combinations'. There are handy functions, 'inla.make.lincomb()' and 'inla.make.lincombs()', to define one or many such linear combinations. Single linear combinations made using 'inla.make.lincomb()' can easily be joined into many. Their use is easiest explained with a rather long example...
Control variables for linear combinations are defined in ?control.lincomb.
Here is the example that explains these features.
## A simple model
n = 100
a = rnorm(n)
b = rnorm(n)
idx = 1:n
y = rnorm(n) + a + b
formula = y ~ 1 + a + b + f(idx, model="iid")
## assume we want to compute the posterior for
##
## 2 * beta_a + 3 * beta_b + idx[1] - idx[2]
##
## which we specify as follows (and giving it a unique name)
lc1 = inla.make.lincomb(a=2, b=3, idx = c(1,-1,rep(NA,n-2)))
names(lc1) = "lc1"
## strictly speaking, it is sufficient to use `idx = c(1,-1)', as the
## remaining idx's are not used in any case.
r = inla(formula, data = data.frame(a,b,y),
##
## add the linear combinations here
lincomb = lc1,
##
## force noise variance to be essentially zero
##
control.family = list(initial=10, fixed=TRUE),
## (this is the default option)
control.inla = list(lincomb.derived.only=TRUE))
## to verify the result, we can compare the means, but the variance and
## marginal cannot be computed from the simpler marginals alone.
lc1.1 = 2 * r$summary.fixed["a", "mean"] + 3 * r$summary.fixed["b", "mean"] +
r$summary.random$idx$mean[1] - r$summary.random$idx$mean[2]
lc1.2= r$summary.lincomb.derived$mean
print(c(lc1.1 = lc1.1, lc1.2 = lc1.2))
## the marginals are available as `r$marginals.lincomb$...'.
## we can derive the lincomb above more accurately at a higher cost,
## by adding a new node to the graph, using derived.only=FALSE. Thus
## the lincomb can be derived using the simplified or full laplace
## approximation. For the Gaussian case here, they are all the same...
r = inla(formula, data = data.frame(a,b,y),
lincomb = lc1,
control.family = list(initial=10, fixed=TRUE),
control.inla = list(lincomb.derived.only=FALSE))
lc1.3= r$summary.lincomb$mean
print(c(lc1.1 = lc1.1, lc1.2 = lc1.2, lc1.3 = lc1.3))
## let us compute the posterior for
##
## linear.predictor[2] + intercept
##
## which we specify as follows (and giving it a unique name).
##
## NOTE: The linear predictor has a predefined name = "Predictor", and
## the intercept is named "(Intercept)" (as lm() does, don't blame
## me!).
lc2 = inla.make.lincomb(Predictor = c(NA,1, rep(NA,n-2)), "(Intercept)" = 1)
names(lc2) = "lc2"
## same here, `Predictor = c(NA,1)' is sufficient. We need to specify
## them all the way until the last !is.na().
## to use more than one linear combination, we can specify more than
## one using the c() operator
all.lc = c(lc1,lc2)
r = inla(formula, data = data.frame(a,b,y),
lincomb = all.lc,
control.predictor = list(compute=TRUE),
control.family = list(initial=10, fixed=TRUE))
lc2.1 = r$summary.linear.predictor$mean[2] + r$summary.fixed["(Intercept)", "mean"]
lc2.2 = r$summary.lincomb.derived$mean
print(c(lc2.1, lc2.2))
## There is an another function which is handy for specifying many
## linear combinations at once, that is inla.make.lincombs() [note the
## plural 's']. Here each 'row' define one linear combination
##
## let wa and wb be vectors, and we want to compute the marginals for
##
## beta_a * wa[i] + beta_b * wb[i], for i=1..m. this is done
## conveniently as follows
m = 10
wa = runif(m)
wb = runif(m)
lc.many = inla.make.lincombs(a = wa, b=wb)
## we can give them names as well, but there are also default names, like
print(names(lc.many))
## if we want to join in the previous ones, as well, we can add them
## using c()
r = inla(formula, data = data.frame(a,b,y),
lincomb = c(lc.many, all.lc),
control.family = list(initial=10, fixed=TRUE))
print(r$summary.lincomb.derived)
## if 'wa' and 'wb' are defined in a matrix, like
A = cbind(a=wa, b=wb)
## then we can define the same lincombs using inla.uncbind(),
lc.many = inla.make.lincombs( inla.uncbind(A) )
## we can join this trick with "(Intercept) = wi" using (do not ask
## why it is like this: read the code instead. If someone wants to fix
## it, please do!)
wi = runif(m)
lc.most = inla.make.lincombs( c( inla.uncbind(A), list( "(Intercept)" = wi)) )
## terms like 'idx' above, can be added as idx = IDX into
## inla.make.lincombs(), where IDX is a matrix. Again, each column of
## the arguments define one linear combination.
There is a further option available for the derived linear combinations: the option to also compute the posterior correlation matrix between all the linear combinations. To activate this option, use
control.inla = list(lincomb.derived.correlation.matrix = TRUE)
and you will find the resulting posterior correlation matrix as
result$misc$lincomb.derived.correlation.matrix
Here is a small example where we compute the correlation matrix for the predicted values of a hidden AR(1) model with an intercept.
n = 100
nPred = 10
phi = 0.9
x = arima.sim(n, model = list(ar=phi)) * sqrt(1-phi^2)
y = 1 + x + rnorm(n, sd=0.1)
time = 1:(n + nPred)
Y = c(y, rep(NA, nPred))
formula = Y ~ 1 + f(time, model="ar1")
## make linear combinations which are the nPred linear predictors
B = matrix(NA, nPred, n+nPred)
for(i in 1:nPred) {
B[i, n+i] = 1
}
lcs = inla.make.lincombs(Predictor = B)
r = inla(formula, data = data.frame(Y, time),
control.predictor = list(compute=TRUE),
lincomb = lcs,
control.inla = list(lincomb.derived.correlation.matrix=TRUE))
print(r$misc$lincomb.derived.correlation.matrix)
Here are some of the details. The default option is to compute the marginal for the lincomb, a^Tx say, where 'a' is the vector of weights and 'x' is the latent field, as follows. Conditionally on the hyperparameters, we assume the marginal is Gaussian with mean a^TE(x), where E(x) is the mean for x computed with one of the approximations (gaussian, simplified laplace, laplace), and variance a^T Q^{-1} a, where Q is the precision matrix of the Gaussian approximation. Corrections are needed, of course, if there are linear constraints. So, unconditionally on the hyperparameters, the marginal for a^Tx is a mixture of Gaussians. Note that doing it this way, we achieve that the lincombs do not affect the graph of the latent field, and the marginals for the lincombs can be computed in parallel. All this means that you can have thousands of (sparse) linear combinations...
17. I tried one of the examples on this site, and I get this weird error....
The documentation is kept up to date (at least we try) with the TEST version of the software. The version you get when installing the default package will likely be an older version (but likely more stable). To install the TEST version, which is often updated several times a week, do
inla.upgrade(testing=TRUE)
You may revert to the older version by doing inla.upgrade().
18. I want to compute/approximate the joint density of some components of the latent field, how can I do that?
The easiest approach is to define these components as linear combinations and use the control.inla = list(lincomb.derived.correlation.matrix = TRUE) option; see the related FAQ question about linear combinations. In this way you get the posterior correlation matrix, which you can use to build a Gaussian copula approximation to the joint density.
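A minimal sketch, using a toy model where the two components of interest are the first two elements of an 'iid' effect:
n = 10
y = rnorm(n)
idx = 1:n
lc1 = inla.make.lincomb(idx = c(1, rep(NA, n-1)))
names(lc1) = "x1"
lc2 = inla.make.lincomb(idx = c(NA, 1, rep(NA, n-2)))
names(lc2) = "x2"
r = inla(y ~ 1 + f(idx, model = "iid"),
         data = data.frame(y, idx),
         lincomb = c(lc1, lc2),
         control.inla = list(lincomb.derived.correlation.matrix = TRUE))
## the correlation matrix for the Gaussian copula, and the two marginals
r$misc$lincomb.derived.correlation.matrix
r$marginals.lincomb.derived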
19. INLA seems to work great for near all cases, but are there cases where INLA is known to have problems?
The methodology needs the full conditional density of the latent field to be "near" Gaussian. This is usually achieved by either replications or smoothing/"borrowing strength". A simple example which does not have this, is the following:
n = 100
u = rnorm(n)
eta = 1 + u
p = exp(eta)/(1+exp(eta))
y = rbinom(n, size=1, prob = p)
idx = 1:n
result = inla( y ~ 1 + f(idx, model="iid"), data =data.frame(y,idx), family = "binomial", Ntrials = 1)
For each binary observation there is an iid "random effect" `u', and there is no smoothing/"borrowing strength" (apart from the weak intercept). If you plot the log-likelihood for eta when y=1, say, it is an increasing function of eta, so the likelihood itself favours eta = infinity. With an unknown precision for `u', we run into problems; INLA has a tendency to estimate too high a precision for `u'.
However, it must be noted that the model is almost singular, and you'll have strong prior sensitivity in the (exact) results as well. There is a similar discussion here for the Salamander data example.
April 2015: A new correction term is now available which improves these cases; see this post for more details.
20. I can plot the results using 'plot(result)' but how can I save the plots as pdf or postscript files one by one?
There is an option to plot() (specific to INLA),
plot(result, single = TRUE)
which will not group the plots. Using this option, you can now just 'print' the plot in a specific window, using
dev.print(pdf, file="plot.pdf")
for example.
21. I have a shapefile. How can I fit a spatial model with R-INLA?
You will need to import the shapefile into R, create the adjacency matrix and then fit your model. The following
code shows an example based on the North Carolina SIDS data included in the spdep package:
#Load libraries
library(spdep)
library(INLA)
#Load data from a shapefile included in the spdep package
nc.sids <- readShapePoly(system.file("etc/shapes/sids.shp", package="spdep")[1])
#Create adjacency matrix
nc.nb <- poly2nb(nc.sids)
#Compute expected number of cases
nc.sids$EXP<-nc.sids$BIR74*sum(nc.sids$SID74)/sum(nc.sids$BIR74)
#Compute proportion of non-white births
nc.sids$NWPROP<-nc.sids$NWBIR74/nc.sids$BIR74
#Convert the adjacency matrix into a file in the INLA format
nb2INLA("nc.adj", nc.nb)
#Create area IDs to match the values in nc.adj
nc.sids$ID<-1:100
m1<-inla(SID74~NWPROP+f(nc.sids$ID, model="besag", graph="nc.adj"),
family="poisson", E=nc.sids$EXP, data=as.data.frame(nc.sids),
control.predictor=list(compute=TRUE))
#Alternatively, a sparse matrix can be used
adjmat<-as(nb2mat(nc.nb, style="B"), "dgTMatrix") #Binary adjacency matrix
m2<-inla(SID74~NWPROP+f(nc.sids$ID, model="besag", graph=adjmat),
family="poisson", E=nc.sids$EXP, data=as.data.frame(nc.sids),
control.predictor=list(compute=TRUE))
#Get relative risk estimates
nc.sids$RR1<-m1$summary.fitted.values[,1]
nc.sids$RR2<-m2$summary.fitted.values[,1]
#Display relative risk estimates to show that both examples fit the same model
spplot(nc.sids, c("RR1", "RR2"))
22. How are the Deviance Information Criterion (DIC) and the Watanabe-Akaike Information Criterion (WAIC) computed?
The DIC criterion depends on the deviance, -2*log(likelihood): we need to compute the posterior expectation of the deviance, and evaluate the deviance at a posterior point estimate. The calculations are straightforward, but instead of evaluating the deviance at the posterior mean of all parameters, we evaluate it at the posterior mean of the latent field (x) and the posterior mode of the hyperparameters (theta). The reason is that the posterior marginals for the hyperparameters, like the precision, say, can be highly skewed, so that the posterior expectation is not at all a good representation of location. This 'almost' corresponds to a 'good' reparameterisation of the hyperparameters, like from precision to log(precision), as the posterior marginal for log(precision) is much more symmetric.
The WAIC criterion is defined and discussed in this tech-report, and we follow their recommendation to use p.eff as in their equation (11).
In order to compute these criteria, use
r=inla(..., control.compute = list(dic = TRUE, waic = TRUE))
and then the results are available as r$dic and r$waic.
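A minimal sketch, using simulated Poisson data:
n = 100
x = rnorm(n)
y = rpois(n, lambda = exp(1 + 0.5*x))
r = inla(y ~ 1 + x, family = "poisson", data = data.frame(y, x),
         control.compute = list(dic = TRUE, waic = TRUE))
c(dic = r$dic$dic, waic = r$waic$waic)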
23. Can I have the linear predictor from one model as a covariate in a different model?
Yes, this is possible. Essentially, you set the linear predictor of the first model equal to 'u', and then you can copy 'u' and use the scaling to get the regression coefficient.
A simple example will illustrate the idea:
## simple example
n = 1000
x1 = rnorm(n)
eta1 = 1 + x1
x2 = rnorm(n)
eta2 = 2 + 2*eta1 + 2*x2
y1 = rnorm(n, mean=eta1, sd = 0.01)
y2 = rnorm(n, mean=eta2, sd = 0.01)
## the trick is to create a vector 'u' (iid) which is equal to eta1; then we can copy 'u' to
## create beta*u, i.e. beta*eta1. we do this by using 0 = eta1 - u + tiny.noise
formula = Y ~ -1 + intercept1 + X1 + intercept2 +
f(u, w, model="iid", hyper = list(prec = list(initial = -6, fixed=TRUE))) +
f(b.eta2, copy="u", hyper = list(beta = list(fixed = FALSE))) + X2
Y = matrix(NA, 3*n, 3)
## part 1: y1
intercept1 = rep(1, n)
X1 = x1
intercept2 = rep(NA, n)
u = rep(NA, n)
w = rep(NA, n)
b.eta2 = rep(NA, n)
X2 = rep(NA, n)
Y[1:n, 1] = y1
## part 2: 0 = eta1 - u + tiny.noise
intercept1 = c(intercept1, intercept1)
X1 = c(X1, x1)
intercept2 = c(intercept2, rep(NA, n))
u = c(u, 1:n)
w = c(w, rep(-1, n))
b.eta2 = c(b.eta2, rep(NA, n))
X2 = c(X2, rep(NA, n))
Y[n + 1:n, 2] = 0
## part 3: y2
intercept1 = c(intercept1, rep(NA, n))
X1 = c(X1, rep(NA, n))
intercept2 = c(intercept2, rep(1, n))
u = c(u, rep(NA, n))
w = c(w, rep(NA, n))
b.eta2 = c(b.eta2, 1:n)
X2 = c(X2, x2)
Y[2*n + 1:n, 3] = y2
r = inla(formula,
data = list(Y=Y, intercept1=intercept1, X1=X1,
intercept2=intercept2, u=u, w=w, b.eta2=b.eta2, X2=X2),
family = c("gaussian", "gaussian", "gaussian"),
control.family = list(
list(),
list(hyper = list(prec = list(initial = 10, fixed=TRUE))),
list()))