Licensed under the Creative Commons Attribution-NonCommercial license, http://creativecommons.org/licenses/by-nc/3.0/. Please share and remix noncommercially, mentioning its origin.
To get started, we must install pomp and pompExamples. Run the following to do so.
install.packages("pomp",repos="http://R-Forge.R-Project.org")
install.packages("pompExamples",repos="http://kinglab.eeb.lsa.umich.edu/R")
Partially observed Markov process (POMP) models are also known as stochastic dynamical systems, (non-linear, non-Gaussian) state-space models, and hidden Markov models.
Schematic of the structure of a POMP, showing causal relations and the direction of information flow. The key perspective to keep in mind is that the model is to be viewed as the process that generated the data.
Another POMP model schematic, showing the dependence among model variables.
The Ricker map describes the deterministic dynamics of a simple population:
$$N_{t+1} = r\,N_t\,e^{-N_t}.$$
Here, $N_t$ is the population density at time $t$ and $r$ is a parameter governing the population's growth rate. We can model process noise in this system by making the growth rate stochastic, for example by assuming multiplicative log-normal noise:
$$N_{t+1} = r\,N_t\,e^{-N_t+\varepsilon_t},\qquad \varepsilon_t\sim\mathrm{Normal}(0,\sigma^2).$$
Let's suppose that the Ricker model is our model for the dynamics of a real population. However, we cannot know the exact population density at any time; we can only estimate it through sampling. Let's model measurement error by assuming the measurements, $y_t$, are Poisson with mean $\phi\,N_t$:
$$y_t\sim\mathrm{Poisson}(\phi\,N_t).$$
In these equations, $N_t$ is the true population density at time $t$, $y_t$ is the measured count, and $\phi$ is a scaling parameter relating sampling effort to the true density.
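Before turning to pomp, it may help to see the model written out as plain R code. The following is a minimal sketch (not part of pomp) that simulates the stochastic Ricker process and the Poisson measurements directly; the parameter values are arbitrary illustrative choices.
set.seed(42)
r <- 12; sigma <- 0.3; phi <- 10    # illustrative parameter values
n <- 50
N <- numeric(n); y <- numeric(n)
N.prev <- 1                         # initial density
for (t in 1:n) {
  e <- rnorm(1,mean=0,sd=sigma)     # process noise
  N[t] <- r*N.prev*exp(-N.prev+e)   # stochastic Ricker step
  y[t] <- rpois(1,lambda=phi*N[t])  # Poisson measurement
  N.prev <- N[t]
}
plot(y,type='o',xlab="time",ylab="measured count")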
The R package pomp provides facilities for modeling POMPs, a toolbox of statistical inference methods for analyzing data using POMPs, and a development platform for implementing new POMP inference methods. Let's see how the model we've just described can be implemented in pomp. The following file contains some daily measurements. Load and plot the data:
loc <- url("http://kinglab.eeb.lsa.umich.edu/SIMM/data/sample_data.csv")
dat <- read.csv(loc)
head(dat)
## day y
## 1 1 2
## 2 2 87
## 3 3 0
## 4 4 12
## 5 5 174
## 6 6 0
plot(y~day,data=dat,type='o')
The basic data structure provided by pomp is the object of class pomp or, for short, the pomp object. It is a container that holds real or simulated data and a POMP model, possibly together with other information, such as model parameters, that may be needed to do things with the model and data. We'll construct a pomp object to hold the data and the Ricker model; we'll do this in stages.
The call to construct a pomp object is, naturally enough, pomp. Documentation on this function can be had by doing ?pomp. Do class?pomp to get documentation on the pomp class. Learn about the various things you can do once you have a pomp object by doing methods?pomp and following the links therein. Read an overview of the package as a whole, with links to its main features, by doing package?pomp. A complete index of the functions in pomp is returned by the command library(help=pomp). Finally, the pomp project's home page gives access to the complete source code, manuals, mailing lists, etc.
require(pomp)
po <- pomp(dat,times="day",t0=0)
The times argument specifies that the column labelled "day" gives the measurement times; t0 is the "zero-time", the time at which the state process will be initialized. One can examine the pomp object in various ways, and convert it into a data frame:
plot(po)
print(po)
show(po)
time(po)
obs(po)
as.data.frame(po)
We can implement the deterministic skeleton of the Ricker model by writing a function that maps $N_t$ to $N_{t+1}$ according to the Ricker map. For pomp to be able to use it, this has to be a function of a certain form:
ricker.skel <- function (x, t, params, ...) {
  r <- params["r"]
  N <- x["N"]
  unname(r*N*exp(-N))   # the Ricker map: N[t+1] = r N[t] exp(-N[t])
}
[Note the need to remove the names in the last line (unname). If we don't do this, we have problems due to R's efforts to guess the correct name for the result.] We can then add this to the pomp object:
po <- pomp(po,skeleton=ricker.skel,skeleton.type='map')
The skeleton.type argument tells pomp that the skeleton is a discrete-time dynamical system (a map) rather than a continuous-time system (a vectorfield).
With just the skeleton defined, we are in a position to compute the trajectories of the deterministic skeleton at any point in parameter space. For example,
traj <- trajectory(po,params=c(N.0=1,r=12))
class(traj)
## [1] "array"
dim(traj)
## [1] 1 1 50
Sometimes it’s useful to obtain the results as a data-frame:
traj <- trajectory(po,params=c(N.0=1,r=12),as.data.frame=TRUE)
head(traj)
## N time traj
## 1 4.4145533 1 1
## 2 0.6409909 2 1
## 3 4.0518588 3 1
## 4 0.8455429 4 1
## 5 4.3561445 5 1
## 6 0.6705544 6 1
plot(N~time,data=traj,type='o')
A note on terminology: in pomp, to avoid confusion, we use the term "trajectory" exclusively to refer to trajectories of a deterministic process. Thus, the trajectory command iterates or integrates the deterministic skeleton forward in time, returning the unique trajectory determined by the specified parameters. When we want to speak about sample paths of a stochastic process, we use the term simulation. Accordingly, the simulate command always returns individual sample paths from the POMP. In particular, we avoid speaking of "simulating a set of differential equations", preferring instead to speak of "integrating" the equations or "computing trajectories".
We can put the stochastic Ricker model into the pomp object by writing a function that simulates one realization of the stochastic process over a single time step, starting from an arbitrary time and state:
ricker.sim <- function (x, t, params, ...) {
  e <- rnorm(n=1,mean=0,sd=params["sigma"])                  # process noise
  c(N=unname(exp(log(params["r"])+log(x["N"])-x["N"]+e)))    # r N exp(-N + e)
}
po <- pomp(po,rprocess=discrete.time.sim(step.fun=ricker.sim,delta.t=1))
The discrete.time.sim function generates a function suitable for use as the rprocess argument to pomp. It assumes that step.fun is a function, like our ricker.sim, that makes a single random draw from the stochastic state process. The time step of the discrete-time process is delta.t (here, 1 day).
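As a quick sanity check (purely illustrative), we can call the step function by hand to take a single random step of the map:
set.seed(5)
ricker.sim(x=c(N=1),t=0,params=c(r=12,sigma=1))   # one random Ricker step from N = 1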
At this point, we have what we need to simulate the stochastic Ricker model.
sim <- simulate(po,params=c(N.0=1,r=12,sigma=1),states=TRUE)
class(sim)
## [1] "array"
dim(sim)
## [1] 1 1 50
sim <- simulate(po,params=c(N.0=1,r=12,sigma=1),states=TRUE,
as.data.frame=TRUE)
plot(N~time,data=sim,type='o')
lines(N~time,data=traj,type='l',col='red')
We complete the specification of the POMP by adding the measurement model. In accordance with the Poisson measurement model, $y_t\sim\mathrm{Poisson}(\phi\,N_t)$, we write
po <- pomp(po,measurement.model=y~pois(phi*N))
Note the syntax of the measurement.model argument: R must be able to create distribution functions by prepending d or r to the right-hand side of the formula. In this case, dpois and rpois are the distribution functions for the Poisson distribution.
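For comparison, here is roughly what this formula expands to if we write the measurement model out as explicit simulation and density functions. This is a sketch: the argument conventions follow the plain-function interface described in ?pomp and may differ slightly between pomp versions.
ricker.rmeas <- function (x, t, params, ...) {
  ## draw one observation given the current state
  c(y=rpois(n=1,lambda=unname(params["phi"]*x["N"])))
}
ricker.dmeas <- function (y, x, t, params, log, ...) {
  ## density of the observation given the current state
  dpois(x=y["y"],lambda=unname(params["phi"]*x["N"]),log=log)
}
## these could be supplied as the rmeasure and dmeasure arguments, e.g.
## po <- pomp(po,rmeasure=ricker.rmeas,dmeasure=ricker.dmeas)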
Now we can simulate the whole POMP. First, let’s add some parameters to the pomp object:
po <- pomp(po,params=c(N.0=1,r=12,sigma=1,phi=10))
sim <- simulate(po)
class(sim)
## [1] "pomp"
## attr(,"package")
## [1] "pomp"
plot(sim)
sim <- simulate(po,nsim=3,as.data.frame=T)
head(sim)
## time y N sim
## 1 1 92 9.08913451 1
## 2 2 0 0.02480737 1
## 3 3 15 1.27878180 1
## 4 4 67 5.52928081 1
## 5 5 1 0.20530943 1
## 6 6 11 1.02230493 1
It frequently happens that it helps to estimate parameters on a scale different from that on which they appear in the model. For example, the Ricker parameters $r$, $\sigma$, and $\phi$ must be positive; estimating their logarithms enforces this constraint automatically. pomp provides facilities to make it easier to work with parameter transformations. In particular, we can optionally specify two transformation functions when we create a pomp object. One will transform our parameters from the estimation scale to the natural scale; the other will invert that operation, transforming the parameters from the natural scale to the estimation scale. In the Ricker model example, the natural-scale parameters are all positive, so we can simply log-transform all of them:
pomp(
po,
parameter.transform=function(params,...) {
exp(params)
},
parameter.inv.transform=function(params,...) {
log(params)
}
) -> po
Note that the function given to the parameter.transform argument transforms parameters to the natural scale, while that given to the parameter.inv.transform argument transforms them from the natural scale. Operationally, the defining property of the natural scale is that it is the scale on which the elementary functions rprocess, rmeasure, dprocess, dmeasure, skeleton, and init.state expect to see the parameters.
The parameter transformations come into play in the coef and coef<- methods for getting and setting parameters. If we have a vector of parameters on the natural scale, we can set them as the parameters of po by doing
coef(po) <- c(r=44,phi=10,sigma=0.3,N.0=7)
and get the parameters by
theta <- coef(po)
On the other hand, given parameter values on the estimation scale, one can set the parameters by
coef(po,transform=TRUE) <- c(r=3.8,phi=2,sigma=-1,N.0=3)
and get them by
coef(po,transform=TRUE)
## r phi sigma N.0
## 3.8 2.0 -1.0 3.0
There is also a function, partrans, that gives you access to the parameter transformation functions inside the pomp object. Do ?partrans to read the documentation.
To be clear: the parameter-setting expression
coef(po) <- theta
assumes that theta is a named vector of parameters on the natural scale, while
coef(po,transform=TRUE) <- theta
assumes that theta is on the estimation scale.
One can also get or set parameters individually or in groups:
coef(po)
## r phi sigma N.0
## 44.7011845 7.3890561 0.3678794 20.0855369
coef(po,"sigma") <- 0.3
coef(po,"sigma",transform=TRUE)
## sigma
## -1.203973
coef(po,c("r","sigma"),transform=TRUE) <- c(2,0)
coef(po)
## r phi sigma N.0
## 7.389056 7.389056 1.000000 20.085537
Note that it's the user's responsibility to ensure that the parameter.transform and parameter.inv.transform functions are actually mutually inverse. A simple (but not foolproof) test of this is
theta <- coef(po)
## theta is on the natural scale
identical(
theta,
partrans(po,
partrans(po,theta,dir='inverse'),
dir='forward')
)
## [1] TRUE
theta <- coef(po,transform=TRUE)
## theta is on the estimation scale
identical(
theta,
partrans(po,
partrans(po,theta,dir='forward'),
dir='inverse')
)
## [1] TRUE
pomp comes with several examples built in. These can be loaded with the pompExample command. For example,
pompExample(ricker)
## newly created object(s):
## ricker
loads a pomp object just like the one we’ve just defined, except that its internals are implemented in C and so are substantially faster. Later on, we’ll learn how to write pomp objects using C.
Reparameterize the Ricker model so that the scaling of the population density is explicit, i.e., write the map as $N_{t+1}=r\,N_t\,e^{-N_t/K}$, where $K$ sets the units of $N$. Modify the pomp object we created above to reflect this reparameterization.
Modify the measurement model so that the observations are negative-binomially distributed rather than Poisson. [See ?NegBinomial for documentation on the negative binomial distribution.]
Create a pomp object encoding a partially observed Hassell-May host-parasitoid model.
Below is a diagram of the SIR model. The host population is divided into three classes according to infection status: S, susceptible hosts; I, infected (and infectious) hosts; R, recovered and immune hosts. Births result in new susceptibles, and all individuals have a common per-capita death rate, $\mu$. Let's see how to implement this model in pomp.
Earlier, we saw how to implement the deterministic skeleton of a compartmental model. Let’s revisit that example.
baseurl <- "http://kinglab.eeb.lsa.umich.edu/ICTPWID/SaoPaulo_2015/"
url <- paste0(baseurl,"data/boarding_school_influenza.csv")
bbs <- read.csv(url,comment.char="#",colClasses=c(date="Date"))
bbs <- subset(bbs,status=="confined",select=-status)
bbs$day <- as.numeric(bbs$date-min(bbs$date))
pomp(
bbs,
times="day",
t0=0,
skeleton.type="vectorfield",
skeleton=function(t,x,params,...){
N <- sum(x[c("S","I","R")])
foi <- params["beta"]*x["I"]/N
terms <- c(
x["S"]*foi,
x["I"]*params["gamma"]
)
terms <- unname(terms)
c(
S=-terms[1],
I=terms[1]-terms[2],
R=terms[2]
)
}
) -> flu.sir
With flu.sir as written, we are in a position to compute trajectories for any set of parameters. In fact, let's put some parameters into the pomp object now:
flu.sir <- pomp(flu.sir,
params=c(gamma=1/3,beta=3,S.0=1600,I.0=1,R.0=0,cases.0=0))
To fit the model to the data, however, we need to think about the process that actually generates the data. In other words, we need to model the measurement process. The data are new infections per day, but the SIR model as we've written it doesn't track new infections, only the numbers of S, I, and R individuals. We can keep track of the new infections by adding a new state variable, cases, that accumulates the number of recoveries:
pomp(
flu.sir,
skeleton=function(t,x,params,...){
N <- sum(x[c("S","I","R")])
foi <- params["beta"]*x["I"]/N
terms <- c(
x["S"]*foi,
x["I"]*params["gamma"]
)
terms <- unname(terms)
c(
S=-terms[1],
I=terms[1]-terms[2],
R=terms[2],
cases=terms[2] # d(cases)/dt
)
}
) -> flu.sir
Note we've only changed the skeleton slot of flu.sir; the rest of it remains the same. Compute the trajectories and plot the cases by doing:
x <- trajectory(flu.sir,as.data.frame=TRUE,times=seq(0,100))
plot(cases~time,data=x,type='o')
This is the cumulative count of cases. We can obtain the number of cases per day by telling pomp that cases is an accumulator variable, using the zeronames argument. This will result in cases being reset to zero immediately after each observation.
pomp(flu.sir,zeronames="cases") -> flu.sir
x <- trajectory(flu.sir)
plot(range(time(flu.sir)),
range(c(x["cases",,],obs(flu.sir))),
type='n',xlab="time",ylab="cases")
lines(time(flu.sir),obs(flu.sir),col='black',type='o')
lines(time(flu.sir),x["cases",,],col='red')
Now that we’ve got a plausible deterministic skeleton, let’s flesh out the model by specifying the stochastic state process. We’ll use an Euler-multinomial approximation. First, we’ll need a function that takes a single Euler step.
sir.proc.sim <- function (x, t, params, delta.t, ...) {
  N <- sum(x[c("S","I","R")])                   # total population size
  foi <- params["beta"]*x["I"]/N                # force of infection
  c(
    reulermultinom(n=1,size=x["S"],rate=foi,dt=delta.t),             # S -> I
    reulermultinom(n=1,size=x["I"],rate=params["gamma"],dt=delta.t)  # I -> R
  ) -> trans
  x[c("S","I","R","cases")]+c(
    -trans[1],
    trans[1]-trans[2],
    trans[2],
    trans[2]                                    # cases accumulates recoveries
  )
}
flu.sir <- pomp(flu.sir,rprocess=euler.sim(sir.proc.sim,delta.t=1/20))
Note that, in the above, the Euler step size has been set to 1/20 day.
It remains to think about the measurement model. To begin with, let’s assume that the measurement errors are binomially distributed.
pomp(
flu.sir,
measurement.model=reports~binom(size=cases,prob=rho),
params=c(rho=0.5,gamma=1/3,beta=3,S.0=1600,R.0=0,I.0=1,cases.0=0)
) -> flu.sir
plot(simulate(flu.sir))
Once we start to try to estimate the parameters in this model, we'll need to come to terms with the fact that there are positivity constraints on our parameters. As before, we can get around this by log-transforming them. The reporting probability, rho, is additionally constrained to lie between 0 and 1; a logit transformation maps it onto an unconstrained scale.
pomp(
flu.sir,
logvars=c("beta","gamma"),
logitvars=c("rho"),
parameter.inv.transform=function(params,logvars,logitvars,...){
params[logvars] <- log(params[logvars])
params[logitvars] <- qlogis(params[logitvars])
params
},
parameter.transform=function(params,logvars,logitvars,...){
params[logvars] <- exp(params[logvars])
params[logitvars] <- plogis(params[logitvars])
params
}
) -> flu.sir
Again, note that the parameter.transform argument takes us to the natural scale (the parameter scale on which the skeleton and rprocess functions were written) and parameter.inv.transform takes us from the natural scale to the estimation scale. You will not find anything about the arguments logvars and logitvars in the documentation of pomp. Instead, you will find mention of pomp's ... argument. "Unofficial" arguments given to pomp are not ignored: they are stored within the pomp object in such a way that they can be used by the functions you write. In the above example, the parameter transformation functions use logvars and logitvars.
Our initial conditions have a special constraint: the initial numbers of susceptible, infected, and recovered hosts must add up to the total population size. pomp allows us to choose an arbitrary parameterization of the initial conditions of the state process; here, we parameterize them as fractions that are normalized to sum to one and then scaled by the population size, pop. We accomplish this via the initializer argument to pomp:
pomp(
flu.sir,
initializer=function(params,t0,...) {
fracs <- params[c("S.0","I.0","R.0")]
x <- unname(fracs/sum(fracs)*params["pop"])
c(
S=round(x[1]),
I=round(x[2]),
R=round(x[3]),
cases=0
)
},
logvars=c("beta","gamma","S.0","I.0","R.0"),
params=c(rho=0.5,gamma=1/3,beta=3,S.0=1,I.0=0.0006,R.0=0,pop=1600)
) -> flu.sir
Other initial-condition parameterizations are possible, but this symmetric one seems to be particularly useful for compartmental models.
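As a quick, illustrative check of the initializer, one can evaluate it through init.state, the pomp method that invokes the initializer at a given parameter vector and zero-time (see ?init.state; the exact calling convention may vary between pomp versions):
init.state(flu.sir)   # should return rounded S, I, R counts and cases = 0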
Three examples implementing the SIR model are provided with pomp. gillespie.sir uses the Gillespie algorithm for simulations. euler.sir uses the Euler-multinomial algorithm with a step size of 1/52/20 yr. You can read about the euler.sir example in the "Introduction to pomp" vignette (vignette("intro_to_pomp")). The bbs example contains the data from the boarding school influenza outbreak together with an SIR model that is just a bit more complex than the one we constructed above; in particular, its process model includes extra-demographic stochasticity in the transmission rate, the intensity of which is controlled by the parameter beta.sd, so that setting beta.sd to 0 turns the extra-demographic stochasticity off. Load bbs and inspect its default parameters by doing
pompExample(bbs)
## newly created object(s):
## bbs
coef(bbs)
## gamma mu iota beta beta.sd
## 0.3333333 0.0000000 0.0000000 1.4000000 0.0000000
## pop rho sigma S.0 I.0
## 1400.0000000 0.9000000 3.6000000 0.9990000 0.0010000
## R.0
## 0.0000000
Let's return to considering a general state-space model. As before, let $X_t$ denote the unobserved (Markov) state process and $y_1,\dots,y_T$ the data. The likelihood is
$$\mathcal{L}(\theta)=f(y_{1:T};\theta)=\sum_{x_0}\cdots\sum_{x_T} f(x_0;\theta)\,\prod_{t=1}^{T} f(y_t\mid x_t;\theta)\,f(x_t\mid x_{t-1};\theta),$$
i.e., computation of the likelihood requires summing over all possible values of the unobserved process at each time point. This is very hard to do in general. Today, we will learn about Monte Carlo methods for evaluating this and other difficult integrals. An excellent technical reference on Monte Carlo techniques generally is Robert and Casella (2004).
Simulation refers to the generation of random variables. The general problem of simulation is: given a probability distribution $f$, how do we generate random variables distributed according to $f$?
The transformation (inverse-CDF) method works for discrete or continuous scalar random variables. Let $F$ be the cumulative distribution function of $X$ and let $U$ be uniformly distributed on $[0,1]$; then $X=F^{-1}(U)$ has distribution function $F$. The transformation method is very efficient in that we are guaranteed to obtain one valid draw of $X$ for every draw of $U$.
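A small sketch of the transformation method: simulate Exponential(rate = 2) draws from uniform random numbers via the inverse CDF, $F^{-1}(u)=-\log(1-u)/\mathrm{rate}$, and compare against R's built-in generator.
set.seed(1)
rate <- 2
u <- runif(10000)
x <- -log(1-u)/rate                 # inverse-CDF transform of the uniforms
c(mean=mean(x),expected=1/rate)     # sample mean should be close to 1/rate
qqplot(x,rexp(10000,rate=rate))     # compare against rexp
abline(a=0,b=1,col='red')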
Diagram of the rejection method. We propose points by drawing them uniformly from the area under the graph of the (scaled) proposal density; those that also fall under the graph of the target density are accepted.
First, let's suppose that we wish to simulate a random variable $X$ with probability density function $f$. There is another, seemingly trivial, fact that underlies the rejection method and that Robert and Casella (2004) refer to as the fundamental theorem of simulation: simulating $X\sim f$ is equivalent to simulating a point $(X,U)$ uniformly distributed over the region under the graph of $f$, i.e., over $\{(x,u):0<u<f(x)\}$. So if we can draw points uniformly from the area under the graph of $f$, their $x$-coordinates are draws from $f$. This suggests the following rejection method for simulating an arbitrary random variable: find a density $g$ from which we know how to simulate and a constant $M$ such that $f(x)\le M\,g(x)$ for all $x$; propose $X\sim g$ and, independently, $U\sim\mathrm{Uniform}(0,M\,g(X))$; accept $X$ if $U<f(X)$, and otherwise discard it and propose again.
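A small sketch of the rejection method: draw from the Beta(2,2) density $f(x)=6x(1-x)$ on $[0,1]$ using Uniform(0,1) proposals ($g\equiv 1$) and bound $M=1.5$.
set.seed(2)
f <- function (x) 6*x*(1-x)
M <- 1.5
n <- 10000
x <- numeric(0)
while (length(x) < n) {
  xp <- runif(n)                   # propose from g
  u <- runif(n,max=M)              # uniform on (0, M g(xp))
  x <- c(x,xp[u < f(xp)])          # accept proposals falling under f
}
x <- x[1:n]
hist(x,freq=FALSE,breaks=50)
curve(f,add=TRUE,col='red')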
Monte Carlo integration is a technique developed for evaluating integrals. First, let's look at the example of finding the area of a region, $A$, contained within some larger region, $B$, whose area we know (say, the unit square). If we throw darts uniformly at $B$, the expected fraction of darts landing in $A$ equals the ratio of the two areas, so
$$\mathrm{area}(A)\approx \mathrm{area}(B)\,\frac{1}{n}\sum_{i=1}^{n}\mathbb{1}_{A}(X_i),$$
where the $X_i$ are the dart positions, uniformly distributed on $B$. In the above, the indicator function $\mathbb{1}_A(x)$ equals 1 when $x\in A$ and 0 otherwise.
We've just shown that we can evaluate the integral of the indicator function $\mathbb{1}_A$ by averaging its values at randomly chosen points; the same idea applies to any integrable function $h$. Here, we're relying on the assumption that the darts fall uniformly on $B$. Moreover, we can estimate the error in this approximation, because the empirical variance of the $\mathbb{1}_A(X_i)$ (or, in general, of the $h(X_i)$) estimates the variance of the Monte Carlo estimator: the standard error of the estimate is $\sqrt{\mathrm{Var}/n}$, where $n$ is the number of darts.
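A small sketch of this: estimate the area of the unit quarter-circle (true value $\pi/4$) by throwing darts at the unit square, with a standard error based on the empirical variance.
set.seed(3)
n <- 100000
x <- runif(n); y <- runif(n)         # darts thrown uniformly on the unit square
inside <- as.numeric(x^2+y^2 < 1)    # indicator of landing in the quarter-circle
est <- mean(inside)                  # Monte Carlo estimate of the area
se <- sd(inside)/sqrt(n)             # standard error of the estimate
c(estimate=est,std.err=se,truth=pi/4)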
Sometimes it is difficult to sample directly from the distribution of $X$. In such cases, we can often draw from some other, convenient distribution and appropriately weight the samples; this is importance sampling. Specifically, suppose we wish to compute $\mathbb{E}[h(X)]=\int h(x)\,f(x)\,dx$, where $f$ is the density of $X$. Since
$$\int h(x)\,f(x)\,dx=\int h(x)\,\frac{f(x)}{g(x)}\,g(x)\,dx,$$
we can draw samples $X_i$ from an importance distribution $g$ and estimate $\mathbb{E}[h(X)]$ by the weighted average $\frac{1}{n}\sum_i h(X_i)\,w(X_i)$, with weights $w(x)=f(x)/g(x)$. Obtaining accurate estimates requires some thought to the choice of the importance distribution $g$: it must place mass wherever $h\,f$ is appreciable, and the weights should not be too variable.
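A small sketch of importance sampling, unrelated to pomp: estimate the tail probability $P(X>4)$ for a standard normal by sampling from a proposal shifted into the tail, $\mathrm{Normal}(4,1)$.
set.seed(4)
n <- 100000
x <- rnorm(n,mean=4)                        # draws from the importance distribution g
w <- dnorm(x)/dnorm(x,mean=4)               # importance weights f(x)/g(x)
h <- as.numeric(x > 4)                      # the function whose expectation we want
est <- mean(h*w)
se <- sd(h*w)/sqrt(n)
c(estimate=est,std.err=se,truth=pnorm(4,lower.tail=FALSE))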
We can imagine using Monte Carlo integration to compute the likelihood of a state-space model,
$$\mathcal{L}(\theta)=\int f(y_{1:T}\mid x_{0:T};\theta)\,f(x_{0:T};\theta)\,dx_{0:T}.$$
How shall we choose our trajectories? One idea would be to choose them so as to simplify the computation. If we choose the state-process model itself as the importance distribution, i.e., draw trajectories $x_{0:T}$ from $f(x_{0:T};\theta)$, the importance weights reduce to $f(y_{1:T}\mid x_{0:T};\theta)=\prod_t f(y_t\mid x_t;\theta)$. This implies that if we generate trajectories by simulation, all we have to do is compute the likelihood of the data given each trajectory and average.
Let's go back to the British boarding school influenza outbreak to see what this would look like. Load a pomp object containing the boarding-school flu data by doing pompExample(bbs). The process model here is just the SIR model we discussed earlier; the measurement model is the one supplied with the bbs example.
require(pomp)
pompExample(bbs)
## newly created object(s):
## bbs
plot(bbs)
Let’s generate a bunch of simulated trajectories at some particular point in parameter space.
coef(bbs)
## gamma mu iota beta beta.sd
## 0.3333333 0.0000000 0.0000000 1.4000000 0.0000000
## pop rho sigma S.0 I.0
## 1400.0000000 0.9000000 3.6000000 0.9990000 0.0010000
## R.0
## 0.0000000
x <- simulate(bbs,nsim=60,states=T)
matplot(time(bbs),t(x["cases",,]),type='l',lty=1,
xlab="time",ylab="infections",bty='l')
We can use the function dmeasure to evaluate the log likelihood of the data given the states, the model, and the parameters:
ell <- dmeasure(bbs,y=obs(bbs),x=x,times=time(bbs),
params=parmat(coef(bbs),nrep=60),
log=T)
dim(ell)
## [1] 60 14
According to the equation above, we should sum up the log likelihoods across time:
ell <- apply(ell,1,sum); ell
## [1] -Inf -Inf -Inf -Inf -Inf -98.49835 -94.53501
## [8] -94.33567 -Inf -Inf -Inf -98.20784 -99.72224 -94.81910
## [15] -95.23027 -94.73599 -Inf -Inf -Inf -95.49592 -Inf
## [22] -Inf -95.24240 -Inf -Inf -Inf -Inf -94.68817
## [29] -Inf -Inf -Inf -Inf -Inf -Inf -Inf
## [36] -Inf -Inf -Inf -Inf -Inf -Inf -Inf
## [43] -96.49108 -Inf -Inf -Inf -Inf -Inf -Inf
## [50] -Inf -Inf -95.70486 -Inf -94.61263 -94.61592 -Inf
## [57] -Inf -94.96263 -Inf -Inf
The variance of these estimates is very high and therefore the estimated likelihood is very imprecise:
loglik <- mean(ell); loglik
## [1] -Inf
loglik.se <- sd(ell)/sqrt(length(ell)); loglik.se
## [1] NaN
We are going to need very many simulations to get an estimate of the likelihood sufficiently precise to be of any use in parameter estimation or model selection.
What’s the problem? Essentially, way too many of the trajectories don’t pass near the data. Moreover, once a trajectory diverges from the data, it almost never comes back. This is a consequence of the fact that we are proposing trajectories in a way that is completely unconditional on the data. The problem will only get worse with longer data sets!
We can arrive at a more efficient algorithm by factorizing the likelihood in a different way:
$$\mathcal{L}(\theta)=f(y_{1:T};\theta)=\prod_{t=1}^{T} f(y_t\mid y_{1:t-1};\theta).$$
This suggests that we keep track of two key distributions. We'll refer to the distribution of $X_t$ given $y_{1:t-1}$ as the prediction distribution at time $t$, and to the distribution of $X_t$ given $y_{1:t}$ as the filtering distribution at time $t$. Let's use Monte Carlo techniques to estimate the integrals involved. Suppose we have a sample of points, or "particles", drawn from the filtering distribution at time $t-1$. Simulating the process model forward one step from each particle yields a sample from the prediction distribution at time $t$; weighting each resulting particle by the measurement density $f(y_t\mid X_t;\theta)$ and resampling in proportion to these weights then yields a sample from the filtering distribution at time $t$, while the average of the weights estimates the conditional likelihood $f(y_t\mid y_{1:t-1};\theta)$. Iterating over the data gives the full likelihood. This is known as the sequential Monte Carlo algorithm, or the particle filter. Key references include Kitagawa (1987), Arulampalam et al. (2002), and the book by Doucet et al. (2001).
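To make the algorithm concrete, here is a minimal, self-contained bootstrap particle filter for a toy AR(1)-plus-noise model, written in plain R. This is purely illustrative and is not how pomp implements pfilter.
set.seed(1)
a <- 0.8; sd.proc <- 0.5; sd.meas <- 1
n.obs <- 50; Np <- 1000
## simulate a data set from the toy model
x.true <- numeric(n.obs); x.prev <- rnorm(1)
for (t in seq_len(n.obs)) {
  x.true[t] <- a*x.prev+rnorm(1,sd=sd.proc)
  x.prev <- x.true[t]
}
y <- rnorm(n.obs,mean=x.true,sd=sd.meas)
## the bootstrap particle filter itself
x <- rnorm(Np)                                    # particles at time zero
loglik <- 0
for (t in seq_len(n.obs)) {
  x <- a*x+rnorm(Np,sd=sd.proc)                   # prediction: simulate the process model
  w <- dnorm(y[t],mean=x,sd=sd.meas)              # weights: measurement density
  loglik <- loglik+log(mean(w))                   # conditional log likelihood of y[t]
  x <- sample(x,size=Np,replace=TRUE,prob=w)      # resample: filtering distribution
}
loglik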
In pomp, the basic particle filter is implemented in the command pfilter. We must choose the number of particles to use by setting the Np argument.
pf <- pfilter(bbs,Np=1000)
logLik(pf)
## [1] -96.79192
We can run a few particle filters to get an estimate of the Monte Carlo variability:
pf <- replicate(n=10,pfilter(bbs,Np=1000))
ll <- sapply(pf,logLik)
logmeanexp(ll,se=TRUE)
## se
## -96.71380549 0.05242974
Note that we’re careful here to counteract Jensen’s inequality. The particle filter gives us an unbiased estimate of the likelihood, not of the log-likelihood.
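If logmeanexp were not available, one could compute the same quantity by hand, averaging on the likelihood scale and subtracting the maximum before exponentiating to avoid underflow (a sketch; the helper name is ours):
logmeanexp.by.hand <- function (ll) {
  m <- max(ll)
  m+log(mean(exp(ll-m)))    # log of the mean of the likelihoods
}
logmeanexp.by.hand(ll)      # should agree with logmeanexp(ll)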
To get an idea of what the likelihood surface looks like in the neighborhood of the default parameter set supplied by bbs, we can construct some likelihood slices. We'll make slices in the beta and gamma directions, holding the other parameters fixed at their default values.
require(pomp)
pompExample(bbs)
center <- coef(bbs)
p <- sliceDesign(
coef(bbs),
beta=rep(seq(from=0.5,to=8,length=20),each=3),
gamma=rep(seq(from=0.05,to=2,length=20),each=3)
)
p$loglik <- NA
for (k in seq_len(nrow(p))) {
pf <- pfilter(bbs,params=unlist(p[k,]),Np=1000)
p$loglik[k] <- logLik(pf)
}
require(ggplot2)
require(grid)
pl1 <- ggplot(
data=subset(p,slice=="beta"),
mapping=aes(x=beta,y=loglik)
)+
geom_point()+
geom_vline(xintercept=center["beta"],color='red')+
geom_smooth(method="loess",span=0.25)+
xlab("beta")
pl2 <- ggplot(
data=subset(p,slice=="gamma"),
mapping=aes(x=gamma,y=loglik)
)+
geom_point()+
geom_vline(xintercept=center["gamma"],color='red')+
geom_smooth(method="loess",span=0.25)+
xlab("gamma")
print(pl1,vp=viewport(x=unit(0.25,"npc"),width=unit(0.5,"npc")))
print(pl2,vp=viewport(x=unit(0.75,"npc"),width=unit(0.5,"npc")))
Construct likelihood slices with respect to other parameters of the model.
Clearly, the default parameter set is not particularly close to the MLE. One way to find the MLE is to try optimizing the estimated likelihood directly. There are many optimization algorithms to choose from, many of which are implemented in R.
Two issues arise immediately. First, the particle filter gives us a stochastic estimate of the likelihood. We can reduce this variability by making Np larger, but we cannot make it go away. If we use a deterministic optimizer (i.e., one that assumes the objective function is evaluated deterministically), then we must control this variability somehow. For example, we can fix the seed of the pseudo-random number generator; a side effect is that the objective function becomes jagged, marked by many small local knolls and pits. Alternatively, we can use a stochastic optimization algorithm, with which we will only be able to obtain estimates of our MLE. This is the trade-off between a noisy and a rough objective function. Second, because the particle filter gives us just an estimate of the likelihood and no information about the derivative, we must choose an algorithm that is "derivative-free". There are many such algorithms, but we can expect less efficiency than would be possible with derivative information. Note that finite differencing is not an especially promising way of constructing derivatives here: it multiplies the number of likelihood evaluations required and, since each evaluation is noisy, the differences are noisier still.
Here, let's opt for deterministic optimization of a rough function. We'll try using optim's default method: Nelder-Mead.
require(pomp)
pompExample(bbs)
## newly created object(s):
## bbs
set.seed(837065482)
neg.ll <- function (par, est, ...) {
## parameters to be estimated are named in 'est'
allpars <- coef(bbs,transform=TRUE)
allpars[est] <- par
try(
pfilter(
bbs,
params=partrans(bbs,allpars,dir="forward"),
...
)
) -> pf
if (inherits(pf,"try-error")) {
1e10 ## a big number
} else {
-logLik(pf)
}
}
require(plyr)
## use Nelder-Mead with fixed RNG seed
fit <- optim(
par=c(-1.1, 0.33, 2.2, 1.28),
est=c("gamma","beta","rho","sigma"),
Np=200,
fn=neg.ll,
seed=3488755,
method="Nelder-Mead",
control=list(maxit=400,trace=0)
)
mle <- bbs
coef(mle,c("gamma","beta","rho","sigma"),trans=T) <- fit$par
coef(mle,c("gamma","beta","rho","sigma")) ## point estimate
## gamma beta rho sigma
## 0.5125661 2.0198047 0.9681525 0.2501303
dat <- as.data.frame(bbs)
dat$type <- "data"
sims <- simulate(mle,nsim=10,as.data.frame=TRUE)
sims$type <- "simulation"
sims <- merge(dat,sims,all=T)
lls <- replicate(n=5,logLik(pfilter(mle,Np=200)))
ll <- logmeanexp(lls,se=TRUE)
ll.se <- ll[2]
ll <- ll[1]
Some simulations at these parameters are shown next:
require(ggplot2)
pl <- ggplot(data=sims,
mapping=aes(x=time,y=reports,group=sim,color=type)
)+
geom_line()+
geom_text(x=9,y=320,
label=paste("log~L==",round(ll,1),"%+-%",round(ll.se,1)),
color="black",
parse=T,hjust=0)
print(pl)
Use simulated annealing (e.g., method="SANN" in optim) to maximize the likelihood. Be sure to try several starting guesses.
Construct likelihood slices through the point estimate obtained above.
Iterated filtering is an algorithm based on sequential Monte Carlo that shares many features of the Liu and West (2001) method. However, it is a likelihood maximization method and so is typically used in a frequentist mode, thus avoiding many of the complexities associated with Bayesian inference. Specifically, iterated filtering is an algorithm for maximizing the likelihood of a partially observed Markov process (POMP) model computed via sequential Monte Carlo (AKA particle filtering). The basic idea of iterated filtering is the following. We wish to compute and maximize the likelihood of the fixed-parameter POMP model. To do so, we replace the fixed (but unknown) parameters with parameters that follow a random walk in time: each parameter receives a small random perturbation at every observation time. This augmented model can be filtered with the ordinary particle filter, and the filtering means of the perturbed parameters carry information about the local slope of the likelihood surface, i.e., they furnish an approximate gradient. The idea will be to use this approximate gradient to climb the likelihood surface, while at the same time reducing the added variability so that, in the limit, we recover the likelihood surface of the original, fixed-parameter model.
In the method of Liu and West (2001), we make one particle-filtering pass through the data. In iterated filtering we make many passes. After each pass, we obtain an estimate of the likelihood and its gradient. We take a step of judicious size in the direction of the gradient, reduce the added variability, and repeat.
In more detail, each MIF iteration proceeds roughly as follows. Starting from the current parameter estimate, we do particle filtering on the random-parameters model: each particle carries its own copy of the parameters, which take a random-walk step at every observation time, and particles are resampled according to the measurement density just as in the ordinary particle filter. At the end of the pass, the filtering means of the perturbed parameters at each time point are combined into an update of the parameter estimate, the magnitude of the random-walk perturbations is reduced according to the cooling schedule, and the procedure is repeated.
The iterated filtering theorem states that, under regularity conditions, if the added variability is reduced at an appropriate rate, the sequence of parameter estimates converges to the maximum likelihood estimate.
We've assumed that the observation times are equally spaced, but this is not essential.
The random-walk s.d. can (and typically should) be different for each parameter. One can, for example, hold a parameter fixed by setting its random-walk s.d. to zero.
In the above, we draw the initial set of particles from a normal distribution; one could use another distribution. Likewise, the assumption that the parameter perturbation be normal is not needed.
The basic MIF algorithm described above does not work particularly well for initial-condition parameters. This is because the information pertaining to such parameters is not evenly distributed through the data; rather, it is concentrated in the first part of the time series. As implemented in pomp, mif treats initial-condition parameters differently: instead of the gradient-like update used for the ordinary parameters, the initial-condition parameters are updated at each MIF iteration by their filtering mean at some initial-condition lag (the ic.lag argument).
The cooling schedule we use here is geometric: at MIF iteration $n$, the random-walk standard deviations are scaled by a factor of $\mathrm{cooling.factor}^{\,n-1}$.
We might want to let the number of particles change through time; there is nothing stopping us from doing this and there are advantages to be gained.
There are other reasonable choices for how to update the initial-condition parameters.
Ionides et al. (2006) first presented the method; a thorough development of its theoretical foundation is given by Ionides et al. (2011). Applications of the method include studies of cholera (King et al. 2008, Bretó et al. 2009), measles (He et al. 2010), and malaria (Laneri et al. 2010). Although the applications have for the most part been to infectious disease systems, this does not reflect any limitation of the method.
In pomp, iterated filtering is implemented via the mif command. The following shows the basic usage. Here, we are trying to maximize the likelihood for the Ricker model example.
require(pomp)
pompExample(ricker)
## newly created object(s):
## ricker
set.seed(1251928907L)
dat <- window(ricker,end=20)
estpars <- c("r","sigma","phi")
coef(dat)
## r sigma phi N.0 e.0
## 44.70118 0.30000 10.00000 7.00000 0.00000
guess <- c(r=exp(5),sigma=1,phi=20,N.0=7,e.0=0)
mf <- mif(dat,start=guess,
transform=TRUE,
Nmif=100,Np=1000,
cooling.factor=0.99,var.factor=2,ic.lag=3,
rw.sd=c(r=0.1,sigma=0.1,phi=0.1))
We can have a look at the results of this by doing
plot(mf)
Since we seem to be still climbing, let’s continue:
mf <- continue(mf,Nmif=300)
To really see if we’re converging, we should run several MIFs from different starting guesses.
replicate(
n=5,
{
guess <- coef(dat)
guess[estpars] <- rlnorm(
n=3,
meanlog=log(guess[estpars]),
sdlog=0.5
)
mif(
dat,
Nmif=300,
start=guess,
transform=TRUE,
pars=estpars,
Np=1000,
var.factor=2,
ic.lag=3,
cooling.factor=0.99,
rw.sd=c(r=0.1,sigma=0.1,phi=0.1)
)
}
) -> mf
We can then compare the results using compare.mif(mf). Here's a prettier version of the results.
To compute the likelihoods at these putative MLEs, we'll need to run pfilter. As before, we'll want to do this several times to get an idea of the Monte Carlo error in the likelihood computation.
pf <- replicate(n=10,lapply(mf,pfilter))
ll <- lapply(pf,logLik); dim(ll) <- dim(pf)
ll <- apply(ll,1,
function(x){
x <- as.numeric(x)
m <- mean(x)
loglik <- m+log(mean(exp(x-m)));
se <- sd(exp(x-m))/exp(loglik-m)
c(loglik=loglik,loglik.se=se)
}
)
mletable <- rbind(sapply(mf,coef,c("r","sigma","phi")),ll)
colnames(mletable) <- paste("mif",seq_along(mf))
truth <- coef(dat,c("r","sigma","phi"))
ll.truth <- replicate(n=10,logLik(pfilter(dat,Np=1000)))
m <- mean(ll.truth)
loglik <- m+log(mean(exp(ll.truth-m)))
se <- sd(exp(ll.truth-m))/exp(loglik-m)
signif(cbind(mletable,truth=c(truth,loglik,se)),digits=3)
## mif 1 mif 2 mif 3 mif 4 mif 5 truth
## r 45.5000 45.500 45.5000 45.7000 45.7000 44.700
## sigma 0.0397 0.036 0.0391 0.0399 0.0445 0.300
## phi 10.1000 10.100 10.2000 10.2000 10.2000 10.000
## loglik -51.0000 -50.700 -50.7000 -50.9000 -51.0000 -57.200
## loglik.se 0.2540 0.323 0.2740 0.2840 0.3100 0.157
Experiment with estimating SIR model parameters for the British boarding school data using iterated filtering. Try using 500–1000 particles (Np), a cooling factor of 0.999, a variance multiplier of 1–4 (var.factor), and random-walk intensities (rw.sd) on the order of 0.01–0.02. The lag for the initial-value parameter smoother can be set to the length of the data set (ic.lag=14).
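One possible starting point for this exercise, mirroring the mif call used for the Ricker example above, is sketched below; the choice of which parameters to estimate and of starting values is left to you, and this is not a worked solution.
pompExample(bbs)
estpars <- c("beta","gamma","rho")
mf.bbs <- mif(
  bbs,
  Nmif=50,Np=500,
  transform=TRUE,    # assumes bbs supplies parameter transformations; drop if not
  pars=estpars,
  ic.lag=14,var.factor=2,cooling.factor=0.999,
  rw.sd=c(beta=0.02,gamma=0.02,rho=0.02)
)
plot(mf.bbs)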
Arulampalam, M. S., S. Maskell, N. Gordon, and T. Clapp. 2002. A tutorial on particle filters for online nonlinear, non-Gaussian Bayesian tracking. IEEE Transactions on Signal Processing 50:174–188.
Bretó, C., and E. L. Ionides. 2011. Compound Markov counting processes and their applications to modeling infinitesimally over-dispersed systems. Stochastic Processes and their Applications 121:2571–2591.
Bretó, C., D. He, E. L. Ionides, and A. A. King. 2009. Time series analysis via mechanistic models. Annals of Applied Statistics 3:319–348.
Doucet, A., N. de Freitas, and N. Gordon, editors. 2001. Sequential Monte Carlo Methods in Practice. Springer-Verlag, New York.
He, D., E. L. Ionides, and A. A. King. 2010. Plug-and-play inference for disease dynamics: measles in large and small populations as a case study. Journal of The Royal Society Interface 7:271–283.
Ionides, E. L., A. Bhadra, Y. Atchadé, and A. A. King. 2011. Iterated filtering. Annals of Statistics 39:1776–1802.
Ionides, E. L., C. Bretó, and A. A. King. 2006. Inference for nonlinear dynamical systems. Proceedings of the National Academy of Sciences of the U.S.A. 103:18438–18443.
King, A. A., E. L. Ionides, M. Pascual, and M. J. Bouma. 2008. Inapparent infections and cholera dynamics. Nature 454:877–880.
Kitagawa, G. 1987. Non-Gaussian state-space modeling of nonstationary time series. Journal of the American Statistical Association 82:1032–1041.
Laneri, K., A. Bhadra, E. L. Ionides, M. Bouma, R. C. Dhiman, R. S. Yadav, and M. Pascual. 2010. Forcing versus feedback: epidemic malaria and monsoon rains in northwest India. PLoS Computational Biology 6:e1000898.
Liu, J., and M. West. 2001. Combining Parameter and State Estimation in Simulation-Based Filtering. Pages 197–224 in A. Doucet, N. de Freitas, and N. J. Gordon, editors. Sequential Monte Carlo methods in practice. Springer, New York.
Robert, C. P., and G. Casella. 2004. Monte Carlo Statistical Methods. 2nd edition. Springer-Verlag.