
In this document we show how to use the likelihood method to obtain function handlers for the objective function, its gradient, and its Hessian (the latter only when using a Kalman filter), for instance in order to use an optimization algorithm other than stats::nlminb.

Simulate from the Ornstein-Uhlenbeck process


We use the common Ornstein-Uhlenbeck process to showcase the use of the likelihood method.

\mathrm{d}X_{t} = \theta (\mu - X_{t}) \, \mathrm{d}t \, + \sigma_{X} \, \mathrm{d}B_{t}

Y_{t_{k}} = X_{t_{k}} + e_{t_{k}}, \qquad e_{t_{k}} \sim \mathcal{N}\left(0,\sigma_{Y}^{2}\right)

We first create data by simulating the process

# Simulate data using Euler-Maruyama
set.seed(10)
theta=10; mu=1; sigma_x=1; sigma_y=1e-1
# 
dt.sim = 1e-3
t.sim = seq(0,1,by=dt.sim)
dw = rnorm(length(t.sim)-1,sd=sqrt(dt.sim))
x = 3
for(i in 1:(length(t.sim)-1)) {
  x[i+1] = x[i] + theta*(mu-x[i])*dt.sim + sigma_x*dw[i]
}

# Extract observations and add noise
dt.obs = 1e-2
t.obs = seq(0,1,by=dt.obs)
y = x[t.sim %in% t.obs] + sigma_y * rnorm(length(t.obs))

# Create data
.data = data.frame(
  t = t.obs,
  y = y
)
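
As a quick sanity check (not part of the estimation workflow itself), we can plot the simulated latent path together with the noisy observations using base R graphics:

```r
# Sanity check: latent path (line) versus noisy observations (points)
plot(t.sim, x, type = "l", xlab = "t", ylab = "x")
points(t.obs, y, col = "red", pch = 16, cex = 0.5)
```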

Construct model object


We now construct the ctsmTMB model object

# Create model object
obj = ctsmTMB$new()

# Add system equations
obj$addSystem(
  dx ~ theta * (mu-x) * dt + sigma_x*dw
)

# Add observation equations
obj$addObs(
  y ~ x
)

# Set observation equation variances
obj$setVariance(
  y ~ sigma_y^2
)

# Specify algebraic relations
obj$setAlgebraics(
  theta ~ exp(logtheta),
  sigma_x ~ exp(logsigma_x),
  sigma_y ~ exp(logsigma_y)
)

# Specify parameter initial values and lower/upper bounds in estimation
obj$setParameter(
  logtheta   = log(c(initial = 5,    lower = 0,    upper = 20)),
  mu         = c(    initial = 0,    lower = -10,  upper = 10),
  logsigma_x = log(c(initial = 1e-1, lower = 1e-5, upper = 5)),
  logsigma_y = log(c(initial = 1e-1, lower = 1e-5, upper = 5))
)

# Set initial state mean and covariance
obj$setInitialState(list(x[1], 1e-1*diag(1)))

Estimation


In principle we are now ready to call the estimate method, which runs the built-in optimization based on stats::nlminb, i.e.

fit = obj$estimate(.data)
## Checking model components...
## Checking and setting data...
## Constructing objective function and derivative tables...
## Minimizing the negative log-likelihood...
##   0:     921.77031:  1.60944  0.00000 -2.30259 -2.30259
##  10:    -38.611349:  2.53386  1.03053 0.151828 -2.34627
##   Optimization finished!:
##             Elapsed time: 0.004 seconds.
##             The objective value is: -3.934664e+01
##             The maximum gradient component is: 1.3e-04
##             The convergence message is: relative convergence (4)
##             Iterations: 19
##             Evaluations: Fun: 27 Grad: 20
##             See stats::nlminb for available tolerance/control arguments.
## Returning results...
## Finished!

Internally, the package optimizes the objective function with respect to the fixed parameters by constructing function handlers with TMB::MakeADFun and passing them to stats::nlminb, i.e.

nll = TMB::MakeADFun(...)
opt = stats::nlminb(start=nll$par, objective=nll$fn, grad=nll$gr, hessian=nll$he)

Extract function handlers


The likelihood method allows you to retrieve the nll object, which holds function handlers for the negative log-likelihood and its derivatives. The method takes arguments similar to those of estimate.

nll = obj$likelihood(.data)
## Finished!

The initial parameters (supplied by the user) are stored here

nll$par
##   logtheta         mu logsigma_x logsigma_y 
##   1.609438   0.000000  -2.302585  -2.302585

The objective function can be evaluated by

nll$fn(nll$par)
## [1] 921.7703

The gradient can be evaluated by

nll$gr(nll$par)
##          [,1]      [,2]      [,3]      [,4]
## [1,] 1382.854 -1557.574 -1191.374 -820.9253
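
To build confidence in the automatic-differentiation gradient, one can compare it against a central finite-difference approximation of the objective. This is a minimal sketch; the step size h is an illustrative choice, not something prescribed by the package:

```r
# Central finite-difference check of the gradient (illustrative step size)
h <- 1e-6
num_grad <- sapply(seq_along(nll$par), function(i) {
  e <- replace(numeric(length(nll$par)), i, h)
  (nll$fn(nll$par + e) - nll$fn(nll$par - e)) / (2 * h)
})
# The maximum absolute deviation should be small
max(abs(num_grad - as.numeric(nll$gr(nll$par))))
```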

The hessian can be evaluated by

nll$he(nll$par)
##           [,1]       [,2]       [,3]       [,4]
## [1,]  2226.705 -2859.2069 -1636.0560 -1136.3013
## [2,] -2859.207  1663.1457  2251.5926   865.5807
## [3,] -1636.056  2251.5926   905.1284  1486.3607
## [4,] -1136.301   865.5807  1486.3607   346.7711
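
A common use of the Hessian handler is to obtain approximate standard errors: evaluated at the optimum, the Hessian of the negative log-likelihood approximates the inverse covariance of the parameter estimates. A sketch, assuming the fit object from the earlier estimate call:

```r
# Approximate parameter standard errors from the Hessian at the optimum
H <- nll$he(fit$par.fixed)
std_errors <- sqrt(diag(solve(H)))
```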

We can now use these to optimize the function using e.g. stats::optim instead.

Extract parameter lower/upper bounds


You can extract the parameter bounds specified in the earlier setParameter call by using the getParameters method (note that nll$par and pars$initial are identical).

pars = obj$getParameters()
print(pars)
##            type   estimate   initial     lower     upper
## logtheta   free  2.8028461  1.609438      -Inf  2.995732
## mu         free  1.0774793  0.000000 -10.00000 10.000000
## logsigma_x free  0.2024819 -2.302585 -11.51293  1.609438
## logsigma_y free -2.3269231 -2.302585 -11.51293  1.609438

Optimize manually using stats::optim

We supply the initial parameter values, objective function handler and gradient handler, and parameter bounds to optim.

opt = stats::optim(par=nll$par, 
                   fn=nll$fn, 
                   gr=nll$gr, 
                   method="L-BFGS-B", 
                   lower=pars$lower, 
                   upper=pars$upper)
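
Before comparing estimates, it is good practice to inspect the optim result; convergence and counts are standard elements of the list returned by stats::optim:

```r
# Basic diagnostics from stats::optim
opt$convergence  # 0 indicates successful convergence
opt$counts       # number of objective and gradient evaluations
```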

Compare results between the two optimizers


Let's compare the results from stats::optim with the extracted function handlers against the internal optimization with stats::nlminb stored in fit:

# Estimated parameters
data.frame(external=opt$par, internal=fit$par.fixed)
##              external   internal
## logtheta    2.8028428  2.8028461
## mu          1.0774788  1.0774793
## logsigma_x  0.2024796  0.2024819
## logsigma_y -2.3269222 -2.3269231
# Neg. Log-Likelihood
data.frame(external=opt$value, internal=fit$nll)
##    external  internal
## 1 -39.34664 -39.34664
# Gradient components
data.frame(external=t(nll$gr(opt$par)), internal=t(nll$gr(fit$par.fixed)))
##        external      internal
## 1 -5.284383e-05 -2.269770e-06
## 2 -1.491876e-04 -1.337658e-04
## 3 -4.592109e-05  2.628658e-05
## 4 -2.656532e-05 -2.239568e-06