Class Optimizer

Class Documentation

class Optimizer

A numerical optimizer customized for least-squares problems with Bayesian priors.

The algorithm used by Optimizer combines the Gauss-Newton approach, which approximates the second-derivative (Hessian) matrix as the inner product of the Jacobian of the residuals with itself, with a correction matrix that accounts for large residuals and is updated using a symmetric rank-1 (SR1) secant formula. We assume the prior has analytic first and second derivatives, but use numerical derivatives to compute the Jacobian of the residuals at every step. A trust-region approach is used to ensure global convergence.

We consider the function \(f(x)\) we wish to optimize to have two terms, which correspond to the negative log likelihood ( \(\chi^2/2=\tfrac{1}{2}\|r(x)\|^2\), where \(r(x)\) is the vector of residuals at \(x\)) and the negative log prior \(q(x)=-\ln P(x)\):

\[ f(x) = \frac{1}{2}\|r(x)\|^2 + q(x) \]
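As a point of reference, evaluating this objective from a residual function and a negative log prior could look like the following sketch (Eigen types; the helper name is purely illustrative and not part of the Optimizer interface):

```cpp
#include <Eigen/Core>
#include <functional>

// f(x) = 0.5 * ||r(x)||^2 + q(x), evaluated from user-supplied callables
// (illustrative helper, not part of the Optimizer interface).
double evaluateObjective(
    Eigen::VectorXd const &x,
    std::function<Eigen::VectorXd(Eigen::VectorXd const &)> const &residuals,
    std::function<double(Eigen::VectorXd const &)> const &negLogPrior
) {
    Eigen::VectorXd r = residuals(x);
    return 0.5 * r.squaredNorm() + negLogPrior(x);
}
```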
At each iteration \(k\), we expand \(f(x)\) in a Taylor series in \(s=x_{k+1}-x_{k}\):
\[ f(x) \approx m(s) = f(x_k) + g_k^T s + \frac{1}{2}s^T H_k s \]
where
\[ g_k \equiv \left.\frac{\partial f}{\partial x}\right|_{x_k} = J_k^T r_k + \nabla q_k;\quad\quad J_k \equiv \left.\frac{\partial r}{\partial x}\right|_{x_k} \]
\[ H_k = J_k^T J_k + \nabla^2 q_k + B_k \]
Here, \(B_k\) is the SR1 approximation to the second-derivative term:
\[ B_k \approx \sum_i \frac{\partial^2 r^{(i)}_k}{\partial x^2}r^{(i)}_k \]
which we initialize to zero and then update with the following formula:
\[ B_{k+1} = B_{k} + \frac{v v^T}{v^T s};\quad\quad v\equiv J^T_{k+1} r_{k+1} - J^T_k r_k \]
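For concreteness, here is a minimal sketch (Eigen types; the function names and the small-denominator safeguard are illustrative, not part of this class's API) of assembling \(g_k\) and \(H_k\) from these pieces and applying the SR1 update:

```cpp
#include <Eigen/Core>
#include <cmath>

// Assemble g_k = J^T r + grad(q) and H_k = J^T J + hess(q) + B_k
// (illustrative helper; the real class stores these quantities internally).
void buildModel(
    Eigen::MatrixXd const &J,      // Jacobian of the residuals at x_k
    Eigen::VectorXd const &r,      // residuals at x_k
    Eigen::VectorXd const &gradQ,  // gradient of q(x) = -ln P(x) at x_k
    Eigen::MatrixXd const &hessQ,  // Hessian of q(x) at x_k
    Eigen::MatrixXd const &B,      // current SR1 correction B_k
    Eigen::VectorXd &g,            // output: gradient of f at x_k
    Eigen::MatrixXd &H             // output: model Hessian at x_k
) {
    g = J.transpose() * r + gradQ;
    H = J.transpose() * J + hessQ + B;
}

// Apply B_{k+1} = B_k + (v v^T)/(v^T s) with v = J_{k+1}^T r_{k+1} - J_k^T r_k.
// Skipping the update when the denominator is negligible is a standard SR1
// safeguard assumed here; it is not stated in this documentation.
void applySR1Update(Eigen::MatrixXd &B, Eigen::VectorXd const &v, Eigen::VectorXd const &s) {
    double denominator = v.dot(s);
    if (std::abs(denominator) > 1e-8 * v.norm() * s.norm()) {
        B += (v * v.transpose()) / denominator;
    }
}
```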
Unlike the more common rank-2 BFGS update formula, SR1 updates are not guaranteed to produce a positive definite Hessian. This can result in more accurate approximations of the Hessian (and hence more accurate covariance matrices), but it rules out line-search methods and the simple dog-leg approach to the trust region problem. As a result, we should require fewer steps to converge, but spend more time computing each step; this is ideal when we expect the time spent in function evaluation to dominate the time per step anyway.
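For example, whether the SR1-corrected Hessian is still positive definite can be checked with a Cholesky factorization before deciding how to solve the trust-region subproblem; a brief sketch (Eigen, illustrative only, not the class's internal solver):

```cpp
#include <Eigen/Cholesky>
#include <Eigen/Core>

// Returns true if H is numerically positive definite. When it is not,
// a simple dog-leg step is unsafe and the full trust-region subproblem
// must be solved instead.
bool isPositiveDefinite(Eigen::MatrixXd const &H) {
    Eigen::LLT<Eigen::MatrixXd> llt(H);
    return llt.info() == Eigen::Success;
}
```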

Public Types

enum StateFlags

Values:

CONVERGED_GRADZERO = 0x0001
CONVERGED_TR_SMALL = 0x0002
CONVERGED = CONVERGED_GRADZERO | CONVERGED_TR_SMALL
FAILED_MAX_INNER_ITERATIONS = 0x0010
FAILED_MAX_OUTER_ITERATIONS = 0x0020
FAILED_MAX_ITERATIONS = 0x0030
FAILED_EXCEPTION = 0x0040
FAILED_NAN = 0x0080
FAILED = FAILED_MAX_INNER_ITERATIONS | FAILED_MAX_OUTER_ITERATIONS | FAILED_EXCEPTION | FAILED_NAN
STATUS_STEP_REJECTED = 0x0100
STATUS_STEP_ACCEPTED = 0x0200
STATUS_STEP = STATUS_STEP_REJECTED | STATUS_STEP_ACCEPTED
STATUS_TR_UNCHANGED = 0x1000
STATUS_TR_DECREASED = 0x2000
STATUS_TR_INCREASED = 0x4000
STATUS_TR = STATUS_TR_UNCHANGED | STATUS_TR_DECREASED | STATUS_TR_INCREASED
STATUS = STATUS_STEP | STATUS_TR
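Because these enumerators are bit flags, callers typically combine them with bitwise masks when inspecting the value returned by getState(); a hedged usage sketch (the include path follows the usual LSST layout and is assumed here):

```cpp
#include "lsst/meas/modelfit/optimizer.h"

// Classify an optimizer's state using the bit-flag semantics above
// (illustrative helper, not part of the class).
bool hasConverged(lsst::meas::modelfit::Optimizer const &optimizer) {
    int state = optimizer.getState();
    if (state & lsst::meas::modelfit::Optimizer::FAILED) {
        return false;  // iteration limits, an exception, or NaNs
    }
    return (state & lsst::meas::modelfit::Optimizer::CONVERGED) != 0;
}
```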
typedef OptimizerObjective Objective
typedef OptimizerControl Control
typedef OptimizerHistoryRecorder HistoryRecorder

Public Functions

lsst::meas::modelfit::Optimizer::Optimizer(PTR(Objective const) objective, ndarray::Array<Scalar const, 1, 1> const &parameters, Control const &ctrl)
PTR(Objective const) getObjective() const
Control const &getControl() const
bool step()
bool step(HistoryRecorder const &recorder, afw::table::BaseCatalog &history)
int run()
int run(HistoryRecorder const &recorder, afw::table::BaseCatalog &history)
int getState() const
Scalar getObjectiveValue() const
ndarray::Array<Scalar const, 1, 1> getParameters() const
ndarray::Array<Scalar const, 1, 1> getResiduals() const
ndarray::Array<Scalar const, 1, 1> getGradient() const
ndarray::Array<Scalar const, 2, 2> getHessian() const
void removeSR1Term()

Remove the symmetric-rank-1 secant term from the Hessian, making it just \(J^T J\).
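Putting the public interface together, a typical call sequence might look like the following sketch. It assumes the usual LSST readings of the macros and aliases above (PTR as std::shared_ptr, Scalar as the package's floating-point type) and standard include paths; the helper function, its error handling, and the availability of a concrete OptimizerObjective are illustrative rather than prescribed by this documentation:

```cpp
#include <memory>
#include <stdexcept>

#include "lsst/meas/modelfit/optimizer.h"
#include "ndarray.h"

namespace mf = lsst::meas::modelfit;

// Run the optimizer from an initial parameter vector and return the best fit
// (a sketch; constructing a concrete OptimizerObjective is not shown).
ndarray::Array<mf::Scalar const, 1, 1> fitModel(
    std::shared_ptr<mf::OptimizerObjective const> objective,
    ndarray::Array<mf::Scalar const, 1, 1> const &initialParameters,
    mf::OptimizerControl const &ctrl
) {
    mf::Optimizer optimizer(objective, initialParameters, ctrl);
    optimizer.run();  // iterate until a CONVERGED_* or FAILED_* flag is set
    if (optimizer.getState() & mf::Optimizer::FAILED) {
        throw std::runtime_error("Optimizer did not converge");
    }
    return optimizer.getParameters();  // best-fit parameter vector
}
```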