Class Optimizer

Defined in File optimizer.h

Class Documentation
class Optimizer

A numerical optimizer customized for least-squares problems with Bayesian priors.
The algorithm used by Optimizer combines the Gauss-Newton approach, which approximates the second-derivative (Hessian) matrix as the inner product of the Jacobian of the residuals, with a matrix of corrections to that approximation that accounts for large residuals and is updated using a symmetric rank-1 (SR1) secant formula. We assume the prior has analytic first and second derivatives, but use numerical derivatives to compute the Jacobian of the residuals at every step. A trust region approach is used to ensure global convergence.
We consider the function \(f(x)\) we wish to optimize to have two terms, which correspond to the negative log likelihood (\(\chi^2/2 = \|r(x)\|^2/2\), where \(r(x)\) is the vector of residuals at \(x\)) and the negative log prior \(q(x) = -\ln P(x)\):

\[ f(x) = \frac{1}{2}\|r(x)\|^2 + q(x) \]

At each iteration \(k\), we expand \(f(x)\) in a Taylor series in \(s = x_{k+1} - x_k\):

\[ f(x) \approx m(s) = f(x_k) + g_k^T s + \frac{1}{2} s^T H_k s \]

where

\[ g_k \equiv \left.\frac{\partial f}{\partial x}\right|_{x_k} = J_k^T r_k + \nabla q_k; \quad\quad J_k \equiv \left.\frac{\partial r}{\partial x}\right|_{x_k} \]

\[ H_k = J_k^T J_k + \nabla^2 q_k + B_k \]

Here, \(B_k\) is the SR1 approximation to the second-derivative term

\[ B_k \approx \sum_i \frac{\partial^2 r^{(i)}_k}{\partial x^2} r^{(i)}_k \]

which we initialize to zero and then update with the following formula:

\[ B_{k+1} = B_k + \frac{v v^T}{v^T s}; \quad\quad v \equiv J_{k+1}^T r_{k+1} - J_k^T r_k \]

Unlike the more common rank-2 BFGS update formula, SR1 updates are not guaranteed to produce a positive definite Hessian. This can yield more accurate approximations of the Hessian (and hence more accurate covariance matrices), but it rules out line-search methods and the simple dog-leg approach to the trust region problem. As a result, we should require fewer steps to converge, but spend more time computing each step; this is ideal when we expect the time spent in function evaluation to dominate the time per step anyway.
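Note: to make the update concrete, the following is a minimal sketch of the SR1 correction step written directly in Eigen. It is not part of the Optimizer API; the helper name sr1Update and the guard threshold are assumptions. It implements the formula above, skipping the update when the denominator \(v^T s\) is close to zero, a standard safeguard for SR1 methods.

    #include <cmath>
    #include <Eigen/Core>

    // Sketch of the SR1 correction update described above (hypothetical helper):
    //   B_{k+1} = B_k + (v v^T) / (v^T s),  v = J_{k+1}^T r_{k+1} - J_k^T r_k
    // The update is skipped when |v^T s| is tiny relative to |v| |s|, since the
    // rank-1 correction is then numerically unstable.
    void sr1Update(
        Eigen::MatrixXd & B,           // correction matrix B_k, updated in place
        Eigen::VectorXd const & s,     // step s = x_{k+1} - x_k
        Eigen::VectorXd const & gNew,  // J_{k+1}^T r_{k+1}
        Eigen::VectorXd const & gOld   // J_k^T r_k
    ) {
        Eigen::VectorXd v = gNew - gOld;
        double denom = v.dot(s);
        if (std::abs(denom) > 1e-8 * v.norm() * s.norm()) {
            B += (v * v.transpose()) / denom;
        }
    }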
Public Types

enum StateFlags

Values:

CONVERGED_GRADZERO = 0x0001
CONVERGED_TR_SMALL = 0x0002
CONVERGED = CONVERGED_GRADZERO | CONVERGED_TR_SMALL
FAILED_MAX_INNER_ITERATIONS = 0x0010
FAILED_MAX_OUTER_ITERATIONS = 0x0020
FAILED_MAX_ITERATIONS = 0x0030
FAILED_EXCEPTION = 0x0040
FAILED_NAN = 0x0080
STATUS_STEP_REJECTED = 0x0100
STATUS_STEP_ACCEPTED = 0x0200
STATUS_STEP = STATUS_STEP_REJECTED | STATUS_STEP_ACCEPTED
STATUS_TR_UNCHANGED = 0x1000
STATUS_TR_DECREASED = 0x2000
STATUS_TR_INCREASED = 0x4000
STATUS_TR = STATUS_TR_UNCHANGED | STATUS_TR_DECREASED | STATUS_TR_INCREASED
STATUS = STATUS_STEP | STATUS_TR
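Note: these enumerators are bit masks; the composite values (CONVERGED, FAILED_MAX_ITERATIONS, STATUS_STEP, STATUS_TR, STATUS) are bitwise unions of their components, so individual conditions are read out of the state word with bitwise AND. A small illustrative sketch follows, assuming the header lives at lsst/meas/modelfit/optimizer.h; the helper name describeState is hypothetical.

    #include <string>
    #include "lsst/meas/modelfit/optimizer.h"

    // Hypothetical helper: decode the per-step status bits defined above.
    // Each flag occupies its own bit, so several can be set at once
    // (e.g. STATUS_STEP_REJECTED together with STATUS_TR_DECREASED).
    std::string describeState(int state) {
        using lsst::meas::modelfit::Optimizer;
        std::string msg;
        if (state & Optimizer::STATUS_STEP_ACCEPTED) msg += "step accepted; ";
        if (state & Optimizer::STATUS_STEP_REJECTED) msg += "step rejected; ";
        if (state & Optimizer::STATUS_TR_INCREASED) msg += "trust region increased";
        if (state & Optimizer::STATUS_TR_DECREASED) msg += "trust region decreased";
        if (state & Optimizer::STATUS_TR_UNCHANGED) msg += "trust region unchanged";
        return msg;
    }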
typedef OptimizerObjective Objective

typedef OptimizerControl Control

typedef OptimizerHistoryRecorder HistoryRecorder
Public Functions
Optimizer(PTR(Objective const) objective, ndarray::Array<Scalar const, 1, 1> const & parameters, Control const & ctrl)
bool step()

bool step(HistoryRecorder const & recorder, afw::table::BaseCatalog & history)

int run()

int run(HistoryRecorder const & recorder, afw::table::BaseCatalog & history)

int getState() const
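Note: a hedged sketch of the intended call sequence follows, constructing an Optimizer, running it to completion, and checking the final state. The function name fitExample is hypothetical, Scalar is the package's floating-point typedef, and run() is assumed to iterate step() until a CONVERGED_* or FAILED_* flag is set.

    #include "lsst/meas/modelfit/optimizer.h"

    namespace mf = lsst::meas::modelfit;

    // Hypothetical driver: construct, run, and interpret the final state.
    bool fitExample(
        PTR(mf::Optimizer::Objective const) objective,
        ndarray::Array<mf::Scalar const, 1, 1> const & initialParameters,
        mf::Optimizer::Control const & ctrl
    ) {
        mf::Optimizer optimizer(objective, initialParameters, ctrl);
        optimizer.run();
        int state = optimizer.getState();
        if (state & mf::Optimizer::CONVERGED) {
            return true;  // gradient reached zero and/or trust region shrank below threshold
        }
        // FAILED_MAX_ITERATIONS, FAILED_EXCEPTION, or FAILED_NAN
        return false;
    }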
void removeSR1Term()

Remove the symmetric-rank-1 secant term from the Hessian, making it just \(J^T J\). Because SR1 updates are not guaranteed to produce a positive definite Hessian (see above), removing \(B_k\) restores the positive semidefinite Gauss-Newton approximation.