NAME
optimize.h - numerical optimization routines
SYNOPSIS
This module provides a generic and uniform interface to optimization routines that can be used on a wide variety of problems. This package currently includes multi-dimensional search algorithms (steepest descent, conjugate gradient, and quasi-Newton) and line search routines (cubic interpolation, golden section, and a hybrid of the first two).
#include <optimize.h>
double opt_eval_func(OPTIMIZER *opt, double *weights);
double opt_eval_grad(OPTIMIZER *opt, double *weights);
double opt_lnsrch_cubic(OPTIMIZER *opt,
double *dir,
double stepsz);
double opt_lnsrch_golden(OPTIMIZER *opt,
double *dir,
double stepsz);
double opt_lnsrch_hybrid(OPTIMIZER *opt,
double *dir,
double stepsz);
void opt_levenberg_marquardt(OPTIMIZER *opt,
int state);
void opt_conjgrad_pr(OPTIMIZER *opt, int state);
void opt_conjgrad_fr(OPTIMIZER *opt, int state);
void opt_quasinewton_dfp(OPTIMIZER *opt, int state);
void opt_quasinewton_bfgs(OPTIMIZER *opt, int state);
void opt_gradient_descent(OPTIMIZER *opt, int state);
DESCRIPTION
With the suplied optimization routines, you can perform numerical optimization that's stops under a wide variety of user specified conditions, performs regular auxiliary functions through a hook mechanism, and computes a few useful statistics.
The source file graddesc.c gives a simple example of how an optimization routine should be written for this package. You can use that file as a template for creating additional optimization routines.
Type Declarations
The following types are defined in the header file optimize.h.
OPTIMIZER
typedef struct OPTIMIZER {
/*
* These fields *must* be specified for an optimization
* routine. They essentially specify the function that
* is to be optimized what method for optimization to use.
*
* The fields below contain: the size of the weight space,
* the function to minimize, a function to compute the
* gradient, pointers to get and set the weights and
* gradients without evaluating the function or the
* gradients, and the optimization engine to use.
*/
unsigned size;
double (*funcf)(void *obj);
double (*gradf)(void *obj);
double **weights, **grads;
void (*engine)(struct OPTIMIZER *opt, int state);
/*
* Optional fields. These allow you perform regular
* events and constrained optimization.
*
* An object to pass to some functions, a function to
* call on each epoch, a hook that is called when
* (epoch % hook_freq == 0), a function to check for
* special halting conditions, and an alternate line
* search algorithm.
*/
void *obj;
int (*hook)(void *obj);
unsigned hook_freq;
int (*haltf)(void *obj);
double (*stepf)(struct OPTIMIZER *opt, double *dir,
double stepsz);
/*
* These fields specify stopping criterion:
* the minimum number of iterations, the maximum number
* of iterations, stop if error falls below error_tol,
* stop if delta error falls below delta_error_tol.
*/
unsigned min_epochs, max_epochs;
double error_tol, delta_error_tol;
/*
* The fields below are for stochastic optimization
* routines. If the stochastic field is non-zero, then
* the decayed halting criteria will also be checked by
* the optimization procedures. The value of decay
* specifies how rapidly the decayed statistics age. As
* an example, given an instantaneous error, the new
* value of the decayed error is set to (old * decay +
* (1 - decay) * error). The seed field is for internal
* use only as it allows the routines to compute
* consistent function evaluations from one step to the
* next.
*/
double decayed_error_tol, decayed_delta_error_tol, decay;
unsigned stochastic, seed;
/*
* Information for online techniques: fixed step size
* and the momentum.
*/
double rate, momentum;
/*
* Weight decay term which augments the error
* function with a sum of the squared weights times
* wdecay. The gradient function is also modified
* accordingly.
*/
double wdecay;
/*
* These are statistics filled in by the optimization
* routines: the number of calls made to funcf(), the
* number of calls made to gradf(), the current epoch,
* error and delta erors (dacayed and not decayed),
* the magnitude of the gradient vector, and the last
* step size used by a line search routine.
*/
unsigned fcalls, gcalls, epoch;
double error, delta_error;
double decayed_error, decayed_delta_error;
double gradmag, stepsz;
/*
* Since the optimization engine and the line search
* routines can halt for a variety of reasons, the
* last results are stored in the feilds below.
*/
unsigned stepf_result, engine_result;
/*
* The next five auxiliary fields are for use by
* user-defined engines and line search routines
* that may need additional parameters that don't
* quite fit anywhere in this structure.
*/
double aux1, aux2, aux3, aux4, aux5;
/*
* This is normally set to the pointer of the structure
* that "owns" this optimizer.
*/
void *owner;
/*
* This is an internal data structure that the optimization
* routines use to hold the progress of the optimization.
*/
void *internal;
} OPTIMIZER;
This is the structure that must be filled in by the user so as to specify all of the details as to how the optimization routines should work. Each optimization routines takes a pointer to this structure as a single argument.
Function Definitions
The following function prototypes are given in the header file optimize.h.
double opt_eval_func(OPTIMIZER *opt, double *weights);
Evaluates the objective function, opt->funcf(opt->obj), and sets opt->val, opt->goodfval, and opt->fcalls appropriately. The last evaluated value of the objective function is returned. If weights is non-NULL, then its values are copied on top of opt->weights prior to doing anything.
double opt_eval_grad(OPTIMIZER *opt, double *weights);
Evaluates the gradient function, opt->funcg(opt->obj), and sets opt->val, opt->goodfval, opt->goodgval, and opt->gcalls appropriately. The last evaluated value of the objective function is returned. If weights is non-NULL, then its values are copied on top of opt->weights prior to doing anything.
Optimizes opt by calling opt->engine. All calculations for statistics and checking halting conditions is done here.
double opt_lnsrch_cubic(OPTIMIZER *opt,
double *dir,
double stepsz);
Line search with cubic interpolation. The point defined by opt->weights is searched in the direction of dir with the initial guess for a step size contained in stepsz. If stepsz is zero, then the routine will make its own guess. The weights are changed to opt->weights + alpha * dir where alpha is the optimimal step size (which is also the value that is returned by the function). This routine assumes that the current gradient values are valid for the current weights.
double opt_lnsrch_golden(OPTIMIZER *opt,
double *dir,
double stepsz);
Line search via the golden section algorithm. The point defined by opt->weights is searched in the direction of dir with the initial guess for a step size contained in stepsz. If stepsz is zero, then the routine will make its own guess. The weights are changed to opt->weights + alpha * dir where alpha is the optimimal step size (which is also the value that is returned by the function). This routine assumes that the current gradient values are valid for the current weights.
double opt_lnsrch_hybrid(OPTIMIZER *opt,
double *dir,
double stepsz);
Line search via a relatively inaccurate hyrbrid algorithm. The point defined by opt->weights is searched in the direction of dir with the initial guess for a step size contained in stepsz. If stepsz is zero, then the routine will make its own guess. The weights are changed to opt->weights + alpha * dir where alpha is the optimimal step size (which is also the value that is returned by the function). The algorithm performs a simple but thorough search to bracket the minimum then fits a parabola to the last three locations of the search space. If the bottom of the parabola evalutes to a lower value, then it is accepted as the final value; otherwise, the bast value from the bracketing procedure is used. This routine assumes that the current gradient values are valid for the current weights.
void opt_levenberg_marquardt(OPTIMIZER *opt,
int state);
Levenberg-Marquardt. This routine is currently broken.
void opt_conjgrad_pr(OPTIMIZER *opt, int state);
Polak-Ribiere deterministic conjugate gradient.
void opt_conjgrad_fr(OPTIMIZER *opt, int state);
Fletcher-Reeves deterministic conjugate gradient.
void opt_quasinewton_dfp(OPTIMIZER *opt, int state);
Davidson-Fletcher-Powell quasi-Newton.
void opt_quasinewton_bfgs(OPTIMIZER *opt, int state);
Broyden-Fletcher-Goldfarb-Shanno quasi-Newton.
void opt_gradient_descent(OPTIMIZER *opt, int state);
Batched gradient descent with momentum.
AUTHOR
Gary William Flake (gary.flake@usa.net).