Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm
BFGS is a quasi-Newton line-search method of the second-derivative family and one of the most powerful methods for solving unconstrained optimization problems.
Like any Newton-type method, BFGS uses a quadratic Taylor approximation of the objective function in a d-vicinity of x:
f(x + d) ≈ q(d) = f(x) + dᵀg(x) + ½ dᵀH(x) d,
where g(x) is the gradient vector and H(x) is the Hessian matrix.
The necessary condition for a local minimum of q(d) with respect to d results in the linear system:
g(x) + H(x) d = 0,
which, in turn, gives the Newton direction d for line search:
d = −H(x)⁻¹ g(x).
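For concreteness, here is a minimal plain-C# sketch (independent of the FinMath API) that computes the Newton direction for a two-variable problem by solving H d = −g with the closed-form 2×2 inverse; the gradient and Hessian values are made-up example data:

```csharp
using System;

class NewtonDirectionExample
{
    static void Main()
    {
        // Example data (illustrative only): gradient g and a symmetric
        // positive definite Hessian H at the current point x.
        double g0 = 1.0, g1 = -2.0;
        double h00 = 4.0, h01 = 1.0, h10 = 1.0, h11 = 3.0;

        // d = -H^{-1} g, using the closed-form inverse of a 2x2 matrix.
        double det = h00 * h11 - h01 * h10;        // 11.0, nonzero since H is SPD
        double d0 = -( h11 * g0 - h01 * g1) / det;
        double d1 = -(-h10 * g0 + h00 * g1) / det;

        Console.WriteLine($"d = ({d0:0.0000}, {d1:0.0000})"); // d = (-0.4545, 0.8182)
    }
}
```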
The exact Newton direction (which Newton-type methods compute explicitly) is reliable when:
- the Hessian matrix exists and is positive definite;
- the difference between the true objective function and its quadratic approximation is not too large.
In quasi-Newton methods, the idea is to use matrices that approximate the Hessian matrix and/or its inverse instead of computing the Hessian exactly (as Newton-type methods do). These matrices are conventionally denoted B ≈ H and D ≈ H⁻¹.
The matrices are adjusted on each iteration and can be produced in many different ways ranging from very simple techniques to highly advanced schemes.
The BFGS method uses the BFGS updating formula, whose iterates Bₖ converge to H(x*):

Bₖ₊₁ = Bₖ − (Bₖ sₖ sₖᵀ Bₖ) / (sₖᵀ Bₖ sₖ) + (yₖ yₖᵀ) / (yₖᵀ sₖ),

where sₖ = xₖ₊₁ − xₖ,
yₖ = gₖ₊₁ − gₖ.
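In practice, implementations often maintain the inverse approximation D ≈ H⁻¹ instead of B, so that the search direction d = −D g needs no linear solve; the equivalent inverse update is Dₖ₊₁ = (I − ρ sₖ yₖᵀ) Dₖ (I − ρ yₖ sₖᵀ) + ρ sₖ sₖᵀ with ρ = 1/(yₖᵀ sₖ). A minimal plain-C# sketch of this update (not part of the FinMath API), with the matrix stored as a plain 2-D array:

```csharp
using System;

static class BfgsUpdate
{
    // One BFGS update of the inverse-Hessian approximation D, given the step
    // s = x_{k+1} - x_k and the gradient change y = g_{k+1} - g_k.
    public static double[,] InverseUpdate(double[,] D, double[] s, double[] y)
    {
        int n = s.Length;

        double sy = 0.0;                          // curvature s^T y
        for (int i = 0; i < n; i++) sy += s[i] * y[i];
        if (sy <= 1e-12) return D;                // skip update if curvature is too small

        double rho = 1.0 / sy;

        double[] t = new double[n];               // t = D y
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                t[i] += D[i, j] * y[j];

        double yDy = 0.0;                         // y^T D y
        for (int i = 0; i < n; i++) yDy += y[i] * t[i];

        // Expanded form: D' = D - rho (t s^T + s t^T) + (rho^2 y^T D y + rho) s s^T.
        var Dn = new double[n, n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                Dn[i, j] = D[i, j]
                         - rho * (t[i] * s[j] + s[i] * t[j])
                         + (rho * rho * yDy + rho) * s[i] * s[j];
        return Dn;
    }
}
```

The two nested loops make each update O(n²), which matches the per-iteration cost estimate below.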
The BFGS method exhibits superlinear convergence; the computational cost is estimated as O(n²) per iteration for an n-component argument vector (versus O(n³) for Newton-type methods, which must factorize the exact Hessian).
The BFGS algorithm is implemented by the BFGS class. The implementation is a two-stage iterative process following the general line-search strategy (a sketch of the inner stage follows the list):
- 'Outer' iterations define the BFGS-based line-search direction;
- 'Inner' iterations solve the step-length subproblem using either the default (strong Wolfe) approach or a user-supplied implementation.
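For illustration, here is a self-contained sketch of the 'inner' step-length stage. The library's default uses the strong Wolfe conditions; the simpler Armijo backtracking scheme below (not part of the FinMath API) conveys the idea:

```csharp
using System;

static class StepLength
{
    // Backtracking line search with the Armijo (sufficient decrease) condition.
    // f: objective, x: current point, d: descent direction, g: gradient at x.
    public static double Backtrack(Func<double[], double> f, double[] x, double[] d, double[] g)
    {
        const double c1 = 1e-4;                   // sufficient-decrease constant
        double alpha = 1.0;
        int n = x.Length;

        double fx = f(x);
        double slope = 0.0;                       // directional derivative g^T d (< 0)
        for (int i = 0; i < n; i++) slope += g[i] * d[i];

        var trial = new double[n];
        for (int iter = 0; iter < 50; iter++)
        {
            for (int i = 0; i < n; i++) trial[i] = x[i] + alpha * d[i];
            if (f(trial) <= fx + c1 * alpha * slope)
                return alpha;                     // Armijo condition satisfied
            alpha *= 0.5;                         // shrink the step and retry
        }
        return alpha;                             // fall back to the smallest trial step
    }
}
```

The strong Wolfe approach additionally enforces a curvature condition on the gradient at the new point, which helps keep the BFGS curvature term sₖᵀyₖ positive.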
To initialize an instance of the class, use one of the two constructors provided; both require the objective function and its gradient to be supplied as delegate methods:
BFGS(Int32, QuasiNewtonFunctionDelegate, QuasiNewtonGradientDelegate)
BFGS(Int32, QuasiNewtonFunctionDelegate, QuasiNewtonGradientDelegate, LineSearch)
The BFGS optimizer is controlled by stopping criteria and one additional parameter, exposed as class properties. Default values are assigned during initialization and are suitable for most practical applications; customize them, if necessary, before running the optimization:
Property | Default Value | Restriction | Description |
---|---|---|---|
ArgumentTolerance | 1E-8 | 0.0 .. 0.01 | Termination tolerance on the argument change (lower bound): the process stops when a regular iteration produces a smaller change of the argument. |
FunctionTolerance | 1E-8 | 0.0 .. 0.01 | Termination tolerance on the function value change (lower bound): the process stops when a regular iteration produces a smaller decrease of the objective function. |
IterationsLimit | 10000 | >1 | The maximal number of iterations. |
 | MaxValue | >1 | The period for resetting the Hessian approximation to the identity matrix: this may be needed after a large number of iterations to keep the approximation positive definite, to avoid accumulation of errors, or to ensure the optimum is found. |
The following class properties present optimization results:
Property | Description |
---|---|
MinimumPoint | Contains the optimal solution. |
IterationsDone | The number of iterations performed before algorithm termination. |
The sample below demonstrates BFGS minimization of the (generalized) Rosenbrock function:

f(x) = Σᵢ₌₁ⁿ⁻¹ [ (1 − xᵢ)² + 100 (xᵢ₊₁ − xᵢ²)² ].

The components of its gradient are:

∂f/∂xᵢ = −2 (1 − xᵢ) − 400 xᵢ (xᵢ₊₁ − xᵢ²) + 200 (xᵢ − xᵢ₋₁²),

where the first two terms are absent for i = n and the last term is absent for i = 1. Both the function and its gradient should be presented as delegate methods.
```csharp
using System;
using FinMath.LinearAlgebra;
using FinMath.NumericalOptimization.Unconstrained;

namespace FinMath.Samples
{
    class BFGSOptimization
    {
        static void Main()
        {
            // Domain dimension (the number of arguments of the objective function).
            Int32 n = 3;

            // Initialize BFGS optimizer for the Rosenbrock function.
            BFGS bfgs = new BFGS(n, RosenbrockFunction, RosenbrockGradient);

            // Define the start point.
            Vector startPoint = Vector.Random(n);
            Console.WriteLine($"Start Point = {startPoint.ToString("0.000000")}");

            // Set up stopping criteria (if necessary).
            bfgs.ArgumentTolerance = 1.0E-9;
            bfgs.FunctionTolerance = 1.0E-9;
            bfgs.IterationsLimit = 1000;

            // Run the optimization.
            bfgs.Minimize(startPoint);

            // Check the computation status and get the results.
            if (bfgs.Status == FMStatus.MethodSucceeded)
            {
                Console.WriteLine($"Minimum Point = {bfgs.MinimumPoint.ToString("0.000000")}");
                Console.WriteLine($"Objective Function = {RosenbrockFunction(bfgs.MinimumPoint):0.000000}");
                Console.WriteLine();
                Console.WriteLine($"Iterations Done = {bfgs.IterationsDone}");
            }
            else
            {
                Console.WriteLine($"BFGS failed with status {bfgs.Status}");
            }
        }

        // Generalized Rosenbrock function: f(x) = sum over i of (1 - x_i)^2 + 100 (x_{i+1} - x_i^2)^2.
        private static Double RosenbrockFunction(Vector x)
        {
            Int32 n = x.Count;
            Double y = 0.0;
            for (Int32 i = 0; i < n - 1; i += 1)
            {
                y += Math.Pow(1.0 - x[i], 2.0) + 100.0 * Math.Pow(x[i + 1] - Math.Pow(x[i], 2.0), 2.0);
            }
            return y;
        }

        // Gradient of the Rosenbrock function, written into the preallocated vector g.
        private static void RosenbrockGradient(Vector x, Vector g)
        {
            Int32 n = x.Count;

            // Boundary components: the first lacks the x_{i-1} term, the last lacks the x_{i+1} term.
            g[0] = -2.0 * (1.0 - x[0]) - 400.0 * (x[1] - Math.Pow(x[0], 2.0)) * x[0];
            g[n - 1] = 200.0 * (x[n - 1] - Math.Pow(x[n - 2], 2.0));

            // Interior components combine both neighbor terms.
            for (Int32 i = 1; i < n - 1; i += 1)
            {
                g[i] = -2.0 * (1.0 - x[i]) - 400.0 * (x[i + 1] - Math.Pow(x[i], 2.0)) * x[i]
                     + 200.0 * (x[i] - Math.Pow(x[i - 1], 2.0));
            }
        }
    }
}
```