chop.optim

Full-gradient optimizers.

This module contains full gradient optimizers in PyTorch. These optimizers expect to be called on variables of shape (batch_size, *), and will perform the optimization point-wise over the batch.

This API is inspired by the COPT project https://github.com/openopt/copt.

Functions

backtracking_pgd(closure, prox, step_size, …)

minimize_frank_wolfe(closure, x0, lmo[, …])

Performs the Frank-Wolfe algorithm on a batch of objectives of the form

minimize_pgd(closure, x0, prox[, step, …])

Performs Projected Gradient Descent on a batch of objectives of the form:

minimize_pgd_madry(closure, x0, prox, lmo[, …])

minimize_three_split(closure, x0[, prox1, …])

Davis-Yin three operator splitting method.

chop.optim.minimize_frank_wolfe(closure, x0, lmo, step='sublinear', max_iter=200, callback=None, *args, **kwargs)[source]
Performs the Frank-Wolfe algorithm on a batch of objectives of the form

min_x f(x) s.t. x in C

where we have access to the Linear Minimization Oracle (LMO) of the constraint set C, and the gradient of f through closure.

Parameters
  • closure – callable. Returns the function values and the Jacobian of f.

  • x0 – torch.Tensor of shape (batch_size, *). Initial guess.

  • lmo – callable. Returns update_direction, max_step_size.

  • step – float or ‘sublinear’. Step-size scheme to be used.

  • max_iter – int. Maximum number of iterations.

  • callback – callable (optional). Any callable called on locals() at the end of each iteration. Often used for logging.
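
Example (illustrative sketch). The snippet below minimizes a batched least-squares objective over an L1 ball. It assumes COPT-style calling conventions that are not fully spelled out on this page: closure(x, return_gradient=True) returns one objective value per datapoint together with the gradient, and lmo(u, x), with u the negative gradient, returns (update_direction, max_step_size). Check these conventions against the source before relying on them.

    import torch
    from chop import optim

    batch_size, d = 4, 10
    target = torch.randn(batch_size, d)
    radius = 1.

    def closure(x, return_gradient=True):
        # Batched least-squares objective: one value per datapoint, shape (batch_size,).
        val = 0.5 * ((x - target) ** 2).sum(dim=1)
        if not return_gradient:
            return val
        return val, x - target

    def lmo(u, x):
        # Hypothetical L1-ball LMO. Assumed convention (as in COPT): u is the
        # negative gradient and the return value is (s - x, max_step_size),
        # where s maximizes <u, s> over the constraint set, pointwise over the batch.
        u_flat = u.flatten(1)
        rows = torch.arange(u_flat.size(0))
        idx = u_flat.abs().argmax(dim=1)
        s = torch.zeros_like(u_flat)
        s[rows, idx] = radius * u_flat[rows, idx].sign()
        return s.view_as(x) - x, torch.ones(x.size(0))

    x0 = torch.zeros(batch_size, d)
    result = optim.minimize_frank_wolfe(closure, x0, lmo, step='sublinear', max_iter=200)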

chop.optim.minimize_pgd(closure, x0, prox, step='backtracking', max_iter=200, max_iter_backtracking=1000, backtracking_factor=0.6, tol=1e-08, *prox_args, callback=None)[source]
Performs Projected Gradient Descent on a batch of objectives of the form:

min_x f(x) + g(x).

We suppose we have access to gradient computation for f through closure, and to the proximal operator of g in prox.

Parameters
  • closure – callable. Returns the function values and the gradient of f.

  • x0 – torch.Tensor of shape (batch_size, *). Initial guess.

  • prox – callable. Proximal operator of g.

  • step – ‘backtracking’ or float or torch.Tensor of shape (batch_size,) or None. Step size to be used. If None, it will be estimated at the beginning using line search. If ‘backtracking’, it will be estimated at each step using backtracking line search.

  • max_iter – int. Number of iterations to perform.

  • max_iter_backtracking – int. Maximum number of iterations in the backtracking line search.

  • backtracking_factor – float. Factor by which to multiply the step sizes during line search.

  • tol – float. Stops the algorithm when the certificate is smaller than tol for all datapoints in the batch.

  • prox_args – tuple (optional). Additional positional arguments for prox.

  • callback – callable (optional). Any callable called on locals() at the end of each iteration. Often used for logging.
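
Example (illustrative sketch). The call below assumes the same closure convention as above (one objective value per datapoint, plus the gradient) and a prox(x, step_size, *prox_args) signature; here prox is the projection onto an L-infinity ball, i.e. the proximal operator of its indicator function. Verify the exact signatures against the source.

    import torch
    from chop import optim

    batch_size, d = 4, 10
    target = torch.randn(batch_size, d)

    def closure(x, return_gradient=True):
        # Smooth term f: batched least squares, one value per datapoint.
        val = 0.5 * ((x - target) ** 2).sum(dim=1)
        if not return_gradient:
            return val
        return val, x - target

    def prox(x, step_size=None, *args):
        # Proximal operator of g = indicator of the L-infinity ball of radius 0.5,
        # i.e. the projection onto that ball, applied pointwise over the batch.
        return x.clamp(-0.5, 0.5)

    x0 = torch.zeros(batch_size, d)
    result = optim.minimize_pgd(closure, x0, prox, step='backtracking', max_iter=100)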

chop.optim.minimize_three_split(closure, x0, prox1=None, prox2=None, tol=1e-06, max_iter=1000, verbose=0, callback=None, line_search=True, step=None, max_iter_backtracking=100, backtracking_factor=0.7, h_Lipschitz=None, *args_prox)[source]

Davis-Yin three operator splitting method. This algorithm can solve problems of the form

minimize_x f(x) + g(x) + h(x)

where f is a smooth function and g and h are (possibly non-smooth) functions for which the proximal operator is known.

Remark: this method returns x = prox1(…). If g and h are two indicator functions, this method only guarantees that x is feasible for the first. Therefore, if one of the constraints is a hard constraint, make sure to pass it to prox1.

Parameters
  • closure – callable Returns the function values and gradient of the objective function. With return_gradient=False, returns only the function values. Shape of return value: (batch_size, *)

  • x0 – torch.Tensor(shape: (batch_size, *)) Initial guess

  • prox1 – callable or None. prox1(x, step_size, *args) returns the proximal operator of g at x with parameter step_size. step_size can be a scalar or of shape (batch_size,).

  • prox2 – callable or None. prox2(x, step_size, *args) returns the proximal operator of h at x with parameter step_size. step_size can be a scalar or of shape (batch_size,).

  • tol – float Tolerance of the stopping criterion.

  • max_iter – int Maximum number of iterations.

  • verbose – int. Verbosity level, from 0 (no output) to 2 (output at each iteration).

  • callback – callable (optional). Called with locals() at each step of the algorithm. The algorithm will exit if callback returns False.

  • line_search – boolean Whether to perform line-search to estimate the step sizes.

  • step – float or torch.Tensor of shape (batch_size,) or None. Starting value(s) for the line-search procedure. If None, the step size will be estimated for each datapoint in the batch.

  • max_iter_backtracking – int. Maximum number of backtracking iterations, used in the line search.

  • backtracking_factor – float the amount to backtrack by during line search.

  • args_prox – iterable (optional) Extra arguments passed to the prox functions.

  • kwargs_prox – dict (optional) Extra keyword arguments passed to the prox functions.

Returns

res – The optimization result represented as a scipy.optimize.OptimizeResult object. Important attributes are: x, the solution tensor; success, a Boolean flag indicating if the optimizer exited successfully; and message, which describes the cause of the termination. See scipy.optimize.OptimizeResult for a description of other attributes.

Return type

OptimizeResult
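
Example (illustrative sketch). Following the parameter descriptions above, the snippet below splits a hard box constraint (passed to prox1, per the remark above) and an L1 penalty (handled by prox2) around a smooth batched least-squares term. The prox signatures prox(x, step_size, *args) and the OptimizeResult attributes are taken from this page; verify the details against the source.

    import torch
    from chop import optim

    batch_size, d = 4, 10
    target = torch.randn(batch_size, d)

    def closure(x, return_gradient=True):
        # Smooth term f: batched least squares, one value per datapoint.
        val = 0.5 * ((x - target) ** 2).sum(dim=1)
        if not return_gradient:
            return val
        return val, x - target

    def prox1(x, step_size=None, *args):
        # Hard constraint g: projection onto the L-infinity ball of radius 0.5.
        # Passed as prox1 so the returned iterate is feasible for it (see the remark above).
        return x.clamp(-0.5, 0.5)

    def prox2(x, step_size=1., *args):
        # Soft term h = ||.||_1: soft-thresholding with threshold step_size,
        # which may be a scalar or a tensor of shape (batch_size,).
        if torch.is_tensor(step_size):
            step_size = step_size.view(-1, *([1] * (x.dim() - 1)))
        return torch.sign(x) * torch.clamp(x.abs() - step_size, min=0.)

    x0 = torch.zeros(batch_size, d)
    res = optim.minimize_three_split(closure, x0, prox1=prox1, prox2=prox2, max_iter=500)
    # res.x is the solution (feasible for the prox1 constraint); res.success and
    # res.message describe the termination, per the OptimizeResult description above.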