chop.optim

Full-gradient optimizers.
This module contains full gradient optimizers in PyTorch. These optimizers expect to be called on variables of shape (batch_size, *), and will perform the optimization point-wise over the batch.
This API is inspired by the COPT project https://github.com/openopt/copt.
Functions

minimize_frank_wolfe(closure, x0, lmo, ...) – Performs the Frank-Wolfe algorithm on a batch of objectives of the form min_x f(x) s.t. x in C.
minimize_pgd(closure, x0, prox, ...) – Performs projected gradient descent on a batch of objectives of the form min_x f(x) + g(x).
minimize_three_split(closure, x0, ...) – Davis-Yin three operator splitting method.
chop.optim.minimize_frank_wolfe(closure, x0, lmo, step='sublinear', max_iter=200, callback=None, *args, **kwargs)

Performs the Frank-Wolfe algorithm on a batch of objectives of the form
min_x f(x) s.t. x in C
where we have access to the Linear Minimization Oracle (LMO) of the constraint set C, and the gradient of f through closure.
Parameters

closure – callable. Returns the function values and the Jacobian of f.
x0 – torch.Tensor of shape (batch_size, *). Initial guess.
lmo – callable. Returns update_direction, max_step_size.
step – float or 'sublinear'. Step-size scheme to be used.
max_iter – int. Maximum number of iterations.
callback – callable (optional). Called on locals() at the end of each iteration; often used for logging.
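As a rough usage sketch, the following shows the expected calling convention. The quadratic objective, the unit L∞-ball LMO, and the assumption that the LMO receives the negative gradient (as in COPT) are illustrative, not part of the library:

    import torch
    from chop.optim import minimize_frank_wolfe

    def closure(x, return_gradient=True):
        # Batch of quadratic objectives f(x) = 0.5 * ||x - 1||^2,
        # evaluated point-wise over the batch (illustrative assumption).
        target = torch.ones_like(x)
        val = 0.5 * ((x - target) ** 2).flatten(1).sum(dim=1)
        if not return_gradient:
            return val
        return val, x - target

    def lmo(u, iterate):
        # LMO of the unit L-infinity ball: argmax of <u, s> over ||s||_inf <= 1,
        # assuming (as in COPT) that u is the negative gradient.
        # Returns the update direction s - iterate and a max step size of 1.
        vertex = torch.sign(u)
        return vertex - iterate, torch.ones(iterate.size(0))

    x0 = torch.zeros(4, 3)  # batch of 4 three-dimensional problems
    minimize_frank_wolfe(closure, x0, lmo, step='sublinear', max_iter=200)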
chop.optim.minimize_pgd(closure, x0, prox, step='backtracking', max_iter=200, max_iter_backtracking=1000, backtracking_factor=0.6, tol=1e-08, *prox_args, callback=None)

Performs projected gradient descent on a batch of objectives of the form
min_x f(x) + g(x)
We assume access to the gradient of f through closure, and to the proximal operator of g through prox.
Parameters

closure – callable. Returns the function values and gradient of f.
x0 – torch.Tensor of shape (batch_size, *). Initial guess.
prox – callable. Proximal operator of g.
step – 'backtracking', float, torch.Tensor of shape (batch_size,), or None. Step size to be used. If None, it is estimated once at the beginning using a line search. If 'backtracking', it is estimated at each step using a backtracking line search.
max_iter – int. Maximum number of iterations to perform.
max_iter_backtracking – int. Maximum number of iterations in the backtracking line search.
backtracking_factor – float. Factor by which to multiply the step sizes during the line search.
tol – float. The algorithm stops when the certificate is smaller than tol for all datapoints in the batch.
prox_args – tuple (optional). Additional arguments for prox.
callback – callable (optional). Called on locals() at the end of each iteration; often used for logging.
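A minimal sketch, assuming prox follows the prox(x, step_size, *args) convention documented for minimize_three_split below, and taking g to be an L1 penalty whose proximal operator is soft-thresholding (the objective and the penalty weight 0.1 are illustrative):

    import torch
    from chop.optim import minimize_pgd

    def closure(x, return_gradient=True):
        # Smooth term f(x) = 0.5 * ||x - 2||^2 per batch element (illustrative).
        target = torch.full_like(x, 2.0)
        val = 0.5 * ((x - target) ** 2).flatten(1).sum(dim=1)
        if not return_gradient:
            return val
        return val, x - target

    def prox(x, step_size, *args):
        # Proximal operator of g(x) = 0.1 * ||x||_1: soft-thresholding.
        # step_size may be a scalar or a tensor of shape (batch_size,).
        t = torch.as_tensor(step_size, dtype=x.dtype).view(-1, *[1] * (x.dim() - 1))
        return torch.sign(x) * torch.clamp(x.abs() - 0.1 * t, min=0.0)

    x0 = torch.zeros(4, 3)
    minimize_pgd(closure, x0, prox, step='backtracking')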
chop.optim.minimize_three_split(closure, x0, prox1=None, prox2=None, tol=1e-06, max_iter=1000, verbose=0, callback=None, line_search=True, step=None, max_iter_backtracking=100, backtracking_factor=0.7, h_Lipschitz=None, *args_prox)

Davis-Yin three operator splitting method. This algorithm can solve problems of the form
min_x f(x) + g(x) + h(x)
where f is a smooth function and g and h are (possibly non-smooth) functions for which the proximal operator is known.
Remark: this method returns x = prox1(…). If g and h are both indicator functions, this method only guarantees that x is feasible for the first. Therefore, if one of the constraints is a hard constraint, make sure to pass it to prox1.
Parameters

closure – callable. Returns the function values and gradient of the objective function. With return_gradient=False, returns only the function values. Shape of return value: (batch_size, *).
x0 – torch.Tensor of shape (batch_size, *). Initial guess.
prox1 – callable or None. prox1(x, step_size, *args) returns the proximal operator of g at x with parameter step_size. step_size can be a scalar or of shape (batch_size,).
prox2 – callable or None. prox2(x, step_size, *args) returns the proximal operator of h at x with parameter step_size. step_size can be a scalar or of shape (batch_size,).
tol – float. Tolerance of the stopping criterion.
max_iter – int. Maximum number of iterations.
verbose – int. Verbosity level, from 0 (no output) to 2 (output on each iteration).
callback – callable (optional). Called with locals() at each step of the algorithm. The algorithm will exit if callback returns False.
line_search – bool. Whether to perform a line search to estimate the step sizes.
step – float, torch.Tensor of shape (batch_size,), or None. Starting value(s) for the line-search procedure. If None, the step size is estimated for each datapoint in the batch.
max_iter_backtracking – int. Maximum number of backtracking iterations, used in the line search.
backtracking_factor – float. The amount to backtrack by during the line search.
h_Lipschitz – float (optional). Lipschitz constant of h, if known; used in the stopping criterion.
args_prox – iterable (optional). Extra arguments passed to the prox functions.
Returns

res – The optimization result, represented as a scipy.optimize.OptimizeResult object. Important attributes are: x, the solution tensor; success, a Boolean flag indicating whether the optimizer exited successfully; and message, which describes the cause of the termination. See scipy.optimize.OptimizeResult for a description of other attributes.

Return type

scipy.optimize.OptimizeResult
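For example, one can combine a smooth quadratic with a box constraint and an L1 penalty. Per the remark above, the box constraint is passed as prox1 since it is the hard constraint; the objective and the penalty weight 0.1 below are illustrative assumptions, not library-provided code:

    import torch
    from chop.optim import minimize_three_split

    def closure(x, return_gradient=True):
        # Smooth term f(x) = 0.5 * ||x - 2||^2 per batch element (illustrative).
        target = torch.full_like(x, 2.0)
        val = 0.5 * ((x - target) ** 2).flatten(1).sum(dim=1)
        if not return_gradient:
            return val
        return val, x - target

    def prox1(x, step_size, *args):
        # g = indicator of the box [0, 1]^d; its prox is the projection.
        # Passed as prox1 because it is the hard constraint (see the remark).
        return x.clamp(0.0, 1.0)

    def prox2(x, step_size, *args):
        # h(x) = 0.1 * ||x||_1; its prox is soft-thresholding.
        t = torch.as_tensor(step_size, dtype=x.dtype).view(-1, *[1] * (x.dim() - 1))
        return torch.sign(x) * torch.clamp(x.abs() - 0.1 * t, min=0.0)

    x0 = torch.zeros(4, 3)
    res = minimize_three_split(closure, x0, prox1=prox1, prox2=prox2)
    print(res.x)  # feasible for the box constraint by construction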