Optimization Algorithms (Opti)
This section contains the optimization algorithm classes, which all derive from the abstract class Opti.
OptiADMM
- class Opti.OptiADMM
Bases: Opti
Alternating Direction Method of Multipliers [1] algorithm which minimizes a Cost of the form $$ C(\mathrm{x}) = F_0(\mathrm{x}) + \sum_{n=1}^N F_n(\mathrm{H_n x}) $$
- Parameters:
F_0 – Cost object
F_n – cell of N Cost with an implementation of the prox() for each one
H_n – cell of N LinOp
rho_n – array of N positive scalars
maxiterCG – maximal number of inner conjugate gradient (CG) iterations (when required, default 20)
solver – a handle function taking parameters solver(z_n,rho_n,x0) (see the note below)
OutOpCG – OutputOpti object for CG (when used)
ItUpOutCG – ItUpOut parameter for CG (when used, default 0)
All attributes of parent class Opti are inherited.
Principle The algorithm aims at minimizing the augmented Lagrangian formulation of the above problem: $$ \mathcal{L}(\mathrm{x,y_1,\ldots,y_N,w_1,\ldots,w_N}) = F_0(\mathrm{x}) + \sum_{n=1}^N \frac12\rho_n\Vert \mathrm{H_n x - y_n + w_n/\rho_n} \Vert^2 + F_n(\mathrm{y_n}) $$ using an alternating minimization scheme [1].
Note The minimization of \(\mathcal{L}\) over \(\mathrm{x}\), $$ F_0(\mathrm{x}) + \sum_{n=1}^N \frac12\rho_n\Vert \mathrm{H_n x - z_n}\Vert^2, \quad \mathrm{z_n = y_n - w_n/\rho_n}, $$ is performed either with the conjugate-gradient OptiConjGrad algorithm, with a direct inversion, or with the given solver. If \(F_0\) is empty or is a CostL2, and the LinOp \(\sum_{n=0}^N \mathrm{H_n}^*\mathrm{H_n}\) is not invertible, then OptiConjGrad is used by default unless a more efficient solver is provided; here \(\mathrm{H_0}\) is the LinOp associated to \(F_0\). Otherwise, the solver is required.
Reference
[1] Boyd, Stephen, et al. “Distributed optimization and statistical learning via the alternating direction method of multipliers.” Foundations and Trends in Machine Learning, 2011.
Example ADMM=OptiADMM(F0,Fn,Hn,rho_n,solver)
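For concreteness, here is a hypothetical TV-regularized least-squares reconstruction assembled for OptiADMM. This is only a sketch: the data y, the forward LinOp H, the weight lambda, the constructors CostL2, CostL1 and LinOpGrad, and the inherited run()/maxiter/xopt interface of Opti are assumptions about the surrounding library, not part of this class description.
% Hypothetical setup: minimize 0.5*||Hx - y||^2 + lambda*||Gx||_1 with F_0 empty,
% F_1 = L2 data term, F_2 = lambda*L1, H_1 = H (forward model), H_2 = gradient operator.
G     = LinOpGrad(H.sizein);                              % assumed finite-difference LinOp
Fn    = {CostL2(H.sizeout, y), lambda*CostL1(G.sizeout)}; % N=2 Costs with an implemented prox
Hn    = {H, G};                                           % N=2 LinOps
rho_n = [1, 1e-1];                                        % one positive scalar per term
ADMM  = OptiADMM([], Fn, Hn, rho_n);   % F_0 empty: the inner CG solver is used by default
ADMM.maxiter = 200;                    % inherited Opti attribute (assumed)
ADMM.run(zeros(H.sizein));             % result expected in ADMM.xopt (assumed)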
See also Opti, OptiConjGrad, OutputOpti, Cost
OptiChambPock
- class Opti.OptiChambPock
Bases: Opti
Chambolle-Pock optimization algorithm [1] which minimizes a Cost of the form $$ C(\mathrm{x}) = F(\mathrm{Hx}) + G(\mathrm{x}) $$
- Parameters:
F – a Cost with an implementation of the prox()
G – a Cost with an implementation of the prox()
H – a LinOp
tau – parameter of the algorithm (default 1)
sig – parameter of the algorithm, computed automatically if H.norm is different from -1
var – select the “bar” variable of the algorithm (see [1]):
if 1 (default), the primal variable \(\bar{\mathrm{x}} = 2\mathrm{x}_n - \mathrm{x}_{n-1}\) is used
if 2, the dual variable \(\bar{\mathrm{y}} = 2\mathrm{y}_n - \mathrm{y}_{n-1}\) is used
gam – if non-empty, the accelerated version is used (see [1]). This requires \(G\) or \(F^*\) to be uniformly convex with \(\nabla G^*\) or \(\nabla F\) 1/gam-Lipschitz:
if \(G\) is uniformly convex, set the parameter var to 1
if \(F^*\) is uniformly convex, set the parameter var to 2
All attributes of parent class Opti are inherited.
Note-1: In fact, \(F\) only needs the prox of its Fenchel transform, prox_fench() (which is implemented as soon as \(F\) has an implementation of the prox, see Cost).
Note-2: To ensure convergence (see [1]), parameters sig and tau have to verify $$ \sigma \times \tau \times \Vert \mathrm{H} \Vert^2 < 1 $$ where \(\Vert \mathrm{H}\Vert\) denotes the norm of the linear operator H.
When the accelerated version is used (i.e. the parameter gam is non-empty), sig and tau are updated at each iteration and the initial values (given by the user) have to verify $$ \sigma \times \tau \times \Vert \mathrm{H} \Vert^2 \leq 1 $$
Reference:
[1] Chambolle, Antonin, and Thomas Pock. “A first-order primal-dual algorithm for convex problems with applications to imaging.” Journal of Mathematical Imaging and Vision 40.1, pp 120-145 (2011).
Example CP=OptiChambPock(F,H,G)
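As an illustration of the convergence condition above, a hypothetical sketch; the objects y and H, the constructors CostL1 and CostNonNeg, the H.norm attribute and the inherited Opti interface are assumed.
% Hypothetical setup: minimize ||Hx - y||_1 + i_{>=0}(x) with Chambolle-Pock.
F  = CostL1(H.sizeout, y);                 % only the prox (of the Fenchel transform) is needed
G  = CostNonNeg(H.sizein);                 % indicator of the nonnegative orthant (assumed class)
CP = OptiChambPock(F, H, G);
CP.tau = 1;                                % then pick sig so that sig*tau*||H||^2 < 1
if H.norm ~= -1, CP.sig = 0.99/(CP.tau*H.norm^2); end
CP.maxiter = 300;                          % inherited Opti attribute (assumed)
CP.run(zeros(H.sizein));                   % result expected in CP.xopt (assumed)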
See also Opti, OutputOpti, Cost
OptiConjGrad
- class Opti.OptiConjGrad
Bases: Opti
Conjugate gradient optimization algorithm which solves the linear system \(\mathrm{Ax=b}\) by minimizing $$ C(\mathrm{x})= \frac12 \mathrm{x^TAx - b^Tx} $$
- Parameters:
A – a symmetric positive definite LinOp
b – right-hand side term
All attributes of parent class Opti are inherited.
Example CG=OptiConjGrad(A,b,OutOp)
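A hypothetical use on the normal equations of a weighted least-squares problem; the forward LinOp H, the weighting LinOp W, the data y, the adjoint/composition syntax and the inherited Opti interface are assumed.
% Hypothetical setup: solve (H'*W*H) x = H'*W*y.
A  = H' * W * H;                 % symmetric positive definite LinOp (assumed composition syntax)
b  = H' * (W * y);               % right-hand side term
CG = OptiConjGrad(A, b);
CG.maxiter = 50;                 % inherited Opti attribute (assumed)
CG.run(zeros(size(b)));          % result expected in CG.xopt (assumed)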
See also Opti, OutputOpti, Cost
OptiDouglasRachford
- class Opti.OptiDouglasRachford
Bases: Opti
Douglas-Rachford splitting optimization algorithm which minimizes a Cost of the form $$ C(\mathrm{x}) = F_1(\mathrm{x}) + F_2(\mathrm{L x}) $$
- Parameters:
F_1 – a Cost with an implementation of the prox()
F_2 – a Cost with an implementation of the prox()
L – a LinOp
gamma – parameter of the algorithm
lambda – relaxation parameter of the algorithm
All attributes of parent class Opti are inherited.
Example DR=OptiDouglasRachford(F1, F2, L, gamma, lambda)
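A hypothetical sparse-denoising sketch; the data y, the weight mu, the constructors CostL2, CostL1 and LinOpIdentity, and the inherited Opti interface are assumed.
% Hypothetical setup: minimize 0.5*||x - y||^2 + mu*||x||_1 (L is the identity here).
sz = size(y);
F1 = CostL2(sz, y);                           % prox available
F2 = mu * CostL1(sz);                         % prox available
L  = LinOpIdentity(sz);                       % so that F2(Lx) = F2(x)
DR = OptiDouglasRachford(F1, F2, L, 1, 1.5);  % gamma = 1, relaxation lambda = 1.5
DR.maxiter = 200;                             % inherited Opti attribute (assumed)
DR.run(zeros(sz));                            % result expected in DR.xopt (assumed)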
See also Opti, OutputOpti, Cost
OptiFBS
- class Opti.OptiFBS
Bases: Opti
Forward-Backward Splitting optimization algorithm [1] which minimizes a Cost of the form $$ C(\mathrm{x}) = F(\mathrm{x}) + G(\mathrm{x}) $$
- Parameters:
F – a differentiable Cost (i.e. with an implementation of applyGrad())
G – a Cost with an implementation of the applyProx()
gam – descent step
fista – boolean, true if the accelerated version FISTA [3] is used (default false)
momRestart – boolean, true if the momentum restart strategy [4] is used (default false)
updateGam – rule for updating gamma (none: default; reduced: the parameter gam is decreased according to \(\gamma / \sqrt{k}\); backtracking: backtracking rule following [3])
eta – parameter greater than 1 that is used with backtracking (see [3])
All attributes of parent class Opti are inherited.
Note: When the functionals are convex and \(F\) has a Lipschitz-continuous gradient, convergence is ensured by taking \(\gamma \in (0,2/L]\) where \(L\) is the Lipschitz constant of \(\nabla F\) (see [1]). When FISTA is used [3], \(\gamma\) should be in \((0,1/L]\). For nonconvex functions [2], take \(\gamma \in (0,1/L]\). If \(L\) is known (i.e. F.lip is different from -1), the parameter \(\gamma\) is automatically set to \(1/L\).
References:
[1] P.L. Combettes and V.R. Wajs, “Signal recovery by proximal forward-backward splitting”, SIAM Journal on Multiscale Modeling & Simulation, vol 4, no. 4, pp 1168-1200, (2005).
[2] Hedy Attouch, Jerome Bolte and Benar Fux Svaiter, “Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods.” Mathematical Programming, 137 (2013).
[3] Amir Beck and Marc Teboulle, “A Fast Iterative Shrinkage-Thresholding Algorithm for Linear inverse Problems”, SIAM Journal on Imaging Science, vol 2, no. 1, pp 182-202 (2009)
[4] Brendan O’donoghue and Emmanuel Candès. 2015. Adaptive Restart for Accelerated Gradient Schemes. Found. Comput. Math. 15, 3 (June 2015), 715-732.
Example FBS=OptiFBS(F,G)
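A hypothetical sparse-deconvolution sketch illustrating the FISTA option; the data y, the forward LinOp H, the weight mu, the constructors CostL2 and CostL1, the Cost/LinOp composition syntax and the inherited Opti interface are assumed.
% Hypothetical setup: minimize 0.5*||Hx - y||^2 + mu*||x||_1 with FISTA.
F = CostL2(H.sizeout, y) * H;    % differentiable term (assumed composition syntax)
G = mu * CostL1(H.sizein);       % proximable term
FBS = OptiFBS(F, G);
FBS.fista = true;                % accelerated version [3]
FBS.gam   = 1;                   % only needed if F.lip is unknown (== -1), see the note above
FBS.maxiter = 300;               % inherited Opti attribute (assumed)
FBS.run(zeros(H.sizein));        % result expected in FBS.xopt (assumed)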
See also Opti, OutputOpti, Cost
OptiFGP
- class Opti.OptiFGP
Bases: Opti
Fast Gradient Proximal algorithm which computes the TV proximity operator, i.e. it minimizes a Cost of the form $$ C(\mathrm{x}) = \frac12\|\mathrm{x} - \mathrm{y}\|^2_2 + \lambda \|\mathrm{x} \|_{TV} $$
- Parameters:
F_0 – CostL2 object
TV – CostTV object
bounds – bounds for the set constraint
gam – descent step (default 1/8)
lambda – regularization parameter for the TV term
All attributes of parent class Opti are inherited.
References
[1] Beck, A., and Teboulle, M. (2009). Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems. IEEE Transactions on Image Processing, 18(11), 2419-2434.
Example FGP=OptiFGP(F0,TV,bounds)
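A hypothetical TV-denoising sketch; the noisy image y, the constructors CostL2 and CostTV, the bounds convention and the inherited Opti interface are assumed.
% Hypothetical setup: TV denoising of y with a nonnegativity constraint.
sz  = size(y);
F0  = CostL2(sz, y);                 % 0.5*||x - y||^2 term
TV  = CostTV(sz);                    % TV term (assumed constructor)
FGP = OptiFGP(F0, TV, [0, inf]);     % bounds assumed to encode x >= 0
FGP.lambda  = 0.05;                  % TV regularization parameter (assumed attribute)
FGP.maxiter = 100;                   % inherited Opti attribute (assumed)
FGP.run(y);                          % warm start from the noisy image; result expected in FGP.xopt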
See also Opti, OutputOpti, Cost
OptiGradDsct
- class Opti.OptiGradDsct
Bases: Opti
Gradient Descent optimization algorithm to minimize a differentiable Cost \(C(\mathrm{x})\)
- Parameters:
C – a differentiable Cost (i.e. with an implementation of applyGrad())
gam – descent step
nagd – boolean (default false) to activate the Nesterov accelerated gradient descent
All attributes of parent class Opti are inherited.
Note If the cost \(C\) is gradient-Lipschitz, convergence is ensured by taking \(\gamma \in (0,2/L]\) where \(L\) is the Lipschitz constant of \(\nabla C\) (see [1]). The optimal choice is \(\gamma = 1/L\) (see [1]). If \(L\) is known (i.e. C.lip is different from -1), the parameter \(\gamma\) is automatically set to \(1/L\).
Reference
[1] Nesterov, Yurii. “Introductory lectures on convex programming.” Lecture Notes (1998): 119-120.
Example GD=OptiGradDsct(F)
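A hypothetical least-squares sketch with Nesterov acceleration; the data y, the forward LinOp H, the CostL2 constructor, the composition syntax and the inherited Opti interface are assumed.
% Hypothetical setup: minimize 0.5*||Hx - y||^2 by accelerated gradient descent.
C  = CostL2(H.sizeout, y) * H;   % differentiable Cost (assumed composition syntax)
GD = OptiGradDsct(C);
GD.nagd = true;                  % Nesterov accelerated gradient descent
GD.gam  = 1;                     % only needed if the Lipschitz constant is unknown, see the note above
GD.maxiter = 300;                % inherited Opti attribute (assumed)
GD.run(zeros(H.sizein));         % result expected in GD.xopt (assumed)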
See also Opti, OutputOpti, Cost
OptiPrimalDualCondat
- class Opti.OptiPrimalDualCondat
Bases: Opti
Primal-Dual algorithm proposed by L. Condat in [1] which minimizes a Cost of the form $$ C(\mathrm{x})= F_0(\mathrm{x}) + G(\mathrm{x}) + \sum_n F_n(\mathrm{H_nx}) $$
- Parameters:
F_0 – a differentiable Cost (i.e. with an implementation of grad())
G – a Cost with an implementation of the prox()
F_n – cell of N Cost with an implementation of the prox() for each one
H_n – cell of N LinOp
tau – parameter of the algorithm (see the note below)
sig – parameter of the algorithm (see the note below)
rho – parameter of the algorithm (see the note below)
All attributes of parent class Opti are inherited.
Note:
When \(F_0=0\), parameters sig and tau have to verify $$ \sigma \times \tau \Vert \sum_n \mathrm{H_n^*H_n} \Vert \leq 1 $$ and \(\rho \in ]0,2[\), to ensure convergence (see [1, Theorem 5.3]).
Otherwise, when \(F_0\neq 0\), parameters sig and tau have to verify $$ \frac{1}{\tau} - \sigma \times \Vert \sum_n \mathrm{H_n^*H_n} \Vert \geq \frac{\beta}{2} $$ where \(\beta\) is the Lipschitz constant of \(\nabla F_0\), and we need \(\rho \in ]0,\delta[\) with $$ \delta = 2 - \frac{\beta}{2}\times\left(\frac{1}{\tau} - \sigma \times \Vert \sum_n \mathrm{H_n^*H_n} \Vert\right)^{-1} \in [1,2[ $$ to ensure convergence (see [1, Theorem 5.1]).
Reference
[1] Laurent Condat, “A Primal-Dual Splitting Method for Convex Optimization Involving Lipschitzian, Proximable and Linear Composite Terms”, Journal of Optimization Theory and Applications, vol 158, no 2, pp 460-479 (2013).
Example A=OptiPrimalDualCondat(F0,G,Fn,Hn)
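A hypothetical sketch with a smooth data term, a nonnegativity constraint and a TV-like penalty; the data y, the forward LinOp H, the weight mu, the constructors CostL2, CostNonNeg, CostL1 and LinOpGrad, and the inherited Opti interface are assumed.
% Hypothetical setup: minimize 0.5*||Hx - y||^2 + i_{>=0}(x) + mu*||Gx||_1.
F0 = CostL2(H.sizeout, y) * H;      % differentiable term (assumed composition syntax)
G  = CostNonNeg(H.sizein);          % proximable term
Gr = LinOpGrad(H.sizein);
Fn = {mu * CostL1(Gr.sizeout)};     % cell of Costs with an implemented prox
Hn = {Gr};                          % cell of LinOps
PD = OptiPrimalDualCondat(F0, G, Fn, Hn);
PD.tau = 1e-2;  PD.sig = 1e-2;  PD.rho = 1.5;   % to be chosen according to the note above
PD.maxiter = 500;                   % inherited Opti attribute (assumed)
PD.run(zeros(H.sizein));            % result expected in PD.xopt (assumed)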
See also Opti, OutputOpti, Cost
OptiRichLucy
- class Opti.OptiRichLucy
Bases: Opti
Richardson-Lucy algorithm [1,2] which minimizes the Kullback-Leibler divergence CostKullLeib (possibly with TV regularization [3]): $$ C(\mathrm{x})= F(\mathrm{x}) + \lambda \Vert \mathrm{x} \Vert_{\mathrm{TV}} $$
- Parameters:
F – CostKullLeib object or a CostComposition with a CostKullLeib and a LinOp
TV – boolean, true if TV regularization is used (default false)
lambda – regularization parameter (when TV is used)
epsl – smoothing parameter to make the TV differentiable at 0 (default \(10^{-6}\))
All attributes of parent class Opti are inherited.
Note An interesting property of this algorithm is that it ensures the positivity of the solution from any positive initialization. However, when TV is used, the positivity of the iterates is no longer ensured if \(\lambda\) is too large. Hence, \(\lambda\) needs to be chosen carefully.
References
[1] Lucy, Leon B. “An iterative technique for the rectification of observed distributions” The astronomical journal (1974)
[2] Richardson, William Hadley. “Bayesian-based iterative method of image restoration.” JOSA (1972): 55-59.
[3] N. Dey et al. “Richardson-Lucy Algorithm With Total Variation Regularization for 3D Confocal Microscope Deconvolution.” Microscopy research and technique (2006).
Example RL=OptiRichLucy(F,TV,lamb)
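A hypothetical deconvolution sketch for Poisson-distributed data; the data y, the forward LinOp H, the CostKullLeib constructor, the composition syntax and the inherited Opti interface are assumed.
% Hypothetical setup: Richardson-Lucy deconvolution with TV regularization.
F  = CostKullLeib(H.sizeout, y) * H;   % KL divergence composed with the forward model
RL = OptiRichLucy(F, true, 1e-2);      % TV regularization enabled, lambda = 1e-2
RL.maxiter = 100;                      % inherited Opti attribute (assumed)
RL.run(ones(H.sizein));                % positive initialization keeps the iterates positive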
See also Opti, OutputOpti, Cost, CostKullLeib
OptiVMLMB
- class Opti.OptiVMLMB
Bases: Opti
Variable Metric Limited Memory Bounded (VMLMB) algorithm from OptimPackLegacy [1]. This algorithm minimizes a differentiable cost \(C(\mathrm{x})\), possibly with bound constraints and/or preconditioning.
- Parameters:
C – minimized cost
xmin – min bound (optional)
xmax – max bound (optional)
All attributes of parent class Opti are inherited.
Note This optimizer has many other parameters that are set by default to reasonable values. See the function m_vmlmb_first.m in the MatlabOptimPack folder for more details.
Reference
[1] Eric Thiebaut, “Optimization issues in blind deconvolution algorithms”, SPIE Conf. Astronomical Data Analysis II, 4847, 174-183 (2002). See OptimPackLegacy repository.
Example VMLMB=OptiVMLMB(C,xmin,xmax)
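A hypothetical nonnegative least-squares sketch; the data y, the forward LinOp H, the CostL2 constructor, the composition syntax, the convention that an empty bound means no bound, and the inherited Opti interface are assumed.
% Hypothetical setup: minimize 0.5*||Hx - y||^2 subject to x >= 0.
C  = CostL2(H.sizeout, y) * H;     % differentiable Cost (assumed composition syntax)
VM = OptiVMLMB(C, 0, []);          % xmin = 0, xmax empty (assumed to mean no upper bound)
VM.maxiter = 100;                  % inherited Opti attribute (assumed)
VM.run(zeros(H.sizein));           % result expected in VM.xopt (assumed)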
See also Opti, OptiConjGrad, OutputOpti, Cost
OutputOpti
OutputOpti (Default)
- class Opti.OutputOpti.OutputOpti
Bases: matlab.mixin.Copyable
OutputOpti class for algorithm displaying and saving
Every ItUpOut iterations of an optimization algorithm (see the Opti generic class), the update method of an OutputOpti object is executed in order to perform user-defined computations, e.g.,
compute cost / SNR
store the current iterate / cost value
plot/display information
The present generic class implements a basic update method that:
displays the iteration number
computes & displays the cost (if activated)
computes & displays the SNR if a ground truth is provided
- Parameters:
name – name of the OutputOpti object
computecost – boolean, if true the cost function will be computed
evolcost – array saving the evolution of the cost function
saveXopt – boolean (default false) to save the evolution of the optimized variable xopt
evolxopt – cell saving the evolution of the optimization variable xopt
iterVerb – a message will be displayed every iterVerb iterations (must be a multiple of the ItUpOut parameter of Opti classes)
costIndex – selects a specific cost function among a sum in the case where the optimized cost function is a sum of cost functions
Example OutOpti=OutputOpti(computecost,iterVerb,costIndex)
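A hypothetical sketch showing how an OutputOpti object might be attached to an optimizer; the OutOp and ItUpOut attributes of Opti, the evolcost record and the run() interface are assumed from the descriptions above.
% Hypothetical setup: compute and display the cost every 10 iterations of an FBS run.
FBS = OptiFBS(F, G);                  % any Opti subclass is used the same way
FBS.OutOp   = OutputOpti(true, 10);   % computecost = true, iterVerb = 10
FBS.ItUpOut = 10;                     % call the update method every 10 iterations
FBS.run(x0);
% After the run, FBS.OutOp.evolcost is expected to contain the recorded cost values.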
Important The update method should have a unique input, namely the Opti object, in order to be generic for all optimization routines. Hence the update method has access (in read mode) to all the properties of Opti objects.
See also Opti
OutputOptiConjGrad
- class Opti.OutputOpti.OutputOptiConjGrad
Bases: Opti.OutputOpti.OutputOptiSNR
OutputOptiConjGrad class for displaying and saving, dedicated to OptiConjGrad
The conjugate gradient algorithm minimizes the function $$ C(\mathrm{x})= \frac12 \mathrm{x^TAx - b^Tx} $$ However, in many cases it is used to minimize $$ F(\mathrm{x})= \frac12 \|\mathrm{Hx - y}\|^2_W $$ by setting $$ \mathrm{A} = \mathrm{H^T W H} \quad \text{and}\quad \mathrm{b = H^T W y} $$ An OutputOptiConjGrad object computes the cost F instead of the cost C.
- Parameters:
computecost – boolean, if true the cost function will be computed
xtrue – ground truth to compute the error with the solution (if provided)
iterVerb – a message will be displayed every iterVerb iterations (must be a multiple of the ItUpOut parameter of Opti classes)
ytWy – weighted norm of \(\mathrm{y}\): \(\mathrm{ytWy} = \mathrm{y^T\,W\,y}\)
See also OptiConjGrad, OutputOpti
TestCvg
TestCvg (Default)
- class Opti.TestCvg.TestCvg
Bases: matlab.mixin.Copyable
TestCvg class monitors a convergence criterion during optimization
At each iteration of an optimization algorithm (see the Opti generic class), the testConvergence() method of a TestCvg object is executed in order to perform user-defined computations.
Example CvOpti=TestCvg()
Important The testConvergence() method should have a unique input, namely the Opti object, in order to be generic for all optimization routines. Hence the method has access (in read mode) to all the properties of Opti objects.
See also Opti
- Method Summary
- testConvergence(this, opti)
Default implementation: do nothing (the algorithm stops when the maximum number of iterations is reached).
TestCvgCombine
- class Opti.TestCvg.TestCvgCombine
Bases: Opti.TestCvg.TestCvg
TestCvgCombine combines several TestCvg objects.
Examples
CvOpti = TestCvgCombine(A, B, C); where A, B and C are objects of the TestCvg class
CvOpti = TestCvgCombine('CostRelative', 0.000001, 'CostAbsolute', 10); for simple tests
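A hypothetical sketch showing how a combined convergence test might be attached to an optimizer; the CvOp attribute of Opti and the single-argument constructors used below are assumed.
% Hypothetical setup: stop when the relative cost variation or the relative step becomes small.
GD = OptiGradDsct(C);                 % any Opti subclass is used the same way
GD.CvOp = TestCvgCombine(TestCvgCostRelative(1e-5), TestCvgStepRelative(1e-6));
GD.maxiter = 1000;                    % upper bound in case the test never triggers
GD.run(x0);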
See also TestCvg
TestCvgCostAbsolute
- class Opti.TestCvg.TestCvgCostAbsolute
Bases: Opti.TestCvg.TestCvg
TestCvgCostAbsolute stops the optimization when the cost function is below the value COSTABSOLUTETOL
- Parameters:
costAbsoluteTol – absolute tolerance on cost function
costIndex – select a specific cost function among a sum in the case where the optimized cost function is a sum of cost functions
Example CvOpti=TestCvgCostAbsolute(costAbsoluteTol, costIndex )
See also TestCvg
- Method Summary
- testConvergence(this, opti)
Tests algorithm convergence
- Returns:
boolean true if \( C(\mathrm{x^k}) < \mathrm{costAbsoluteTol}\)
TestCvgCostRelative
- class Opti.TestCvg.TestCvgCostRelative
Bases: Opti.TestCvg.TestCvg
TestCvgCostRelative stops the optimization when the relative variation of the cost function between two successive iterates is below the value COSTRELATIVETOL
- Parameters:
costRelativeTol – relative tolerance on cost function
costIndex – select a specific cost function among a sum in the case where the optimized cost function is a sum of cost functions
Example CvOpti=TestCvgCostRelative(costRelativeTol, costIndex )
See also TestCvg
- Method Summary
- testConvergence(this, opti)
Tests algorithm convergence from the relative difference between two successive values of the cost function
- Returns:
boolean true if $$ \frac{\left| C(\mathrm{x}^{k}) - C(\mathrm{x}^{k-1})\right|}{\left|C(\mathrm{x}^{k-1})\right|} < \mathrm{costRelativeTol}$$
TestCvgStepRelative
- class Opti.TestCvg.TestCvgStepRelative
Bases: Opti.TestCvg.TestCvg
TestCvgStepRelative stops the optimization when the relative step between two successive iterates is below the value STEPRELATIVETOL
- Parameters:
stepRelativeTol – relative tolerance on step
Example CvOpti=TestCvgStepRelative(stepRelativeTol )
See also TestCvg
- Method Summary
- testConvergence(this, opti)
Tests algorithm convergence from the relative difference between two successive iterates
- Returns:
boolean true if $$ \frac{\| \mathrm{x}^{k} - \mathrm{x}^{k-1}\|}{\|\mathrm{x}^{k-1}\|} < \text{stepRelativeTol}.$$
TestCvgMaxSnr
- class Opti.TestCvg.TestCvgMaxSnr
Bases: Opti.TestCvg.TestCvg
TestCvgMaxSnr stops the optimization when the SNR is decreasing
- Parameters:
ref – reference signal
Example CvOpti=TestCvgMaxSnr(ref )
See also TestCvg
- Method Summary
- testConvergence(this, opti)
Tests algorithm convergence using the SNR with respect to the reference signal ref
- Returns:
boolean true if $$ 20\log\left(\frac{\| \mathrm{ref}\|}{\| \mathrm{x}^{k} - \mathrm{ref}\|}\right)< 20\log\left(\frac{\| \mathrm{ref}\|}{\| \mathrm{x}^{k-1} - \mathrm{ref}\|}\right)$$
TestCvgADMM
- class Opti.TestCvg.TestCvgADMM
Bases: Opti.TestCvg.TestCvg
Tests convergence by monitoring the primal and dual residuals in ADMM, as described in [1].
- Parameters:
eps_abs – absolute tolerance (default 0, see [1])
eps_rel – relative tolerance (default 1e-3, see [1])
Reference
[1] Boyd, Stephen, et al. “Distributed optimization and statistical learning via the alternating direction method of multipliers.” Foundations and Trends in Machine Learning, 2011.
Warning The termination criterion described in [1] requires applying the adjoint of Hn at every iteration, which may be costly.
Example CvOpti=TestCvgADMM(eps_abs,eps_rel)
See also TestCvg