Commit f58002a
Regret factor in ForwardBackward (#91)
This PR adds the possibility of specifying a regret factor that increases the stepsize gamma at every iteration, if adaptive, before backtracking. Recent results provide convergence guarantees even without a maximum value on gamma [Theorem 14, [arXiv:2208.00799v2](https://arxiv.org/abs/2208.00799)]. The feature is implemented for ForwardBackward only: it does not seem a good fit for methods such as ZeroFPR, PANOC, and PANOCplus, at least when based on quasi-Newton directions, and it is unclear whether the FastForwardBackward solver could benefit from it. See the discussion in #86. Practical performance seems to improve with values of `regret_gamma` close to 1. Tests and references have been updated accordingly.
1 parent f093263 commit f58002a
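
As a usage sketch (the problem data below is illustrative and not part of this commit; it assumes a ProximalOperators function with a gradient oracle the solver can consume), the regret factor is enabled by passing `regret_gamma` alongside `adaptive = true`:

```julia
using ProximalAlgorithms
using ProximalOperators  # assumed here to supply f and g

# Illustrative least-squares-plus-l1 instance: f(x) = 1/2 ||Ax - b||^2, g = 0.1 ||x||_1.
A = [1.0 2.0; 3.0 4.0; 5.0 6.0]
b = [1.0, 2.0, 3.0]
f = LeastSquares(A, b)
g = NormL1(0.1)

# With adaptive = true, gamma is found by backtracking; regret_gamma > 1 lets it
# grow again at every iteration before the backtracking check.
solver = ProximalAlgorithms.ForwardBackward(adaptive = true, regret_gamma = 1.01, tol = 1e-6)
x, it = solver(x0 = zeros(2), f = f, g = g)
```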

File tree

7 files changed: +145 −49 lines changed

docs/references.bib

Lines changed: 8 additions & 0 deletions
@@ -185,3 +185,11 @@ @article{DeMarchi2022
 year={2022},
 url={https://doi.org/10.48550/arXiv.2112.13000}
 }
+
+@article{DeMarchi2024,
+title={An interior proximal gradient method for nonconvex optimization},
+author={De Marchi, Alberto and Themelis, Andreas},
+journal={arXiv:2208.00799v2},
+year={2024},
+url={https://doi.org/10.48550/arXiv.2208.00799}
+}

docs/src/guide/implemented_algorithms.md

Lines changed: 1 addition & 1 deletion
@@ -24,7 +24,7 @@ This is the most popular model, by far the most thoroughly studied, and an abund
 
 Algorithm | Assumptions | Oracle | Implementation | References
 ----------|-------------|--------|----------------|-----------
-Proximal gradient | ``f`` smooth | ``\nabla f``, ``\operatorname{prox}_{\gamma g}`` | [`ForwardBackward`](@ref) | [Lions1979](@cite)
+Proximal gradient | ``f`` locally smooth | ``\nabla f``, ``\operatorname{prox}_{\gamma g}`` | [`ForwardBackward`](@ref) | [Lions1979](@cite), [DeMarchi2024](@cite)
 Douglas-Rachford | | ``\operatorname{prox}_{\gamma f}``, ``\operatorname{prox}_{\gamma g}`` | [`DouglasRachford`](@ref) | [Eckstein1992](@cite)
 Fast proximal gradient | ``f`` convex, smooth, ``g`` convex | ``\nabla f``, ``\operatorname{prox}_{\gamma g}`` | [`FastForwardBackward`](@ref) | [Tseng2008](@cite), [Beck2009](@cite)
 PANOC | ``f`` smooth | ``\nabla f``, ``\operatorname{prox}_{\gamma g}`` | [`PANOC`](@ref) | [Stella2017](@cite)

src/algorithms/fast_forward_backward.jl

Lines changed: 3 additions & 0 deletions
@@ -33,6 +33,7 @@ See also: [`FastForwardBackward`](@ref).
 - `gamma=nothing`: stepsize, defaults to `1/Lf` if `Lf` is set, and `nothing` otherwise.
 - `adaptive=true`: makes `gamma` adaptively adjust during the iterations; this is by default `gamma === nothing`.
 - `minimum_gamma=1e-7`: lower bound to `gamma` in case `adaptive == true`.
+- `regret_gamma=1.0`: factor to enlarge `gamma` in case `adaptive == true`, before backtracking.
 - `extrapolation_sequence=nothing`: sequence (iterator) of extrapolation coefficients to use for acceleration.
 
 # References

@@ -48,6 +49,7 @@ Base.@kwdef struct FastForwardBackwardIteration{R,Tx,Tf,Tg,TLf,Tgamma,Textr}
     gamma::Tgamma = Lf === nothing ? nothing : (1 / Lf)
     adaptive::Bool = gamma === nothing
     minimum_gamma::R = real(eltype(x0))(1e-7)
+    regret_gamma::R = real(eltype(x0))(1.0)
     extrapolation_sequence::Textr = nothing
 end
 

@@ -105,6 +107,7 @@ function Base.iterate(
     state::FastForwardBackwardState{R,Tx},
 ) where {R,Tx}
     state.gamma = if iter.adaptive == true
+        state.gamma *= iter.regret_gamma
         gamma, state.g_z = backtrack_stepsize!(
             state.gamma,
             iter.f,
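
The same keyword lands in `FastForwardBackward`; per the commit message it is unclear how much the accelerated method gains from it, but the construction is analogous (a sketch, with illustrative parameter values):

```julia
using ProximalAlgorithms

# regret_gamma is only consulted when adaptive == true; the default of 1.0
# leaves the usual backtracking behavior unchanged.
solver = ProximalAlgorithms.FastForwardBackward(adaptive = true, regret_gamma = 1.01, minimum_gamma = 1e-7)
```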

src/algorithms/forward_backward.jl

Lines changed: 5 additions & 0 deletions
@@ -28,9 +28,11 @@ See also: [`ForwardBackward`](@ref).
 - `gamma=nothing`: stepsize to use, defaults to `1/Lf` if not set (but `Lf` is).
 - `adaptive=false`: forces the method stepsize to be adaptively adjusted.
 - `minimum_gamma=1e-7`: lower bound to `gamma` in case `adaptive == true`.
+- `regret_gamma=1.0`: factor to enlarge `gamma` in case `adaptive == true`, before backtracking.
 
 # References
 1. Lions, Mercier, “Splitting algorithms for the sum of two nonlinear operators,” SIAM Journal on Numerical Analysis, vol. 16, pp. 964–979 (1979).
+2. De Marchi, Themelis, "An interior proximal gradient method for nonconvex optimization," arXiv:2208.00799v2 (2024).
 """
 Base.@kwdef struct ForwardBackwardIteration{R,Tx,Tf,Tg,TLf,Tgamma}
     f::Tf = Zero()

@@ -40,6 +42,7 @@ Base.@kwdef struct ForwardBackwardIteration{R,Tx,Tf,Tg,TLf,Tgamma}
     gamma::Tgamma = Lf === nothing ? nothing : (1 / Lf)
     adaptive::Bool = gamma === nothing
     minimum_gamma::R = real(eltype(x0))(1e-7)
+    regret_gamma::R = real(eltype(x0))(1.0)
 end
 
 Base.IteratorSize(::Type{<:ForwardBackwardIteration}) = Base.IsInfinite()

@@ -84,6 +87,7 @@ function Base.iterate(
     state::ForwardBackwardState{R,Tx},
 ) where {R,Tx}
     if iter.adaptive == true
+        state.gamma *= iter.regret_gamma
         state.gamma, state.g_z, state.f_x = backtrack_stepsize!(
             state.gamma,
             iter.f,

@@ -150,6 +154,7 @@ See also: [`ForwardBackwardIteration`](@ref), [`IterativeAlgorithm`](@ref).
 
 # References
 1. Lions, Mercier, “Splitting algorithms for the sum of two nonlinear operators,” SIAM Journal on Numerical Analysis, vol. 16, pp. 964–979 (1979).
+2. De Marchi, Themelis, "An interior proximal gradient method for nonconvex optimization," arXiv:2208.00799v2 (2024).
 """
 ForwardBackward(;
     maxit = 10_000,
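
The iterate hunks above implement an enlarge-then-backtrack rule: `gamma` is first multiplied by the regret factor, then reduced by `backtrack_stepsize!` until a sufficient-decrease condition holds. A minimal self-contained sketch of that rule follows; the quadratic upper-bound test and the halving factor are illustrative assumptions, not the package's `backtrack_stepsize!`, and all names are hypothetical:

```julia
using LinearAlgebra

# One adaptive-stepsize update in isolation: enlarge gamma by the regret
# factor, then halve it until the descent-lemma bound holds at the
# forward-backward point z.
function regret_backtrack(f, grad_f, prox_g, x, gamma; regret_gamma = 1.01, minimum_gamma = 1e-7)
    gamma *= regret_gamma                  # regret step: try a larger stepsize first
    fx, gx = f(x), grad_f(x)
    while gamma > minimum_gamma
        z = prox_g(x - gamma * gx, gamma)  # forward-backward step with current gamma
        # accept gamma once f(z) <= f(x) + <grad f(x), z - x> + ||z - x||^2 / (2 gamma)
        if f(z) <= fx + dot(gx, z - x) + norm(z - x)^2 / (2 * gamma)
            return gamma, z
        end
        gamma /= 2                         # backtrack
    end
    return gamma, prox_g(x - gamma * gx, gamma)
end

# Tiny check with f(x) = ||x||^2 / 2 (so grad f = identity) and g = 0:
gamma, z = regret_backtrack(x -> norm(x)^2 / 2, x -> x, (y, _) -> y, [1.0, -2.0], 1.0)
```

With `regret_gamma` close to 1, a stepsize rejected by backtracking costs little extra work, while a too-conservative `gamma` inherited from earlier iterations can recover; this is the trade-off the commit message refers to.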

test/problems/test_lasso_small.jl

Lines changed: 22 additions & 0 deletions
@@ -65,6 +65,17 @@ using ProximalAlgorithms:
     @test x0 == x0_backup
 end
 
+@testset "ForwardBackward (adaptive step, regret)" begin
+    x0 = zeros(T, n)
+    x0_backup = copy(x0)
+    solver = ProximalAlgorithms.ForwardBackward(tol = TOL, adaptive = true, regret_gamma=R(1.01))
+    x, it = @inferred solver(x0 = x0, f = fA_autodiff, g = g)
+    @test eltype(x) == T
+    @test norm(x - x_star, Inf) <= TOL
+    @test it < 150
+    @test x0 == x0_backup
+end
+
 @testset "FastForwardBackward (fixed step)" begin
     x0 = zeros(T, n)
     x0_backup = copy(x0)

@@ -87,6 +98,17 @@ using ProximalAlgorithms:
     @test x0 == x0_backup
 end
 
+@testset "FastForwardBackward (adaptive step, regret)" begin
+    x0 = zeros(T, n)
+    x0_backup = copy(x0)
+    solver = ProximalAlgorithms.FastForwardBackward(tol = TOL, adaptive = true, regret_gamma=R(1.01))
+    x, it = @inferred solver(x0 = x0, f = fA_autodiff, g = g)
+    @test eltype(x) == T
+    @test norm(x - x_star, Inf) <= TOL
+    @test it < 100
+    @test x0 == x0_backup
+end
+
 @testset "FastForwardBackward (custom extrapolation)" begin
     x0 = zeros(T, n)
     x0_backup = copy(x0)

test/problems/test_lasso_small_strongly_convex.jl

Lines changed: 36 additions & 0 deletions
@@ -70,6 +70,24 @@ using ProximalAlgorithms
     @test it < 110
     @test x0 == x0_backup
 end
+
+@testset "ForwardBackward (adaptive step)" begin
+    solver = ProximalAlgorithms.ForwardBackward(tol = TOL, adaptive = true)
+    y, it = solver(x0 = x0, f = fA_autodiff, g = g)
+    @test eltype(y) == T
+    @test norm(y - x_star, Inf) <= TOL
+    @test it < 300
+    @test x0 == x0_backup
+end
+
+@testset "ForwardBackward (adaptive step, regret)" begin
+    solver = ProximalAlgorithms.ForwardBackward(tol = TOL, adaptive = true, regret_gamma=T(1.01))
+    y, it = solver(x0 = x0, f = fA_autodiff, g = g)
+    @test eltype(y) == T
+    @test norm(y - x_star, Inf) <= TOL
+    @test it < 80
+    @test x0 == x0_backup
+end
 
 @testset "FastForwardBackward" begin
     solver = ProximalAlgorithms.FastForwardBackward(tol = TOL)

@@ -80,6 +98,24 @@ using ProximalAlgorithms
     @test x0 == x0_backup
 end
 
+@testset "FastForwardBackward (adaptive step)" begin
+    solver = ProximalAlgorithms.FastForwardBackward(tol = TOL, adaptive = true)
+    y, it = solver(x0 = x0, f = fA_autodiff, g = g)
+    @test eltype(y) == T
+    @test norm(y - x_star, Inf) <= TOL
+    @test it < 100
+    @test x0 == x0_backup
+end
+
+@testset "FastForwardBackward (adaptive step, regret)" begin
+    solver = ProximalAlgorithms.FastForwardBackward(tol = TOL, adaptive = true, regret_gamma=T(1.01))
+    y, it = solver(x0 = x0, f = fA_autodiff, g = g)
+    @test eltype(y) == T
+    @test norm(y - x_star, Inf) <= TOL
+    @test it < 100
+    @test x0 == x0_backup
+end
+
 @testset "FastForwardBackward (custom extrapolation)" begin
     solver = ProximalAlgorithms.FastForwardBackward(tol = TOL)
     y, it = solver(

test/problems/test_sparse_logistic_small.jl

Lines changed: 70 additions & 48 deletions
@@ -35,59 +35,81 @@ using LinearAlgebra
 
 TOL = R(1e-6)
 
-# Nonfast/Adaptive
-
-x0 = zeros(T, n)
-x0_backup = copy(x0)
-solver = ProximalAlgorithms.ForwardBackward(tol = TOL, adaptive = true)
-x, it = solver(x0 = x0, f = fA_autodiff, g = g)
-@test eltype(x) == T
-@test norm(x - x_star, Inf) <= 1e-4
-@test it < 1100
-@test x0 == x0_backup
-
-# Fast/Adaptive
-
-x0 = zeros(T, n)
-x0_backup = copy(x0)
-solver = ProximalAlgorithms.FastForwardBackward(tol = TOL, adaptive = true)
-x, it = solver(x0 = x0, f = fA_autodiff, g = g)
-@test eltype(x) == T
-@test norm(x - x_star, Inf) <= 1e-4
-@test it < 500
-@test x0 == x0_backup
+@testset "ForwardBackward (adaptive step)" begin
+    x0 = zeros(T, n)
+    x0_backup = copy(x0)
+    solver = ProximalAlgorithms.ForwardBackward(tol = TOL, adaptive = true)
+    x, it = solver(x0 = x0, f = fA_autodiff, g = g)
+    @test eltype(x) == T
+    @test norm(x - x_star, Inf) <= 1e-4
+    @test it < 1100
+    @test x0 == x0_backup
+end
 
-# ZeroFPR/Adaptive
+@testset "ForwardBackward (adaptive step, regret)" begin
+    x0 = zeros(T, n)
+    x0_backup = copy(x0)
+    solver = ProximalAlgorithms.ForwardBackward(tol = TOL, adaptive = true, regret_gamma=R(1.01))
+    x, it = solver(x0 = x0, f = fA_autodiff, g = g)
+    @test eltype(x) == T
+    @test norm(x - x_star, Inf) <= 1e-4
+    @test it < 500
+    @test x0 == x0_backup
+end
 
-x0 = zeros(T, n)
-x0_backup = copy(x0)
-solver = ProximalAlgorithms.ZeroFPR(adaptive = true, tol = TOL)
-x, it = solver(x0 = x0, f = f_autodiff, A = A, g = g)
-@test eltype(x) == T
-@test norm(x - x_star, Inf) <= 1e-4
-@test it < 25
-@test x0 == x0_backup
+@testset "FastForwardBackward (adaptive step)" begin
+    x0 = zeros(T, n)
+    x0_backup = copy(x0)
+    solver = ProximalAlgorithms.FastForwardBackward(tol = TOL, adaptive = true)
+    x, it = solver(x0 = x0, f = fA_autodiff, g = g)
+    @test eltype(x) == T
+    @test norm(x - x_star, Inf) <= 1e-4
+    @test it < 500
+    @test x0 == x0_backup
+end
 
-# PANOC/Adaptive
+@testset "FastForwardBackward (adaptive step, regret)" begin
+    x0 = zeros(T, n)
+    x0_backup = copy(x0)
+    solver = ProximalAlgorithms.FastForwardBackward(tol = TOL, adaptive = true, regret_gamma=R(1.01))
+    x, it = solver(x0 = x0, f = fA_autodiff, g = g)
+    @test eltype(x) == T
+    @test norm(x - x_star, Inf) <= 1e-4
+    @test it < 200
+    @test x0 == x0_backup
+end
 
-x0 = zeros(T, n)
-x0_backup = copy(x0)
-solver = ProximalAlgorithms.PANOC(adaptive = true, tol = TOL)
-x, it = solver(x0 = x0, f = f_autodiff, A = A, g = g)
-@test eltype(x) == T
-@test norm(x - x_star, Inf) <= 1e-4
-@test it < 50
-@test x0 == x0_backup
+@testset "ZeroFPR (adaptive step)" begin
+    x0 = zeros(T, n)
+    x0_backup = copy(x0)
+    solver = ProximalAlgorithms.ZeroFPR(adaptive = true, tol = TOL)
+    x, it = solver(x0 = x0, f = f_autodiff, A = A, g = g)
+    @test eltype(x) == T
+    @test norm(x - x_star, Inf) <= 1e-4
+    @test it < 25
+    @test x0 == x0_backup
+end
 
-# PANOCplus/Adaptive
+@testset "PANOC (adaptive step)" begin
+    x0 = zeros(T, n)
+    x0_backup = copy(x0)
+    solver = ProximalAlgorithms.PANOC(adaptive = true, tol = TOL)
+    x, it = solver(x0 = x0, f = f_autodiff, A = A, g = g)
+    @test eltype(x) == T
+    @test norm(x - x_star, Inf) <= 1e-4
+    @test it < 50
+    @test x0 == x0_backup
+end
 
-x0 = zeros(T, n)
-x0_backup = copy(x0)
-solver = ProximalAlgorithms.PANOCplus(adaptive = true, tol = TOL)
-x, it = solver(x0 = x0, f = f_autodiff, A = A, g = g)
-@test eltype(x) == T
-@test norm(x - x_star, Inf) <= 1e-4
-@test it < 50
-@test x0 == x0_backup
+@testset "PANOCplus (adaptive step)" begin
+    x0 = zeros(T, n)
+    x0_backup = copy(x0)
+    solver = ProximalAlgorithms.PANOCplus(adaptive = true, tol = TOL)
+    x, it = solver(x0 = x0, f = f_autodiff, A = A, g = g)
+    @test eltype(x) == T
+    @test norm(x - x_star, Inf) <= 1e-4
+    @test it < 50
+    @test x0 == x0_backup
+end
 
 end
