Avoid mutation on user x #280

Merged: 1 commit, Dec 29, 2023
20 changes: 12 additions & 8 deletions README.md
@@ -282,7 +282,7 @@ H = numauto_color_hessian(f, x, colorvec, sparsity)
numauto_color_hessian!(H, f, x, colorvec, sparsity)
```

To avoid unnecessary allocations every time the Hessian is computed,
construct a `ForwardColorHesCache` beforehand:

```julia
@@ -292,7 +292,7 @@ numauto_color_hessian!(H, f, x, hescache)

By default, these methods use a mix of numerical and automatic differentiation,
namely by taking finite differences of gradients calculated via ForwardDiff.jl.
Alternatively, if you have your own custom gradient function `g!`, you can specify
it as an argument to `ForwardColorHesCache`:

```julia
@@ -301,7 +301,6 @@ hescache = ForwardColorHesCache(f, x, colorvec, sparsity, g!)
Note that any user-defined gradient needs to have the signature `g!(G, x)`,
i.e. updating the gradient `G` in place.
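
For illustration, here is a minimal sketch of wiring a custom in-place gradient into the cache. The objective `f`, the diagonal sparsity pattern, and the ForwardDiff-based `g!` are assumptions made for this example, not part of the package docs:

```julia
using SparseDiffTools, ForwardDiff, SparseArrays, LinearAlgebra

f(x) = sum(abs2, x)                          # hypothetical objective with a diagonal Hessian
g!(G, x) = ForwardDiff.gradient!(G, f, x)    # custom in-place gradient with signature g!(G, x)

x = rand(4)
sparsity = sparse(Matrix(1.0I, 4, 4))        # assumed diagonal sparsity pattern
colorvec = matrix_colors(sparsity)
hescache = ForwardColorHesCache(f, x, colorvec, sparsity, g!)

H = numauto_color_hessian(f, x, colorvec, sparsity)   # allocating form
numauto_color_hessian!(H, f, x, hescache)              # in-place form reusing the cache and g!
```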


### Jacobian-Vector and Hessian-Vector Products

Matrix-free implementations of Jacobian-Vector and Hessian-Vector products are
@@ -322,7 +321,8 @@ auto_jacvec(f, x, v)

# If compute_f0 is false, `cache1` is assumed to already hold `f(x)`; otherwise `f(cache1, x)` is called first
num_jacvec!(dy,f,x,v,cache1 = similar(v),
-cache2 = similar(v);
+cache2 = similar(v),
+cache3 = similar(v);
compute_f0 = true)
num_jacvec(f,x,v,f0=nothing)
```
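
As a calling sketch (the in-place map `f!` and the sizes are made up; every cache argument defaults to `similar(v)`, so the extra `cache3` added here does not change existing call sites):

```julia
using SparseDiffTools

f!(dy, x) = (dy .= x .^ 2)             # hypothetical in-place map R^n -> R^n
x = rand(5); v = rand(5); dy = similar(v)

num_jacvec!(dy, f!, x, v)              # J(x)*v by finite differences, default caches
Jv = num_jacvec(x -> x .^ 2, x, v)     # out-of-place variant
```
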
@@ -333,14 +333,16 @@ For Hessians, the following are provided:
num_hesvec!(dy,f,x,v,
cache1 = similar(v),
cache2 = similar(v),
-cache3 = similar(v))
+cache3 = similar(v),
+cache4 = similar(v))

num_hesvec(f,x,v)

numauto_hesvec!(dy,f,x,v,
cache = ForwardDiff.GradientConfig(f,v),
cache1 = similar(v),
-cache2 = similar(v))
+cache2 = similar(v),
+cache3 = similar(v))

numauto_hesvec(f,x,v)

@@ -358,6 +360,7 @@ respectively:

```julia
num_hesvecgrad!(dy,g,x,v,
+cache1 = similar(v),
cache2 = similar(v),
cache3 = similar(v))

@@ -384,7 +387,8 @@ using Zygote # Required

numback_hesvec!(dy,f,x,v,
cache1 = similar(v),
-cache2 = similar(v))
+cache2 = similar(v),
+cache3 = similar(v))

numback_hesvec(f,x,v)

@@ -396,7 +400,7 @@ autoback_hesvec!(dy,f,x,v,
autoback_hesvec(f,x,v)
```

-#### J*v and H*v Operators
+#### `J*v` and `H*v` Operators

The following produce matrix-free operators which are used for calculating
Jacobian-vector and Hessian-vector products where the differentiation takes
22 changes: 13 additions & 9 deletions docs/src/index.md
@@ -244,7 +244,7 @@ forwarddiff_color_jacobian!(J::AbstractMatrix{<:Number},
`dx` is a pre-allocated output vector which is used to declare the output size,
and thus allows for specifying a non-square Jacobian.

Also, it is possible to retrieve the function value via `value(jac_cache)` or
`value!(result, jac_cache)`
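
For example (a sketch only; the residual function and the keyword form of `ForwardColorJacCache` used here are assumptions for illustration):

```julia
using SparseDiffTools

f!(dx, x) = (dx .= x .^ 2)                                # hypothetical in-place residual
x = rand(4)
jac_cache = ForwardColorJacCache(f!, x; dx = similar(x))  # assumed keyword constructor
J = zeros(4, 4)
forwarddiff_color_jacobian!(J, f!, x, jac_cache)
fx = value(jac_cache)                                     # function value reused from the cache
```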


@@ -276,7 +276,7 @@ H = numauto_color_hessian(f, x, colorvec, sparsity)
numauto_color_hessian!(H, f, x, colorvec, sparsity)
```

To avoid unnecessary allocations every time the Hessian is computed,
construct a `ForwardColorHesCache` beforehand:

```julia
@@ -286,7 +286,7 @@ numauto_color_hessian!(H, f, x, hescache)

By default, these methods use a mix of numerical and automatic differentiation,
namely by taking finite differences of gradients calculated via ForwardDiff.jl.
Alternatively, if you have your own custom gradient function `g!`, you can specify
it as an argument to `ForwardColorHesCache`:

```julia
@@ -295,7 +295,6 @@ hescache = ForwardColorHesCache(f, x, colorvec, sparsity, g!)
Note that any user-defined gradient needs to have the signature `g!(G, x)`,
i.e. updating the gradient `G` in place.


### Jacobian-Vector and Hessian-Vector Products

Matrix-free implementations of Jacobian-Vector and Hessian-Vector products are
@@ -316,7 +315,8 @@ auto_jacvec(f, x, v)

# If compute_f0 is false, `cache1` is assumed to already hold `f(x)`; otherwise `f(cache1, x)` is called first
num_jacvec!(dy,f,x,v,cache1 = similar(v),
-cache2 = similar(v);
+cache2 = similar(v),
+cache3 = similar(v);
compute_f0 = true)
num_jacvec(f,x,v,f0=nothing)
```
Expand All @@ -327,14 +327,16 @@ For Hessians, the following are provided:
num_hesvec!(dy,f,x,v,
cache1 = similar(v),
cache2 = similar(v),
-cache3 = similar(v))
+cache3 = similar(v),
+cache4 = similar(v))

num_hesvec(f,x,v)

numauto_hesvec!(dy,f,x,v,
cache = ForwardDiff.GradientConfig(f,v),
cache1 = similar(v),
-cache2 = similar(v))
+cache2 = similar(v),
+cache3 = similar(v))

numauto_hesvec(f,x,v)

@@ -352,6 +354,7 @@ respectively:

```julia
num_hesvecgrad!(dy,g,x,v,
+cache1 = similar(v),
cache2 = similar(v),
cache3 = similar(v))

@@ -378,7 +381,8 @@ using Zygote # Required

numback_hesvec!(dy,f,x,v,
cache1 = similar(v),
-cache2 = similar(v))
+cache2 = similar(v),
+cache3 = similar(v))

numback_hesvec(f,x,v)

@@ -390,7 +394,7 @@ autoback_hesvec!(dy,f,x,v,
autoback_hesvec(f,x,v)
```
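
A short calling sketch for the helpers above (the objective is invented; as noted, the `*back_hesvec` variants require Zygote to be loaded):

```julia
using SparseDiffTools, Zygote

f(x) = sum(abs2, x) / 2 + prod(x)    # hypothetical scalar objective
x = rand(5); v = rand(5); dy = similar(v)

numauto_hesvec!(dy, f, x, v)         # finite difference of ForwardDiff gradients
Hv = autoback_hesvec(f, x, v)        # ForwardDiff-over-Zygote (forward-over-reverse)
```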

-#### J*v and H*v Operators
+#### `J*v` and `H*v` Operators

The following produce matrix-free operators which are used for calculating
Jacobian-vector and Hessian-vector products where the differentiation takes
11 changes: 5 additions & 6 deletions ext/SparseDiffToolsZygoteExt.jl
@@ -45,18 +45,17 @@

### Jac, Hes products

-function numback_hesvec!(dy, f::F, x, v, cache1 = similar(v), cache2 = similar(v)) where {F}
+function numback_hesvec!(dy, f::F, x, v, cache1 = similar(v), cache2 = similar(v), cache3 = similar(v)) where {F}

g = let f = f
(dx, x) -> dx .= first(Zygote.gradient(f, x))
end
T = eltype(x)
# Should it be min? max? mean?
ϵ = sqrt(eps(real(T))) * max(one(real(T)), abs(norm(x)))
-@. x += ϵ * v
-g(cache1, x)
-@. x -= 2ϵ * v
-g(cache2, x)
-@. x += ϵ * v
+@. cache3 = x + ϵ * v
+g(cache1, cache3)
+@. cache3 = x - ϵ * v
+g(cache2, cache3)

@. dy = (cache1 - cache2) / (2ϵ)
end
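
The non-mutating pattern behind this change, pulled out as a self-contained sketch (a hypothetical helper, not a package API): the perturbed point is written into a scratch buffer, so the user's `x` is never touched, whereas the old `x += ϵv`, `x -= 2ϵv`, `x += ϵv` sequence only restored `x` up to floating-point round-off.

```julia
using LinearAlgebra

# Hypothetical standalone helper mirroring the change above: central difference of an
# in-place gradient g!(G, x) in direction v, without writing to x.
function hesvec_central!(dy, g!, x, v, cache1 = similar(v), cache2 = similar(v),
                         cache3 = similar(v))
    T = eltype(x)
    ϵ = sqrt(eps(real(T))) * max(one(real(T)), abs(norm(x)))
    @. cache3 = x + ϵ * v       # perturb into the scratch buffer, not into x
    g!(cache1, cache3)
    @. cache3 = x - ϵ * v
    g!(cache2, cache3)
    @. dy = (cache1 - cache2) / (2ϵ)
end

grad!(G, x) = (G .= 2 .* x)        # gradient of sum(abs2, x), for illustration
x = rand(3); v = rand(3); dy = similar(v)
hesvec_central!(dy, grad!, x, v)   # ≈ 2v, with x left unchanged
```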

42 changes: 19 additions & 23 deletions src/differentiation/jaches_products.jl
@@ -32,16 +32,15 @@
vec(partials.(vec(f(y)), 1))
end

-function num_jacvec!(dy, f, x, v, cache1 = similar(v), cache2 = similar(v);
+function num_jacvec!(dy, f, x, v, cache1 = similar(v), cache2 = similar(v), cache3 = similar(v);
compute_f0 = true)
vv = reshape(v, axes(x))
compute_f0 && (f(cache1, x))
T = eltype(x)
# Should it be min? max? mean?
ϵ = sqrt(eps(real(T))) * max(one(real(T)), abs(norm(x)))
-@. x += ϵ * vv
-f(cache2, x)
-@. x -= ϵ * vv
+@. cache3 = x + ϵ * vv
+f(cache2, cache3)

vecdy = _vec(dy)
veccache1 = _vec(cache1)
veccache2 = _vec(cache2)
@@ -58,19 +57,18 @@
end

function num_hesvec!(dy, f, x, v, cache1 = similar(v), cache2 = similar(v),
-cache3 = similar(v))
+cache3 = similar(v), cache4 = similar(v))
cache = FiniteDiff.GradientCache(v[1], cache1, Val{:central})
g = let f = f, cache = cache
(dx, x) -> FiniteDiff.finite_difference_gradient!(dx, f, x, cache)
end
T = eltype(x)
# Should it be min? max? mean?
ϵ = sqrt(eps(real(T))) * max(one(real(T)), abs(norm(x)))
-@. x += ϵ * v
-g(cache2, x)
-@. x -= 2ϵ * v
-g(cache3, x)
-@. x += ϵ * v
+@. cache4 = x + ϵ * v
+g(cache2, cache4)
+@. cache4 = x - ϵ * v
+g(cache3, cache4)

@. dy = (cache2 - cache3) / (2ϵ)
end

@@ -87,18 +85,17 @@
end

function numauto_hesvec!(dy, f, x, v, cache = ForwardDiff.GradientConfig(f, v),
-cache1 = similar(v), cache2 = similar(v))
+cache1 = similar(v), cache2 = similar(v), cache3 = similar(v))
g = let f = f, x = x, cache = cache
g = (dx, x) -> ForwardDiff.gradient!(dx, f, x, cache)
end
T = eltype(x)
# Should it be min? max? mean?
ϵ = sqrt(eps(real(T))) * max(one(real(T)), abs(norm(x)))
-@. x += ϵ * v
-g(cache1, x)
-@. x -= 2ϵ * v
-g(cache2, x)
-@. x += ϵ * v
+@. cache3 = x + ϵ * v
+g(cache1, cache3)
+@. cache3 = x - ϵ * v
+g(cache2, cache3)

@. dy = (cache1 - cache2) / (2ϵ)
end

@@ -137,16 +134,15 @@
partials.(g(Dual{DeivVecTag}.(x, v)), 1)
end

-function num_hesvecgrad!(dy, g, x, v, cache2 = similar(v), cache3 = similar(v))
+function num_hesvecgrad!(dy, g, x, v, cache1 = similar(v), cache2 = similar(v), cache3 = similar(v))

T = eltype(x)
# Should it be min? max? mean?
ϵ = sqrt(eps(real(T))) * max(one(real(T)), abs(norm(x)))
-@. x += ϵ * v
-g(cache2, x)
-@. x -= 2ϵ * v
-g(cache3, x)
-@. x += ϵ * v
-@. dy = (cache2 - cache3) / (2ϵ)
+@. cache3 = x + ϵ * v
+g(cache1, cache3)
+@. cache3 = x - ϵ * v
+g(cache2, cache3)
+@. dy = (cache1 - cache2) / (2ϵ)

end
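
A small calling sketch for the updated `num_hesvecgrad!` (the gradient and sizes are invented; all three caches default to `similar(v)`):

```julia
using SparseDiffTools

g(dg, x) = (dg .= 2 .* x)          # in-place gradient of sum(abs2, x), for illustration
x = rand(3); v = rand(3); dy = similar(v)

num_hesvecgrad!(dy, g, x, v)       # ≈ 2 .* v, computed without mutating x
```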

function num_hesvecgrad(g, x, v)
16 changes: 9 additions & 7 deletions src/differentiation/vecjac_products.jl
@@ -1,14 +1,15 @@
-function num_vecjac!(du, f::F, x, v, cache1 = similar(v), cache2 = similar(v);
+function num_vecjac!(du, f::F, x, v, cache1 = similar(v), cache2 = similar(v), cache3 = similar(v);
compute_f0 = true) where {F}
compute_f0 && (f(cache1, x))
T = eltype(x)
# Should it be min? max? mean?
ϵ = sqrt(eps(real(T))) * max(one(real(T)), abs(norm(x)))
vv = reshape(v, size(cache1))
+cache3 .= x

for i in 1:length(x)
-x[i] += ϵ
-f(cache2, x)
-x[i] -= ϵ
+cache3[i] += ϵ
+f(cache2, cache3)
+cache3[i] = x[i]

du[i] = (((cache2 .- cache1) ./ ϵ)' * vv)[1]
end
return du
@@ -21,10 +22,11 @@
# Should it be min? max? mean?
ϵ = sqrt(eps(real(T))) * max(one(real(T)), abs(norm(x)))
du = similar(x)
+cache = copy(x)

for i in 1:length(x)
-x[i] += ϵ
+cache[i] += ϵ

f0 = f(x)
-x[i] -= ϵ
+cache[i] = x[i]

du[i] = (((f0 .- _f0) ./ ϵ)' * vv)[1]
end
return vec(du)
@@ -91,7 +93,7 @@
end

function _vecjac(f::F, fu, u, autodiff::AutoFiniteDiff) where {F}
-cache = (similar(fu), similar(fu))
+cache = (similar(fu), similar(fu), similar(fu))

pullback = nothing
return AutoDiffVJP(f, u, cache, autodiff, pullback)
end
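
Finally, a usage sketch for the updated `num_vecjac!` (the map and sizes are made up; `cache3` is the new scratch copy of `x`, defaulting to `similar(v)`):

```julia
using SparseDiffTools

f!(du, x) = (du .= x .^ 2)          # hypothetical in-place map
x = rand(4); v = rand(4); du = similar(x)

num_vecjac!(du, f!, x, v)           # v' * J(x), with default caches
vJ = num_vecjac(x -> x .^ 2, x, v)  # out-of-place variant
```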