In-place *_jacvec! assumes square jacobian #184
Comments
I think manual cache allocation is what needs to be done, because that's how the output vector would then be defined. Does that make this a documentation issue?
Possibly, or it should be an AbstractDifferentiation.jl interface piece, and all AD packages should then adopt that interface.
A documentation fix would definitely make it easier to preallocate, which is useful regardless, but I think it's possible to make the allocating version work too: just replace the default cache2 with cache2 = Dual{typeof(ForwardDiff.Tag(DeivVecTag(),eltype(dy))),eltype(dy),1}.(dy, ForwardDiff.Partials.(tuple.(dy))), so that it is built from dy. Here's the adaptation I landed on for my own use; I factored out the cache allocation into a function that can be used separately, making preallocation easy. If upstreamed to ForwardDiff this would of course turn into something cleaner than the struct MyTag end placeholder used here:
using ForwardDiff

struct MyTag end  # placeholder tag type; ForwardDiff's Tag machinery wraps it in _tagtype below

function jvp!(f!, dy, x, v, duals=nothing)
    if duals === nothing
        duals = jvpduals(dy, x, v)
    else  # jvpduals returns initialized duals, so this init is only needed when preallocated
        duals.x .= ForwardDiff.Dual{_tagtype(x), eltype(x), 1}.(
            x, ForwardDiff.Partials.(tuple.(v))
        )
    end
    f!(duals.dy, duals.x)
    dy .= ForwardDiff.partials.(duals.dy, 1)
    return dy
end

function jvpduals(dy, x, v)
    dualdy = ForwardDiff.Dual{_tagtype(dy), eltype(dy), 1}.(
        dy, ForwardDiff.Partials.(tuple.(dy))
    )
    # This correctly initializes dualx for computing J_f(x) * v
    dualx = ForwardDiff.Dual{_tagtype(x), eltype(x), 1}.(
        x, ForwardDiff.Partials.(tuple.(v))
    )
    return (; dy=dualdy, x=dualx)
end

_tagtype(x) = typeof(ForwardDiff.Tag(MyTag(), eltype(x)))
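For illustration, a minimal usage sketch of the two helpers above; the R^3 -> R^2 function g! and the array sizes are made up for this example, and it assumes Float64 inputs so that the two tag types coincide:

g!(out, x) = (out[1] = x[1] * x[2]; out[2] = sin(x[3]); nothing)   # hypothetical non-square f!

x, v, dy = rand(3), rand(3), zeros(2)

jvp!(g!, dy, x, v)              # allocating call: the dual caches are built internally
duals = jvpduals(dy, x, v)      # or preallocate the caches once...
jvp!(g!, dy, x, v, duals)       # ...and reuse them across calls without reallocating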
That does seem like a good fix. Mind opening a PR?
Been procrastinating on this, since my own version is still developing along with my project (including things like allocation-free 5-arg accumulating jvp, and consistent treatment of in-place vs. immutable versions), and I notice that the landscape of linear operator types is in rapid flux at the moment, at least on the SciML end of things (stoked for https://github.com/SciML/SciMLOperators.jl 🔥). So unless an improvement on this is urgent for anyone else I'll hold off a little and come back when it's more clear how things should look.
There's been a lot of motion in this ocean in the last few weeks. Indeed, this change might want to wait until the end of the month, when SciMLOperators is complete and registered.
Sparse differentiation is now a part of DifferentiationInterface, with automatic cache allocation and (tested) support for non-square Jacobians |
auto_jacvec! and num_jacvec! only work when the output and input dimensions are the same, i.e., when the jacobian is square. The issue is this line (and the equivalent one for num_jacvec!), which should probably use both the length and eltype of dy:
SparseDiffTools.jl/src/differentiation/jaches_products.jl, Line 10 in 2dbe420
Manual cache allocation works but is surprisingly hard to get right; the implicit prescription in the README gets the wrong tag type for the duals: https://github.com/JuliaDiff/SparseDiffTools.jl#jacobian-vector-and-hessian-vector-products
The corresponding lines in JacVec should similarly be fixed:
SparseDiffTools.jl/src/differentiation/jaches_products.jl, Line 201 in 2dbe420
Should auto_jacvec(!) be upstreamed to ForwardDiff? Strange to have to pull in a sparse-related package to get this functionality. (See "Calculating J_f(x) ⋅ y efficiently", ForwardDiff.jl#319.)

MWE:
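A minimal sketch of such an MWE, assuming the exported auto_jacvec!(dy, f, x, v, cache1, cache2) signature at 2dbe420 and SparseDiffTools' internal DeivVecTag; the R^3 -> R^2 function f! and the tagtype helper are made up for the example, not part of the package:

using SparseDiffTools, ForwardDiff

# hypothetical non-square map R^3 -> R^2, so the Jacobian is 2x3
f!(out, x) = (out[1] = x[1] * x[2]; out[2] = x[2] + x[3]; nothing)

x, v, dy = rand(3), rand(3), zeros(2)

# throws, because the default caches take their length and eltype from x rather than dy:
# auto_jacvec!(dy, f!, x, v)

# manual cache allocation works if cache2 is built from dy and both caches carry the
# Tag-wrapped tag type that the in-place code expects (the part that is easy to get wrong)
tagtype(z) = typeof(ForwardDiff.Tag(SparseDiffTools.DeivVecTag(), eltype(z)))
cache1 = ForwardDiff.Dual{tagtype(x), eltype(x), 1}.(x, ForwardDiff.Partials.(tuple.(v)))
cache2 = ForwardDiff.Dual{tagtype(dy), eltype(dy), 1}.(dy, ForwardDiff.Partials.(tuple.(dy)))
auto_jacvec!(dy, f!, x, v, cache1, cache2)   # dy ≈ J_f(x) * v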