Augmented Lagrangian #47

Open
jameskermode opened this issue Feb 14, 2017 · 12 comments
@jameskermode
Collaborator

This is just a reminder to go back and try JuLIP with the augmented Lagrangian implementation of constrained optimisation from https://github.com/cortner/ConstrainedOptim.jl

Related to issue #33

@cortner
Member

cortner commented Feb 14, 2017

Good point - do you have a test system?

@jameskermode
Copy link
Collaborator Author

A constrained bond length would be the first canonical example. I will set something up and let you know if/when there are problems.
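
For concreteness, the constraint here is c(x) = |x_j - x_i| - L = 0 for a target bond length L. A minimal sketch on JuLIP dofs (hypothetical atom indices and target length; `atomdofs` is the helper used later in this thread):

using JuLIP

a = bulk(:Si, cubic=true)                # hypothetical test system
set_constraint!(a, FixedCell(a))
I1, I2 = atomdofs(a, 1), atomdofs(a, 2)  # dof indices of the two atoms
L = 2.4                                  # hypothetical target bond length (Å)
c(x) = norm(x[I2] - x[I1]) - L           # constraint: c(x) == 0 fixes the bond at L
@show c(dofs(a))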

@cortner
Member

cortner commented Mar 12, 2018

There is now a new ConstrainedOptim which we can try at some point. But I'm assuming the basic Newton scheme we have in Experimental works for now?

@IL027

IL027 commented Mar 15, 2018

@lifelemons is still having robustness problems with the experimental constrained Newton scheme, even with the gradient-based linesearch, so I think we will give ConstrainedOptim a go.

@IL027

IL027 commented Mar 15, 2018

(Oops, this is @jameskermode, signed in with wrong GitHub account)

@cortner
Member

cortner commented Mar 15, 2018

This is the crack problem, yes? Maybe @lifelemons could remind me how I can access it and I’ll have another look?

@jameskermode
Collaborator Author

Here is an attempt at using ConstrainedOptim.jl to fix the length of a bond in a piece of bulk SW silicon with the latest JuLIP. The optimiser converges and the penalty goes to zero, but the constraint is not satisfied afterwards within the target threshold of +/- 1e-2, so I must be doing something wrong. I've tried increasing the strength of the initial penalty.

@cortner would you be willing to take a look?

Final output I get is:

b = Any[2.35126, 2.39, 2.35126, 2.35126]
E = Any[-34.68, -34.6702, -34.68, -34.68]

Script:

using ConstrainedOptim
using ForwardDiff   # needed for ForwardDiff.hessian below
using JuLIP

a = bulk(:Si, cubic=true)
set_constraint!(a, FixedCell(a))
set_calculator!(a, StillingerWeber())

i = 1
j = 2
I1 = atomdofs(a, i)
I2 = atomdofs(a, j)
x0 = dofs(a)

@show I1
@show I2

I1I2 = [I1; I2]

# bond length between atoms i and j
blen(x) = norm(x[I2] - x[I1])

# energy, gradient and hessian for the interatomic potential
ener(x) = energy(a, x)
gradient!(g, x) = (g[:] = gradient(a, x); g)
hessian!(h, x) = (h[:,:] = hessian(a, x); h)

b = []
E = []

for target_bondlength = 2.3:0.1:2.6

    # constraint function
    con_c!(c, x) = (c[1] = blen(x) - target_bondlength; c)

    # Jacobian of constraint, shape [1, length(x)]
    function con_jacobian!(J, x)
        J[1,I1] = (x[I1]-x[I2])/blen(x)
        J[1,I2] = (x[I2]-x[I1])/blen(x)
        J
    end

    # Hessian contribution of constraint
    # We define auxiliary index arrays _I1 and _I2 that start at 1 to
    # allow the ForwardDiff hessian to be done only on the relevant dofs, x[I1I2]
    _I1 = 1:length(I1)
    _I2 = length(I1)+1:length(I1)+length(I2)
    _blen(x) = norm(x[_I2] - x[_I1])
    _c(x) = _blen(x) - target_bondlength

    function con_h!(h, x, λ)
        h[I1I2, I1I2] += λ[1] * ForwardDiff.hessian(_c, x[I1I2])
    end

    df = TwiceDifferentiable(ener, gradient!, hessian!, x0)
    lx = Float64[]; ux = Float64[]
    lc = [-1e-2]; uc = [1e-2]
    dfc = TwiceDifferentiableConstraints(con_c!, con_jacobian!, con_h!,
                                         lx, ux, lc, uc)

    # move atoms i and j so constraint is initially satisfied
    set_dofs!(a, x0)
    X = positions(a)
    u = a[j] - a[i]
    f = 1.0 - target_bondlength / norm(u)
    X[i] += 0.5 * f * u
    X[j] -= 0.5 * f * u
    set_positions!(a, X)
    x = dofs(a)
    @show target_bondlength, blen(x)
    
    res = optimize(df, dfc, x, IPNewton(show_linesearch=false, μ0=:auto), 
                    Optim.Options(show_trace = true, 
                                  allow_f_increases = true, 
                                  successive_f_tol = 2))
    @show res
    @show _c(res.minimizer[I1I2])
    @show blen(res.minimizer)
    
    push!(b, blen(res.minimizer))
    push!(E, ener(res.minimizer))
end

@show b
@show E

@jameskermode
Collaborator Author

Think I've got it working; the mistake was not zeroing all elements of the constraint Jacobian:

function con_jacobian!(J, x)
    J[1,:] = 0.0
    J[1,I1] = (x[I1]-x[I2])/blen(x)
    J[1,I2] = (x[I2]-x[I1])/blen(x)
end
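
For context: the Jacobian buffer passed to con_jacobian! is reused across calls, so any entries not explicitly written keep whatever values they held before. A quick sanity check of the fixed version against automatic differentiation (a sketch, assuming x, I1, I2 and blen from the script above):

using ForwardDiff

J_hand = zeros(1, length(x))
con_jacobian!(J_hand, x)
# the constraint differs from the raw bond length only by a constant,
# so their Jacobians agree
J_ad = ForwardDiff.jacobian(y -> [norm(y[I2] - y[I1])], x)
@assert maximum(abs.(J_hand - J_ad)) < 1e-8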

@lifelemons
Collaborator

What about the Hessian: would one need to zero that out too? Does that now show something more reasonable?

@cortner
Member

cortner commented Apr 5, 2018

Let me know if I still need to look at it?

If this works, then it might be time to discuss how best to implement a more generic set of constraints that can be conveniently accessed, i.e. #20

@jameskermode
Collaborator Author

No. Slightly confusingly, the constraint Hessian contribution is additive, so it goes on top of the objective Hessian, while the constraint Jacobian must be written out in full on every call.
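
A minimal illustration of the two conventions on a hypothetical 2-dof problem with the single constraint c(x) = x[1]^2 + x[2]^2 (not from this thread; the Hessian of c is 2I):

# Jacobian convention: overwrite the whole row on every call
con_jac!(J, x) = (J[1,:] = 0.0; J[1,1] = 2x[1]; J[1,2] = 2x[2]; J)

# Hessian convention: accumulate λ·∇²c on top of what is already in the buffer
con_h!(h, x, λ) = (h[1,1] += 2λ[1]; h[2,2] += 2λ[1]; h)

h = [1.0 0.0; 0.0 1.0]          # pretend this already holds the objective Hessian
con_h!(h, [1.0, 2.0], [0.5])    # h is now 2I: the objective part plus 0.5·2I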

@jameskermode
Collaborator Author

jameskermode commented Apr 5, 2018

Thanks Christoph: it's working fine now, so no need to look at it, but I agree it would be good to implement a generalisation of this function in JuLIP in a more convenient way.

function minimize_constrained_bond!(a, i, j, bondlength)
    I1 = atomdofs(a, i)
    I2 = atomdofs(a, j)
    I1I2 = [I1; I2]

    # bond length between atoms i and j
    blen(x) = norm(x[I2] - x[I1])

    # energy, gradient and hessian for the interatomic potential
    ener(x) = energy(a, x)
    gradient!(g, x) = (g[:] = gradient(a, x); g)
    hessian!(h, x) = (h[:,:] = hessian(a, x); h)
    
    # constraint function
    con_c!(c, x) = (c[1] = blen(x) - bondlength; c)

    # Jacobian of constraint, shape [1, length(x)]
    function con_jacobian!(J, x)
        J[1,:] = 0.0
        J[1,I1] = (x[I1]-x[I2])/blen(x)
        J[1,I2] = (x[I2]-x[I1])/blen(x)
    end

    # Hessian contribution of constraint
    # We define auxiliary index arrays _I1 and _I2 that start at 1 to
    # allow the ForwardDiff hessian to be done only on the relevant dofs, x[I1I2]
    _I1 = 1:length(I1)
    _I2 = length(I1)+1:length(I1)+length(I2)
    _blen(x) = norm(x[_I2] - x[_I1])
    _c(x) = _blen(x) - bondlength

    function con_h!(h, x, λ)
        h[I1I2, I1I2] += λ[1] * ForwardDiff.hessian(_c, x[I1I2])
    end

    df = TwiceDifferentiable(ener, gradient!, hessian!, dofs(a))
    lx = Float64[]; ux = Float64[]
    lc = [-1e-2]; uc = [1e-2]
    dfc = TwiceDifferentiableConstraints(con_c!, con_jacobian!, con_h!,
                                         lx, ux, lc, uc)

    # move atoms i and j so constraint is initially satisfied
    X = positions(a)
    u = X[j] - X[i]
    f = 1.0 - bondlength / norm(u)
    X[i] += 0.5 * f * u
    X[j] -= 0.5 * f * u
    set_positions!(a, X)
    x = dofs(a)
    
    res = optimize(df, dfc, x, IPNewton(show_linesearch=false, μ0=:auto), 
                    Optim.Options(show_trace = true,
                                  extended_trace = false,
                                  allow_f_increases = true, 
                                  successive_f_tol = 2))
    set_dofs!(a, res.minimizer)
end
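
A usage sketch, with the same SW silicon setup as the script earlier in the thread:

using ConstrainedOptim
using ForwardDiff
using JuLIP

at = bulk(:Si, cubic=true)
set_constraint!(at, FixedCell(at))
set_calculator!(at, StillingerWeber())
minimize_constrained_bond!(at, 1, 2, 2.4)   # relax with the 1-2 bond held at 2.4 Å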

Meanwhile, @lifelemons, please give this a go for your use case and let me know of any problems.
