Augmented Lagrangian #47

Open
jameskermode opened this issue Feb 14, 2017 · 12 comments
@jameskermode
Collaborator

This is just a reminder to go back and try JuLIP with the augmented Lagrangian implementation of constrained optimisation from https://github.com/cortner/ConstrainedOptim.jl

Related to issue #33

@cortner
Member

cortner commented Feb 14, 2017

Good point - do you have a test system?

@jameskermode
Copy link
Collaborator Author

A constrained bond length would be the first canonical example. I will set something up and let you know if/when there are problems.
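
For concreteness, the constraint here is c(x) = |x_j - x_i| - L = 0 for a target bond length L. A minimal sketch on JuLIP dofs (hypothetical atom indices and target length; `atomdofs` is the helper used later in this thread):

using JuLIP

a = bulk(:Si, cubic=true)                # hypothetical test system
set_constraint!(a, FixedCell(a))
I1, I2 = atomdofs(a, 1), atomdofs(a, 2)  # dof indices of the two atoms
L = 2.4                                  # hypothetical target bond length (Å)
c(x) = norm(x[I2] - x[I1]) - L           # constraint: c(x) == 0 fixes the bond at L
@show c(dofs(a))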

@cortner
Member

cortner commented Mar 12, 2018

There is now a new ConstrainedOptim which we can try at some point. But I'm assuming the basic Newton scheme we have in Experimental works for now?

@IL027

IL027 commented Mar 15, 2018

@lifelemons is still having robustness problems with the experimental constrained Newton scheme, even with the gradient-based linesearch, so I think we will give ConstrainedOptim a go.

@IL027

IL027 commented Mar 15, 2018

(Oops, this is @jameskermode, signed in with wrong GitHub account)

@cortner
Member

cortner commented Mar 15, 2018

This is the crack problem, yes? Maybe @lifelemons could remind me how I can access it and I’ll have another look?

@jameskermode
Collaborator Author

Here is an attempt at using ConstrainedOptim.jl to fix the length of a bond in a piece of bulk SW silicon with the latest JuLIP. The optimiser converges and the penalty goes to zero, but the constraint is not satisfied afterwards within the target threshold of +/- 1e-2, so I must be doing something wrong. I've tried increasing the strength of the initial penalty.

@cortner would you be willing to take a look?

Final output I get is:

b = Any[2.35126, 2.39, 2.35126, 2.35126]
E = Any[-34.68, -34.6702, -34.68, -34.68]

Script:

using ConstrainedOptim
using ForwardDiff   # needed for ForwardDiff.hessian below
using JuLIP

a = bulk(:Si, cubic=true)
set_constraint!(a, FixedCell(a))
set_calculator!(a, StillingerWeber())

i = 1
j = 2
I1 = atomdofs(a, i)
I2 = atomdofs(a, j)
x0 = dofs(a)

@show I1
@show I2

I1I2 = [I1; I2]

# bond length between atoms i and j
blen(x) = norm(x[I2] - x[I1])

# energy, gradient and hessian for the interatomic potential
ener(x) = energy(a, x)
gradient!(g, x) = (g[:] = gradient(a, x); g)
hessian!(h, x) = (h[:,:] = hessian(a, x); h)

b = []
E = []

for target_bondlength = 2.3:0.1:2.6

    # constraint function
    con_c!(c, x) = (c[1] = blen(x) - target_bondlength; c)

    # Jacobian of constraint, shape [1, length(x)]
    function con_jacobian!(J, x)
        J[1,I1] = (x[I1]-x[I2])/blen(x)
        J[1,I2] = (x[I2]-x[I1])/blen(x)
        J
    end

    # Hessian contribution of constraint
    # We define auxiliary index arrays _I1 and _I2 that start at 1 to
    # allow the ForwardDiff hessian to be done only on the relevant dofs, x[I1I2]
    _I1 = 1:length(I1)
    _I2 = length(I1)+1:length(I1)+length(I2)
    _blen(x) = norm(x[_I2] - x[_I1])
    _c(x) = _blen(x) - target_bondlength

    function con_h!(h, x, λ)
        h[I1I2, I1I2] += λ[1] * ForwardDiff.hessian(_c, x[I1I2])
    end

    df = TwiceDifferentiable(ener, gradient!, hessian!, x0)
    lx = Float64[]; ux = Float64[]
    lc = [-1e-2]; uc = [1e-2]
    dfc = TwiceDifferentiableConstraints(con_c!, con_jacobian!, con_h!,
                                         lx, ux, lc, uc)

    # move atoms i and j so constraint is initially satisfied
    set_dofs!(a, x0)
    X = positions(a)
    u = a[j] - a[i]
    f = 1.0 - target_bondlength / norm(u)
    X[i] += 0.5 * f * u
    X[j] -= 0.5 * f * u
    set_positions!(a, X)
    x = dofs(a)
    @show target_bondlength, blen(x)
    
    res = optimize(df, dfc, x, IPNewton(show_linesearch=false, μ0=:auto), 
                    Optim.Options(show_trace = true, 
                                  allow_f_increases = true, 
                                  successive_f_tol = 2))
    @show res
    @show _c(res.minimizer[I1I2])
    @show blen(res.minimizer)
    
    push!(b, blen(res.minimizer))
    push!(E, ener(res.minimizer))
end

@show b
@show E

@jameskermode
Collaborator Author

Think I've got it working; the mistake was not zeroing all elements of the constraint Jacobian:

function con_jacobian!(J, x)
    J[1,:] = 0.0
    J[1,I1] = (x[I1]-x[I2])/blen(x)
    J[1,I2] = (x[I2]-x[I1])/blen(x)
end
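
For context: the Jacobian buffer passed to con_jacobian! is reused across calls, so any entries not explicitly written keep whatever values they held before. A quick sanity check of the fixed version against automatic differentiation (a sketch, assuming x, I1, I2 and blen from the script above):

using ForwardDiff

J_hand = zeros(1, length(x))
con_jacobian!(J_hand, x)
# the constraint differs from the raw bond length only by a constant,
# so their Jacobians agree
J_ad = ForwardDiff.jacobian(y -> [norm(y[I2] - y[I1])], x)
@assert maximum(abs.(J_hand - J_ad)) < 1e-8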

@lifelemons
Collaborator

What about the Hessian: would one need to zero that out too? Does that now show something more reasonable?

@cortner
Member

cortner commented Apr 5, 2018

Let me know if I still need to look at it?

If this works, then it might be time to discuss how best to implement a more generic set of constraints that can be conveniently accessed, i.e. #20

@jameskermode
Collaborator Author

No. Slightly confusingly, the constraint Hessian contribution is additive, so it goes on top of the objective Hessian, while the constraint Jacobian must be written out in full on every call.
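
A minimal illustration of the two conventions on a hypothetical 2-dof problem with the single constraint c(x) = x[1]^2 + x[2]^2 (not from this thread; the Hessian of c is 2I):

# Jacobian convention: overwrite the whole row on every call
con_jac!(J, x) = (J[1,:] = 0.0; J[1,1] = 2x[1]; J[1,2] = 2x[2]; J)

# Hessian convention: accumulate λ·∇²c on top of what is already in the buffer
con_h!(h, x, λ) = (h[1,1] += 2λ[1]; h[2,2] += 2λ[1]; h)

h = [1.0 0.0; 0.0 1.0]          # pretend this already holds the objective Hessian
con_h!(h, [1.0, 2.0], [0.5])    # h is now 2I: the objective part plus 0.5·2I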

@jameskermode
Collaborator Author

jameskermode commented Apr 5, 2018

Thanks Christoph: it's working fine now, so no need to look at it, but I agree it would be good to implement a generalisation of this function in JuLIP in a more convenient way.

function minimize_constrained_bond!(a, i, j, bondlength)
    I1 = atomdofs(a, i)
    I2 = atomdofs(a, j)
    I1I2 = [I1; I2]

    # bond length between atoms i and j
    blen(x) = norm(x[I2] - x[I1])

    # energy, gradient and hessian for the interatomic potential
    ener(x) = energy(a, x)
    gradient!(g, x) = (g[:] = gradient(a, x); g)
    hessian!(h, x) = (h[:,:] = hessian(a, x); h)
    
    # constraint function
    con_c!(c, x) = (c[1] = blen(x) - bondlength; c)

    # Jacobian of constraint, shape [1, length(x)]
    function con_jacobian!(J, x)
        J[1,:] = 0.0
        J[1,I1] = (x[I1]-x[I2])/blen(x)
        J[1,I2] = (x[I2]-x[I1])/blen(x)
    end

    # Hessian contribution of constraint
    # We define auxiliary index arrays _I1 and _I2 that start at 1 to
    # allow the ForwardDiff hessian to be done only on the relevant dofs, x[I1I2]
    _I1 = 1:length(I1)
    _I2 = length(I1)+1:length(I1)+length(I2)
    _blen(x) = norm(x[_I2] - x[_I1])
    _c(x) = _blen(x) - bondlength

    function con_h!(h, x, λ)
        h[I1I2, I1I2] += λ[1] * ForwardDiff.hessian(_c, x[I1I2])
    end

    df = TwiceDifferentiable(ener, gradient!, hessian!, dofs(a))
    lx = Float64[]; ux = Float64[]
    lc = [-1e-2]; uc = [1e-2]
    dfc = TwiceDifferentiableConstraints(con_c!, con_jacobian!, con_h!,
                                         lx, ux, lc, uc)

    # move atoms i and j so constraint is initially satisfied
    X = positions(a)
    u = X[j] - X[i]
    f = 1.0 - bondlength / norm(u)
    X[i] += 0.5 * f * u
    X[j] -= 0.5 * f * u
    set_positions!(a, X)
    x = dofs(a)
    
    res = optimize(df, dfc, x, IPNewton(show_linesearch=false, μ0=:auto), 
                    Optim.Options(show_trace = true,
                                  extended_trace = false,
                                  allow_f_increases = true, 
                                  successive_f_tol = 2))
    set_dofs!(a, res.minimizer)
end
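
A usage sketch, with the same SW silicon setup as the script earlier in the thread:

using ConstrainedOptim
using ForwardDiff
using JuLIP

at = bulk(:Si, cubic=true)
set_constraint!(at, FixedCell(at))
set_calculator!(at, StillingerWeber())
minimize_constrained_bond!(at, 1, 2, 2.4)   # relax with the 1-2 bond held at 2.4 Å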

Meanwhile, @lifelemons, please give this a go for your use case and let me know of any problems.
