Optimise 1d Gauss-Legendre interpolation #216

Merged
johnomotani merged 3 commits into master from optimise-Lagrange-interpolation on Jun 9, 2024

Conversation

johnomotani
Collaborator

Precalculate some quantities to minimise the amount of work needed to evaluate the Lagrange polynomials, and hopefully to make the loops vectorise better. Speeds up interpolation by about 3x in one case that I profiled.

@johnomotani johnomotani force-pushed the optimise-Lagrange-interpolation branch from 1bd7e9a to d989adc on May 21, 2024 07:31
@johnomotani johnomotani force-pushed the optimise-Lagrange-interpolation branch from d989adc to d1459e5 on May 21, 2024 08:01
@mrhardman
Collaborator

mrhardman commented May 21, 2024

Wow, a 3x speed up would be impressive!

What I don't understand, and what I am wondering, is how you knew that the "for loop" version

function lagrange_poly(j,x_nodes,x)
    # get number of nodes
    n = size(x_nodes,1)
    # location where l(x0) = 1
    x0 = x_nodes[j]
    # evaluate polynomial
    poly = 1.0
    for i in 1:j-1
        poly *= (x - x_nodes[i])/(x0 - x_nodes[i])
    end
    for i in j+1:n
        poly *= (x - x_nodes[i])/(x0 - x_nodes[i])
    end
    return poly
end

would be slower than the version using Julia base functions

function lagrange_poly_optimised(other_nodes, one_over_denominator, x)
    return prod(x - n for n ∈ other_nodes) * one_over_denominator
end

In my previous life as a Fortran developer, I would have expected that there would be negligible difference between the two methods for a compiled language -- what am I missing?

Some timing checks would be helpful here to justify the changes (from the comments it looks like you are planning to do this anyway). If this method were generalised to an N-dimensional case, I would be concerned that the memory requirement for storing the "other nodes" for every node might become an issue.
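
A timing check of this kind could be done with BenchmarkTools.jl; the following is only a minimal sketch using the two functions quoted above, with arbitrary random nodes standing in for actual Gauss-Legendre points:

using BenchmarkTools

# for-loop version, as quoted above
function lagrange_poly(j, x_nodes, x)
    n = size(x_nodes, 1)
    x0 = x_nodes[j]
    poly = 1.0
    for i in 1:j-1
        poly *= (x - x_nodes[i]) / (x0 - x_nodes[i])
    end
    for i in j+1:n
        poly *= (x - x_nodes[i]) / (x0 - x_nodes[i])
    end
    return poly
end

# precomputed version, as quoted above
lagrange_poly_optimised(other_nodes, one_over_denominator, x) =
    prod(x - n for n ∈ other_nodes) * one_over_denominator

# placeholder test data (random nodes, not Gauss-Legendre points)
x_nodes = sort(rand(9))
j = 5
other_nodes = x_nodes[setdiff(1:9, j)]
one_over_denominator = 1.0 / prod(x_nodes[j] .- other_nodes)
x = 0.5

@btime lagrange_poly($j, $x_nodes, $x)
@btime lagrange_poly_optimised($other_nodes, $one_over_denominator, $x)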

@johnomotani
Collaborator Author

johnomotani commented May 27, 2024

@mrhardman I don't think it was the switch from the explicit loops to prod that made the speedup. I didn't test the changes separately, but the significant things are:

  • precompute the denominator, which gets rid of about half the arithmetic operations, and division is more expensive than multiplication/addition/subtraction.
  • combine the two loops (for less-than-j and greater-than-j) into a single loop over the "other nodes", which should vectorise better (a sketch of the precomputation follows below).
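
A minimal sketch of what that precomputation can look like (the helper name setup_lagrange_poly is illustrative only, not the actual function in the package):

# Illustrative only: for node j, store the other nodes and 1/denominator once,
# so each later evaluation is a multiply-only loop with no divisions and no
# branch on i == j.
function setup_lagrange_poly(j, x_nodes)
    x0 = x_nodes[j]
    other_nodes = [x_nodes[i] for i in eachindex(x_nodes) if i != j]
    one_over_denominator = 1.0 / prod(x0 - xn for xn in other_nodes)
    return other_nodes, one_over_denominator
end

The per-evaluation cost then matches lagrange_poly_optimised() above: n-1 subtractions and multiplications plus one final multiplication.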

I did also try a for-loop version of lagrange_poly_optimised(), which had essentially identical timing to the one using prod. prod was marginally faster (although the difference could have just been noise), and is more compact to write, so I chose to use that version.

I still think we'll want to do N 1d interpolations rather than one N-dimensional interpolation (although that has its own memory cost for at least one buffer array...), so I'm only anticipating 1d versions of "other nodes". It's true that if we wanted an N-dimensional version, the memory usage for "other nodes" could become a concern!
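
As a purely hypothetical illustration of the "N 1d interpolations" idea for a 2d pdf (this is not the moment_kinetics API: interp_1d! stands in for whichever 1d routine is used, and buffer is the extra pdf-sized array mentioned above):

# Hypothetical sketch: interpolate f[ivpa, ivperp] onto new vpa and vperp grids
# by doing 1d interpolations along each dimension in turn.
# buffer must have size (length(vpa_new), length(vperp_old)),
# result must have size (length(vpa_new), length(vperp_new)).
function interp_2d_via_1d!(result, buffer, f, vpa_old, vpa_new, vperp_old, vperp_new,
                           interp_1d!)
    # pass 1: interpolate every vperp column from the vpa_old grid to vpa_new
    for ivperp in axes(f, 2)
        interp_1d!(view(buffer, :, ivperp), vpa_new, view(f, :, ivperp), vpa_old)
    end
    # pass 2: interpolate every vpa row from the vperp_old grid to vperp_new
    for ivpa in axes(buffer, 1)
        interp_1d!(view(result, ivpa, :), vperp_new, view(buffer, ivpa, :), vperp_old)
    end
    return result
end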

@johnomotani
Collaborator Author

Some timing checks would be helpful here to justify the changes

I did the one test on a case I happened to be running anyway. I was using the profiler (StatProfilerHTML) to measure how much time was spent in different parts of the code, so it's not too convenient to post the results. The speedup (factor of a few) was in line with what I'd expect, so I wasn't planning to do any more timing for this PR.
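
For reference, a profiling run of that kind might look like the following sketch with StatProfilerHTML.jl; run_case() here is just a stand-in for the real simulation driver that was profiled:

using Profile, StatProfilerHTML

# stand-in for the real driver function (hypothetical)
run_case() = sum(sin(x) for x in 1:10^7)

run_case()            # compile first so the profile measures run time, not compilation
@profile run_case()   # collect samples with Julia's built-in statistical profiler
statprofilehtml()     # write an HTML flame-graph report from the collected samples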

I did suggest a replacement for interpolate_2D_vspace!() that I think should be faster, but left it commented out because it would need testing for both correctness and performance, and it needs an additional pdf-sized buffer passed as an argument - I wasn't planning to include that change in this PR unless someone else has time to do the testing.

@mrhardman
Collaborator

I did suggest a replacement for interpolate_2D_vspace!() that I think should be faster, but left it commented out because it would need testing for both correctness and performance, and it needs an additional pdf-sized buffer passed as an argument - I wasn't planning to include that change in this PR unless someone else has time to do the testing.

There is already an existing CI test for the `interpolate_2D_vspace!()` function in the Fokker-Planck tests that you should be able to quickly modify if you want to include this here. You could add a logical flag to the list of tests made at the line `@testset " - test Lagrange-polynomial 2D interpolation" begin`, and then switch between my original interpolation and your optimised version at the lines `interpolate_2D_vspace!(Fe_interp_ion_units,Fe,vpa,vperp,scalefac)` and `interpolate_2D_vspace!(Fi_interp_electron_units,Fi,vpa,vperp,1.0/scalefac)`.

Hope this helps.

Base automatically changed from gausslegendre-1d-interpolation to master June 4, 2024 15:25
@johnomotani johnomotani merged commit 6c09bb0 into master Jun 9, 2024
13 of 16 checks passed
@johnomotani johnomotani deleted the optimise-Lagrange-interpolation branch June 9, 2024 18:30