Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

State and Line Performance Improvements #269

Merged

Conversation

AlexWKinley
Copy link
Contributor

This PR contains changes to a few time integrator and to lines with the goal of not changing the simulation is any meaningful way, while providing performance improvements.

Performance Results

Overall, for models of lines, there is roughly a 2x performance improvement.
Currently, the integrator performance improvements are mostly specific to RK2 and RK4.

RK2
RK4

line_performance_plots

Integrator / State Changes

The primary changes to the integrator and state code are to avoid memory allocations.

The state code that allows writing

r[1] = r[0] + rd[0] * (0.5 * dt);

is very convenient, but its current implementation means that every operation on a state allocates memory to create an entire copy of that state. That basic $y_{n+1} = y_n + dy * \Delta t$ line allocates memory to store the result of $dy * \Delta t$, and then the result of the addition, and even an additional allocation for the assignment.

My current solution to this is the new butcher_row function that can do these sorts of computations in place, without allocation any additional memory. Named as such because it does the math for a single row in a Butcher tableau.
With this function we get

// r[1] = r[0] + rd[0] * (0.5 * dt);
butcher_row<1>(r[1], r[0], { 0.5 * dt }, { &rd[0] });

Definitely not a clear and concise as the more mathematical notation, but with significant performance improvements.
Because of the additional complexity, I've only switch the RK2 and RK4 integrator to use this function. But I can certainly expand its usage if desired.

The other change to states I've made is to remove the empty constructor and destructor from StateVar and StateVarDeriv. Because of the rule of three/five defining a destructor (even an empty one), causes an expression like r[1] = r[0] + rd[0] to allocate twice, even though the result of r[0] + rd[0] can be directly assigned to r[1]. Removing the empty destructor allows the compiler to avoid that second allocation.

This should help other integrators even if they're not using butcher_row, but not as significantly as using butcher_row.

Line Changes

The first type of changes to lines are again to avoid memory allocations.

Changing line->getStateDeriv() to take references to the node velocities and accelerations avoids having to allocate memory for those results every time.
Also changing Line::setState(const std::vector<vec>& pos, const std::vector<vec>& vel) to take vector references avoids it allocation copies of the position and velocities vectors.

The other kind of change to lines are my attempts at simplifying the code while improving performance.
The biggest thing is getting rid of many of the

if (i == 0) 
    ....
else if (i == N)
   ....
else
   ....

by calculating the values that depend on whether it's an internal or external node once, allowing for them to be reused, and making the code nicer to read.

Misc

To improve consistency of the benchmarks, and avoid filling up the console with simulation times, I added MoorDyn::SetDisableOutput to disable some of the console and file output.

Logistical Notes

Feel free to share any thoughts/questions you have.
I know that those at NREL are doing some work from their side that could potentially create some conflicts with some of these changes. I'm happy to delay merging this and rebasing on top of whatever those changes may be myself if that would be easiest.

@sanguinariojoe
Copy link
Collaborator

sanguinariojoe commented Nov 13, 2024 via email

@AlexWKinley
Copy link
Contributor Author

@RyanDavies19
Just a friendly ping in case you haven't seen this.
I'm in no particular rush for this to be merged, but I do want to make sure that if I'm creating any conflicts with work you're doing, we coordinate on making sure those can be resolved.

source/State.hpp Outdated Show resolved Hide resolved
source/MoorDyn2.hpp Show resolved Hide resolved
source/State.hpp Show resolved Hide resolved
@AlexWKinley
Copy link
Contributor Author

From my vantage point, I'm good with this being merged whenever. But I'm also happy to leave this open if you'd prefer to merge it after other work, or if we want to wait for Jose to take a look.

Thanks for all the work you've been doing on MoorDyn @RyanDavies19.

@RyanDavies19
Copy link
Collaborator

RyanDavies19 commented Nov 26, 2024

This is good to merge on my end! Lets wait to see if @sanguinariojoe has anything to add.

@sanguinariojoe
Copy link
Collaborator

This is good to go with me. As a minor comment, it would be great to have the += operator overloaded to replace some calls. e.g.:

butcher_row<1>(r[0], r[0], { dt }, { &rd[1] });

@AlexWKinley
Copy link
Contributor Author

@sanguinariojoe

I agree it could be nice to better integrate with operator overloading. With how things currently work butcher_row<1>(r[0], r[0], { dt }, { &rd[1] }); would still be more performant than r[0] += rd[1] * dt. Since rd[1] * dt would itself allocate a new state to store the scaled derivatives in.

My future ideal would be for MoorDynState and MoorDynDState to subclass or internally wrap Eigen::ArrayXd. Because Eigen uses expression templates, things like r[0] = r[0] + (rd[0] + rd[3]) * (dt / 6.0) + (rd[1] + rd[2]) * (dt / 3.0); would avoid allocating and basically compile to the equivalent loop expression. But there's some complexity in terms of then being able to get individual object states, and the internal state code would get more complicated overall. So I've held off on attempting that, especially since there are other changes/additions to states in the pipeline from Ryan's work.

That being said, it's not hard to implement the operator+=, and if it would be useful I'm happy to add it. I just don't think any code currently uses it, since it's not currently defined.

@sanguinariojoe
Copy link
Collaborator

sanguinariojoe commented Dec 2, 2024 via email

@RyanDavies19
Copy link
Collaborator

Let's merge this as in then for now if it works for you @AlexWKinley. One last thing before that, can we change conn to point in the butcher_row function (see review comment above)?

@AlexWKinley
Copy link
Contributor Author

Oops, I somehow missed that review, my apologies. Should be fixed now. I'm good for this to be merged.

@RyanDavies19 RyanDavies19 merged commit eeb9bc3 into FloatingArrayDesign:dev Dec 4, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants