State and Line Performance Improvements #269

AlexWKinley · 2024-11-06T20:18:09Z

This PR contains changes to a few time integrators and to lines, with the goal of providing performance improvements without changing the simulation in any meaningful way.

Performance Results

Overall, for models of lines, there is roughly a 2x performance improvement.
Currently, the integrator performance improvements are mostly specific to RK2 and RK4.

Integrator / State Changes

The primary changes to the integrator and state code are to avoid memory allocations.

The state code that allows writing

r[1] = r[0] + rd[0] * (0.5 * dt);

is very convenient, but its current implementation means that every operation on a state allocates memory to create an entire copy of that state. That basic $y_{n+1} = y_n + dy * \Delta t$ line allocates memory to store the result of $dy * \Delta t$, and then the result of the addition, and even an additional allocation for the assignment.

My current solution to this is the new butcher_row function, which can do these sorts of computations in place, without allocating any additional memory. It is named as such because it does the math for a single row in a Butcher tableau.
With this function we get

// r[1] = r[0] + rd[0] * (0.5 * dt);
butcher_row<1>(r[1], r[0], { 0.5 * dt }, { &rd[0] });

Definitely not as clear and concise as the more mathematical notation, but with significant performance improvements.
Because of the additional complexity, I've only switched the RK2 and RK4 integrators to use this function. But I can certainly expand its usage if desired.
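To illustrate the idea, here is a minimal sketch of an in-place Butcher-row update. It assumes a flat `std::vector<double>` state, whereas the real MoorDyn state types are more structured; the name `butcher_row_sketch` and its signature are illustrative only and are not the implementation in this PR.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical flat state type (assumption); MoorDyn's state classes hold
// per-object data rather than a single vector.
using State = std::vector<double>;

// Computes out = start + sum_k scales[k] * (*derivs[k]), element by element,
// writing straight into `out` so no temporary states are created.
// `out` may alias `start`; all states must have the same size.
template<std::size_t N>
void butcher_row_sketch(State& out,
                        const State& start,
                        const std::array<double, N>& scales,
                        const std::array<const State*, N>& derivs)
{
    assert(out.size() == start.size());
    for (std::size_t i = 0; i < out.size(); ++i) {
        double value = start[i];
        for (std::size_t k = 0; k < N; ++k)
            value += scales[k] * (*derivs[k])[i];
        out[i] = value;
    }
}
```

With this sketch, `butcher_row_sketch<1>(r1, r0, { 0.5 * dt }, { &rd0 });` plays the role of `r1 = r0 + rd0 * (0.5 * dt)`, but the only memory touched is the memory the three states already own.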

The other change to states I've made is to remove the empty constructor and destructor from StateVar and StateVarDeriv. Because of the rule of three/five, defining a destructor (even an empty one) causes an expression like r[1] = r[0] + rd[0] to allocate twice, even though the result of r[0] + rd[0] can be directly assigned to r[1]. Removing the empty destructor allows the compiler to avoid that second allocation.

This should help other integrators even if they're not using butcher_row, but not as significantly as using butcher_row.
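The effect can be reproduced outside MoorDyn with a small sketch. The two structs below are hypothetical stand-ins for a state class, not the real StateVar; the helper checks whether assigning from a temporary takes over the source buffer (a move) or duplicates it (a copy, which allocates).

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in (assumption): state class with an empty destructor.
struct WithDtor {
    std::vector<double> data;
    ~WithDtor() {}  // user-declared, so no implicit move operations exist
};

// Hypothetical stand-in (assumption): same class with no destructor declared.
struct NoDtor {
    std::vector<double> data;  // implicit copy and move operations are generated
};

// Returns true if assigning from an rvalue reused the source's buffer
// instead of allocating a new one and copying into it.
template<typename T>
bool assignment_steals_buffer()
{
    T src;
    src.data.assign(1000, 1.0);
    const double* before = src.data.data();
    T dst;
    dst = std::move(src);  // falls back to copy assignment for WithDtor
    return dst.data.data() == before;
}
```

For `WithDtor` the assignment falls back to the copy assignment operator, which is the extra allocation described above; for `NoDtor` the vector's buffer is simply handed over.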

Line Changes

The first type of change to lines is again to avoid memory allocations.

Changing line->getStateDeriv() to take references to the node velocities and accelerations avoids having to allocate memory for those results every time.
Also, changing Line::setState(const std::vector<vec>& pos, const std::vector<vec>& vel) to take vector references avoids allocating copies of the position and velocity vectors.
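As a rough sketch of the difference, compare a function that returns a fresh vector with one that fills a caller-owned vector. `Vec3`, `scaled_copy`, and `scaled_into` are hypothetical names used only for illustration; MoorDyn's `vec` is an Eigen type and the real signatures differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 3-component vector (assumption); not MoorDyn's `vec`.
struct Vec3 { double x, y, z; };

// Allocating form: a new vector is built and returned on every call.
std::vector<Vec3> scaled_copy(const std::vector<Vec3>& in, double s)
{
    std::vector<Vec3> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = { in[i].x * s, in[i].y * s, in[i].z * s };
    return out;
}

// Reference form: the caller owns `out`, so its buffer is reused across calls.
void scaled_into(const std::vector<Vec3>& in, double s, std::vector<Vec3>& out)
{
    out.resize(in.size());  // no reallocation once the size already matches
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = { in[i].x * s, in[i].y * s, in[i].z * s };
}
```

When `scaled_into` is called every time step with the same `out`, the allocation happens once on the first call and never again, which is the saving the reference-taking signatures are after.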

The other kind of change to lines is my attempt at simplifying the code while improving performance.
The biggest thing is getting rid of many of the

if (i == 0)
    ....
else if (i == N)
    ....
else
    ....

by calculating the values that depend on whether it's an internal or external node once, allowing for them to be reused, and making the code nicer to read.
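One possible way to remove this kind of branching, shown here on a made-up quantity (lumped node mass from segment masses), is to loop over segments and let each one contribute to both of its end nodes. This is only a sketch of the general pattern under that assumption; the function names are hypothetical and it is not necessarily how Line.cpp is restructured in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Branch-per-node form: N segments, nodes 0..N, end nodes treated specially.
std::vector<double> node_mass_branchy(const std::vector<double>& seg_mass)
{
    const std::size_t N = seg_mass.size();
    assert(N >= 1);
    std::vector<double> m(N + 1);
    for (std::size_t i = 0; i <= N; ++i) {
        if (i == 0)
            m[i] = 0.5 * seg_mass[0];
        else if (i == N)
            m[i] = 0.5 * seg_mass[N - 1];
        else
            m[i] = 0.5 * (seg_mass[i - 1] + seg_mass[i]);
    }
    return m;
}

// Branch-free form: each segment gives half of its mass to each end node,
// so end nodes naturally receive a single contribution.
void node_mass_accumulate(const std::vector<double>& seg_mass,
                          std::vector<double>& m)
{
    const std::size_t N = seg_mass.size();
    m.assign(N + 1, 0.0);
    for (std::size_t s = 0; s < N; ++s) {
        const double half = 0.5 * seg_mass[s];
        m[s] += half;
        m[s + 1] += half;
    }
}
```

Both forms give the same node values up to floating-point rounding, but the second has no special cases inside the loop and writes into a caller-owned vector.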

Misc

To improve consistency of the benchmarks, and avoid filling up the console with simulation times, I added MoorDyn::SetDisableOutput to disable some of the console and file output.

Logistical Notes

Feel free to share any thoughts/questions you have.
I know that those at NREL are doing some work from their side that could potentially create some conflicts with some of these changes. I'm happy to delay merging this and to rebase on top of whatever those changes may be myself, if that would be easiest.

sanguinariojoe · 2024-11-13T08:41:26Z

Hey! I am out of the office. I can give feedback by the end of the month.


AlexWKinley · 2024-11-25T14:46:36Z

@RyanDavies19
Just a friendly ping in case you haven't seen this.
I'm in no particular rush for this to be merged, but I do want to make sure that if I'm creating any conflicts with work you're doing, we coordinate on making sure those can be resolved.


AlexWKinley · 2024-11-26T16:48:59Z

From my vantage point, I'm good with this being merged whenever. But I'm also happy to leave this open if you'd prefer to merge it after other work, or if we want to wait for Jose to take a look.

Thanks for all the work you've been doing on MoorDyn @RyanDavies19.

RyanDavies19 · 2024-11-26T16:57:06Z

This is good to merge on my end! Let's wait to see if @sanguinariojoe has anything to add.

sanguinariojoe · 2024-11-27T11:00:24Z

This is good to go with me. As a minor comment, it would be great to have the += operator overloaded to replace some calls. e.g.:

butcher_row<1>(r[0], r[0], { dt }, { &rd[1] });

AlexWKinley · 2024-12-02T14:07:50Z

@sanguinariojoe

I agree it could be nice to better integrate with operator overloading. With how things currently work, butcher_row<1>(r[0], r[0], { dt }, { &rd[1] }); would still be more performant than r[0] += rd[1] * dt, since rd[1] * dt would itself allocate a new state to store the scaled derivatives in.

My future ideal would be for MoorDynState and MoorDynDState to subclass or internally wrap Eigen::ArrayXd. Because Eigen uses expression templates, things like r[0] = r[0] + (rd[0] + rd[3]) * (dt / 6.0) + (rd[1] + rd[2]) * (dt / 3.0); would avoid allocating and basically compile to the equivalent loop expression. But there's some complexity in terms of then being able to get individual object states, and the internal state code would get more complicated overall. So I've held off on attempting that, especially since there are other changes/additions to states in the pipeline from Ryan's work.

That being said, it's not hard to implement the operator+=, and if it would be useful I'm happy to add it. I just don't think any code currently uses it, since it's not currently defined.
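For reference, here is a minimal sketch of how an `operator+=` could avoid the temporary created by `rd[1] * dt`: the multiplication returns a lightweight proxy instead of a scaled copy, and `+=` consumes it in a single loop. This is a hand-rolled, much reduced version of the expression-template idea mentioned above; `StateS`, `DState`, and `Scaled` are hypothetical types, not MoorDyn or Eigen classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical derivative type (assumption).
struct DState { std::vector<double> v; };

// Lightweight view of "derivative times scalar": a pointer and a factor.
// It must be consumed within the same full expression that created it.
struct Scaled { const DState* d; double s; };

// Builds the proxy instead of a scaled copy; nothing is allocated here.
inline Scaled operator*(const DState& d, double s) { return { &d, s }; }

// Hypothetical state type (assumption).
struct StateS {
    std::vector<double> v;

    // In-place update: v[i] += s * d[i], with no temporary state.
    StateS& operator+=(const Scaled& rhs)
    {
        assert(v.size() == rhs.d->v.size());
        for (std::size_t i = 0; i < v.size(); ++i)
            v[i] += rhs.s * rhs.d->v[i];
        return *this;
    }
};
```

With these types, `r += rd * dt;` reads like the mathematical notation but performs the same single pass over the data as the butcher_row call.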

sanguinariojoe · 2024-12-02T18:01:00Z

Up to you, Alex. As I said, this can be merged as it is. You seem to have a plan, so I am totally OK with your design decisions. I personally tend to avoid allocations by leaning heavily on += and *=, sometimes preallocating memory for + and *, and overloading = as well to take advantage of it. But I have not dived into the Eigen way as much as you have, so I trust you. On top of that, you are usually the winning horse.


RyanDavies19 · 2024-12-03T22:04:49Z

Let's merge this as is for now then, if it works for you @AlexWKinley. One last thing before that: can we change conn to point in the butcher_row function (see review comment above)?

AlexWKinley · 2024-12-04T14:04:32Z

Oops, I somehow missed that review, my apologies. Should be fixed now. I'm good for this to be merged.

AlexWKinley added 3 commits November 5, 2024 12:51

setup benchmark

3252969

butcher_row performance optimization

7fdc3c7

Additional line performance improvements

aaadbc8

RyanDavies19 reviewed Nov 25, 2024

source/State.hpp (outdated, resolved)

source/MoorDyn2.hpp (resolved)

source/State.hpp (resolved)

docs: add disable output to the docs and options list

0f2afd5

RyanDavies19 mentioned this pull request Nov 26, 2024

docs: add disable output to the docs and options list KelsonMarine/MoorDyn_Public#3

Merged

RyanDavies19 and others added 3 commits November 26, 2024 08:50

fix: removed case sensitivity in the options list keywords

0ac1e8e

feat: adds disableOutTime to turn off timestep printing to the console. Useful for the MATLAB wrapper

327c2be

Merge pull request #3 from RyanDavies19/line_perf_improvement

374ab6e

docs: add disable output to the docs and options list

RyanDavies19 added the enhancement label Nov 26, 2024

cleanup: conn -> point

198079b

RyanDavies19 merged commit eeb9bc3 into FloatingArrayDesign:dev Dec 4, 2024
9 checks passed

RyanDavies19 mentioned this pull request Jan 14, 2025

WIP: Viscoelastic modeling and VIV for Lines #290

Draft

7 tasks
