[multibody] Some speedups for inertia * vector #22287

sherm1 · 2024-12-09T22:49:36Z

Profiling the M-frame Inverse Dynamics in #22253 I found that a substantial fraction of the time was spent in the computation of the inertial force SpatialInertia*SpatialAcceleration. This PR makes some minor changes to the functions involved to make life easier on dumb compilers.

This tiny change speeds up the isolated (kinematics not included) ID-M calculation by 26% when compiled by gcc. It also speeds up the existing isolated ID-W calculation by 10% so is worth checking in separately. Including kinematics + ID the overall speedup is 7% for k + ID-M and 4% for k + ID-W.

This change is

sherm1

+@xuchenhan-tri for both reviews, please

Reviewable status: LGTM missing from assignee xuchenhan-tri(platform)

sherm1 · 2024-12-09T22:57:09Z

cc @amcastro-tri (low hanging fruit!)

xuchenhan-tri

Let's have another pair of eyes look at this if you don't mind.
-(status: single reviewer ok) +@SeanCurtis-TRI for platform if you can squeeze this in for today.

Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: 3 unresolved discussions, LGTM missing from assignee SeanCurtis-TRI(platform)

multibody/tree/rotational_inertia.h line 424 at r1 (raw file):

  // TODO(Mitiguy) Issue #6145, add direct unit test for this method.
  Vector3<T> operator*(const Vector3<T>& w_E) const {
    // Eigen's symmetric multiply can be slow. Do this by hand instead:

BTW, have you considered adding an abstraction for symmetric 3x3 matrix to avoid doing these algebra in the inertia class? Seems like there are a few places in this class where we need to deal with the fact that only the triangular part of the matrix is used.

multibody/tree/rotational_inertia.h line 428 at r1 (raw file):

    //     [b  d  (e)] * [y] = [ bx+dy+ez ]
    //     [c  e   f ]   [z]   [ cx+ey+fz ]
    const T& a = I_SP_E_(0, 0);  // Access only lower triangle.

BTW, I looked at the eigen source code for symmetric matrix vector product and I'm not surprised that this is faster, but could you say, for reference, how much faster is this?

multibody/tree/spatial_inertia.cc line 450 at r1 (raw file):

  return SpatialForce<T>(
      // Rotational
      mass_ * (G_SP_E_ * alpha_WB_E + p_PScm_E_.cross(a_WBo_E)),

BTW, it's not clear to me what the goal of the change here is. Could you explain?

SeanCurtis-TRI

Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: 3 unresolved discussions

multibody/tree/spatial_inertia.cc line 450 at r1 (raw file):

Previously, xuchenhan-tri wrote…

BTW, it's not clear to me what the goal of the change here is. Could you explain?

Are the factors not getting inlined? Or is the factoring of the mass scaling the savings? Or both?

I'd be surprised if it's a question of inlining, because the replaced methods are basically the ideal inlining case. However, this PR does refer to "stupid" compilers. Slightly stupid can't factor, but really stupid can't inline?

xuchenhan-tri · 2024-12-10T00:02:15Z

multibody/tree/spatial_inertia.cc line 450 at r1 (raw file):

Previously, SeanCurtis-TRI (Sean Curtis) wrote…

Are the factors not getting inlined? Or is the factoring of the mass scaling the savings? Or both?

I'd be surprised if it's a question of inlining, because the replaced methods are basically the ideal inlining case. However, this PR does refer to "stupid" compilers. Slightly stupid can't factor, but really stupid can't inline?

Also, the change kind of ruins the easy mapping from code to comment (e.g. p_BoBcmx <-> p_PScm_E_.cross).

sherm1

Thanks for the fast response, gentlemen. Comments addressed below.

Reviewable status: 1 unresolved discussion

multibody/tree/rotational_inertia.h line 424 at r1 (raw file):

Previously, xuchenhan-tri wrote…

BTW, have you considered adding an abstraction for symmetric 3x3 matrix to avoid doing these algebra in the inertia class? Seems like there are a few places in this class where we need to deal with the fact that only the triangular part of the matrix is used.

That's a good idea if there are other cases like this, but out of scope for me at the moment. This is the only case where I actually measured a performance impact. Added a TODO.

multibody/tree/rotational_inertia.h line 428 at r1 (raw file):

Previously, xuchenhan-tri wrote…

BTW, I looked at the eigen source code for symmetric matrix vector product and I'm not surprised that this is faster, but could you say, for reference, how much faster is this?

Yeah, it's terrible! Unfortunately I don't have a way to measure the performance of just this little piece of code. I have a good benchmark for its use within ID (cassiebench) so I know its a win but not by how much.

multibody/tree/spatial_inertia.cc line 450 at r1 (raw file):

not clear to me what the goal of the change here is

The previous code failed to factor out the mass -- it did 3 multiplies to get the weighted com + 6 multiplies to convert the unit inertia to a rotational inertia. Now it waits until we have produced the rotational acceleration vector, where we can scale with just the minimum of 3 multiplies for each of the rotational and translational vectors.

I'd be surprised if it's a question of inlining

Right! This was just suboptimal code that needed a tweak.

The function as a whole is not inlined (defined in cc). I was hoping to squeeze it down to something inlinable

p_BoBcmx <-> p_PScm_E

Good point. I added a comment to clarify. I removed the earlier aliasing to make it easier to see what's happening here numerically. I'm considering reworking this in highway if it is still dominating ID.

xuchenhan-tri

Reviewed 2 of 2 files at r2, all commit messages.
Reviewable status: complete! all discussions resolved, LGTM from assignees SeanCurtis-TRI(platform),xuchenhan-tri(platform)

multibody/tree/spatial_inertia.cc line 450 at r1 (raw file):

Previously, sherm1 (Michael Sherman) wrote…

not clear to me what the goal of the change here is

The previous code failed to factor out the mass -- it did 3 multiplies to get the weighted com + 6 multiplies to convert the unit inertia to a rotational inertia. Now it waits until we have produced the rotational acceleration vector, where we can scale with just the minimum of 3 multiplies for each of the rotational and translational vectors.

I'd be surprised if it's a question of inlining

Right! This was just suboptimal code that needed a tweak.

The function as a whole is not inlined (defined in cc). I was hoping to squeeze it down to something inlinable

p_BoBcmx <-> p_PScm_E

Good point. I added a comment to clarify. I removed the earlier aliasing to make it easier to see what's happening here numerically. I'm considering reworking this in highway if it is still dominating ID.

Ah ok, I counted wrong! Thanks for the explanation.

sherm1 added priority: medium type: performance release notes: none This pull request should not be mentioned in the release notes labels Dec 9, 2024

sherm1 marked this pull request as ready for review December 9, 2024 22:49

sherm1 changed the title ~~Some speedups for inertia * vector~~ [multibody] Some speedups for inertia * vector Dec 9, 2024

sherm1 added the status: single reviewer ok https://drake.mit.edu/reviewable.html label Dec 9, 2024

sherm1 assigned xuchenhan-tri Dec 9, 2024

sherm1 commented Dec 9, 2024

View reviewed changes

xuchenhan-tri removed the status: single reviewer ok https://drake.mit.edu/reviewable.html label Dec 9, 2024

xuchenhan-tri assigned SeanCurtis-TRI Dec 9, 2024

xuchenhan-tri reviewed Dec 9, 2024

View reviewed changes

SeanCurtis-TRI reviewed Dec 9, 2024

View reviewed changes

Some speedups for inertia * vector

12cd6de

sherm1 force-pushed the spatial_inertia_times_vector_speedup branch from 32136dd to 12cd6de Compare December 10, 2024 00:22

sherm1 commented Dec 10, 2024

View reviewed changes

xuchenhan-tri reviewed Dec 10, 2024

View reviewed changes

sherm1 merged commit 4ee5832 into RobotLocomotion:master Dec 10, 2024
9 of 10 checks passed

RussTedrake pushed a commit to RussTedrake/drake that referenced this pull request Dec 15, 2024

Some speedups for inertia * vector (RobotLocomotion#22287)

3e05acc

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[multibody] Some speedups for inertia * vector #22287

[multibody] Some speedups for inertia * vector #22287

sherm1 commented Dec 9, 2024 •

edited

Loading

sherm1 left a comment

sherm1 commented Dec 9, 2024

xuchenhan-tri left a comment

SeanCurtis-TRI left a comment

xuchenhan-tri commented Dec 10, 2024

sherm1 left a comment

xuchenhan-tri left a comment

[multibody] Some speedups for inertia * vector #22287

[multibody] Some speedups for inertia * vector #22287

Conversation

sherm1 commented Dec 9, 2024 • edited Loading

sherm1 left a comment

Choose a reason for hiding this comment

sherm1 commented Dec 9, 2024

xuchenhan-tri left a comment

Choose a reason for hiding this comment

SeanCurtis-TRI left a comment

Choose a reason for hiding this comment

xuchenhan-tri commented Dec 10, 2024

sherm1 left a comment

Choose a reason for hiding this comment

xuchenhan-tri left a comment

Choose a reason for hiding this comment

sherm1 commented Dec 9, 2024 •

edited

Loading