Skip to content

Commit

Permalink
update inertia estimation after reading recent Wensing papers
Browse files Browse the repository at this point in the history
  • Loading branch information
RussTedrake committed Dec 3, 2023
1 parent 4f65971 commit a930b2c
Showing 1 changed file with 168 additions and 68 deletions.
236 changes: 168 additions & 68 deletions sysid.html
Original file line number Diff line number Diff line change
Expand Up @@ -112,10 +112,10 @@ <h1><a href="index.html" style="text-decoration:none;">Underactuated Robotics</a
least-squares objective to capture the errors; more generally we will look
at fitting stochastic models to the data.</p>

<p>We often separate the identification procedure into two parts, were we
first estimate the state $\hat{\bx}_n$ given the input-output data $\bu_n,
\by_n$, and then focus on estimating the state-evolution dynamics in a
second step. The dynamics estimation algorithms fall into two main
<p>We often separate the identification procedure into two parts, in which
we first estimate the state $\hat{\bx}_n$ given the input-output data
$\bu_n, \by_n$, and then focus on estimating the state-evolution dynamics
in a second step. The dynamics estimation algorithms fall into two main
categories: <ul><li><i>Equation error</i> minimizes only the one-step
prediction error: $$\min_{\alpha} \sum_{n=0}^{N-2} \| f_\balpha(\hat\bx_n,
\bu_n) - \hat{\bx}_{n+1} \|^2_2.$$ </li><li><i>Simulation error</i>
Expand Down Expand Up @@ -257,13 +257,12 @@ <h1><a href="index.html" style="text-decoration:none;">Underactuated Robotics</a
offsets. This one can be a real nuisance in practice. Joint sensors can
slip, and some robots even use relative rotary encoders, and rely on
driving the joint to some known hard joint limit each time the robot is
powered on in order to obtain the offset. I've worked on one humanoid
powered on in order to obtain the offset. I've worked on one advanced
robot that had a quite elaborate and painful kinematic calibration
procedure which involve fitting additional hardware over the joints to
ensure they were in a known location and then running a script. Having a
an expensive and/or unreliable calibration procedure can put a damper on
any robotics project. For underactuated systems, in particular, it can
have a dramatic effect on performance.</p>
any robotics project.</p>

<example><h1>Acrobot balancing with calibration error</h1>

Expand Down Expand Up @@ -303,20 +302,156 @@ <h1><a href="index.html" style="text-decoration:none;">Underactuated Robotics</a

</subsection>

<subsection><h1>Least-squares formulation (of the inverse dynamics).</h1>
<subsection><h1>Estimating inertial parameters (and friction)</h1>

<p>Now let's thinking about estimating the dynamic parameters of
<p>Now let's thinking about estimating the inertial parameters of
multibody system. We've been writing the manipulation equations in the
form: \begin{equation}\bM({\bq})\ddot{\bq} + \bC(\bq,\dot{\bq})\dot\bq =
\btau_g(\bq) + \bB\bu + \text{friction, etc.}\end{equation} Each of the
terms in this equation can depend on the parameters $\balpha$ that we're
trying to estimate. But the parameters enter the multibody equations in
a particular structured way: the equations are
<i>affine in the <b>lumped parameters</b></i>. More precisely, the
manipulator equations above can be factored into the form $${\bf
W}(\bq,\dot{\bq}, \ddot{\bq}, \bu) \balpha_l(\balpha) +
\bw_0(\bq,\dot{\bq}, \ddot{\bq}, \bu) = 0,$$ where $\balpha_l$ are the
"lumped parameters". We sometimes refer to ${\bf W}$ as the "data matrix".</p>
a particular structured way: <i>the <b>inverse</b> dynamics equations are
affine in the inertial parameters</i> <elib>Atkeson86</elib>. Notably,
the forward dynamics (solving for $\ddot{\bq}$) involves a ${\bf
M}^{-1}(q)$; these dynamics are <i>not</i> affine in the parameters.
Generic system identification approaches which treat the multibody
dynamics as $\bx[n+1] = f(\bx[n], \bu[n])$ will not be leveraging this
convexity directly.</p>

<p>To proceed with the inverse dynamics formulation, let us first choose
a parameterization for the inertias. <elib>Wensing17a</elib> shows that
the natural parameterization of the inertial terms for a particular rigid
body, $B$, are in terms of the <i>density-weighted moment matrices</i>:
\begin{gather*} m_B &=& \iiint_{V_B} \rho(\bx)dV \in \Re \\ {\bf h}_B &=&
\iiint_{V_B} \bx \rho(\bx) dV \in \Re^3 \\ {\bf \Sigma}_B &=&
\iiint_{V_B} \bx \bx^T \rho(\bx)dV \in S_3, \end{gather*} where $V_B$ is
the volume of the geometry in body $B$, the integral notation is for the
3D volume integral, and $S_3$ is the group of 3x3 (real) symmetric
matrices. $m_B$ is the total mass of the body, ${\bf h}_B$ is simply the
mass times the position of the center of mass, and ${\bf \Sigma}_B$ is
closely related to the more standard form of the inertia matrix. With the
integral using $x$ represented in the body frame, $B$, we can extract
${}^B{\bf p}^{B_{com}} = \frac{1}{m}{\bf h}_B$ and ${\bf I}^{B/Bo} =
\tr({\bf \Sigma}_B)\mathbb{I}_{3 \times 3} - {\bf \Sigma}_B.$ Since ${\bf
\Sigma}_B$ is symmetric, we can summarize all of the inertial parameters
for body $B$ in 10 real numbers: $$\balpha_B = \begin{bmatrix} m_B, {\bf
h}_B^T, \mathrm{vec}_{\mathrm{sym}}({\bf \Sigma}_B) \end{bmatrix} \in
\Re^{10},$$ where $\mathrm{vec}_{\mathrm{sym}}$ takes the 6 upper
triangular elements of the 3x3 symmetric matrices into a vector.</p>

<p>Remarkably, the inverse dynamics are affine in these parameters. This
means that the manipulator equations above can be refactored into the
form \begin{equation}{\bf W}(\bq_n,\dot{\bq}_n, \ddot{\bq}_n, \bu_n)
\balpha_B + \bw_0(\bq_n,\dot{\bq}_n, \ddot{\bq}_n, \bu_n) = 0.
\label{eq:least_squares}\end{equation} We sometimes refer to the ${\bf
W}$ which is assembled by vertically stacking these equations evaluated
at all time samples, $n$, as the "data matrix". Given data trajectories
of positions, $\bq_n,$ and torque measurements, $\bu_n,$ we can estimate
$\dot{\bq}_n$ and $\ddot{\bq}_n$ with (acausal) filtering. Then we can
estimate the parameters using least-squares estimation. As such, we can
leverage all of the strong results and variants from linear estimation.
For instance, we can add terms to regularize the estimate (e.g. to stay
close to an initial guess), and we can write efficient recursive
estimators for optimal online estimation of the parameters using
recursive least-squares. My favorite recursive least-squares algorithm
uses incremental QR factorization<elib>Kaess08</elib>.</p>

<p>Not all values of $\alpha_B$ describe physically-valid inertias. For
instance, we know that $m_B$ has to be positive. There are constraints on
the inertia matrix itself. <elib>Wensing17a</elib> showed that these
constraints are exactly transcribed as positive-definiteness of the
"pseudo-inertia": $${\bf J}(\balpha_B) := \begin{bmatrix} {\bf \Sigma}_B
& {\bf h}_B \\ {\bf h}_B^T & m_B \end{bmatrix} \succ 0.$$ If we know the
geometry of the body, $B$, that we are estimating, then we can optionally
write additional convex constraints on these parameters. For instance, we
know that the center of mass must lie inside the convex hull of the
geometry: $${}^B{\bf p}^{B_{com}} \in \mathrm{conv}(V_B) \Rightarrow {\bf
h}_B \in m_B \, \mathrm{conv}(V_B).$$ This non-negative scaling of a
convex set is still a convex set (known as the "perspective" of the set).
<elib>Wensing17a</elib> also gives a convex constraint on the
second-moment matrix based on e.g. the minimum-volume Loewner-John
ellipsoid containing the known geometry.</p>

<p>Do these additional constraints matter? In the limit of infinite
data, then one might hope that the least-squares estimator would obtain
the correct values without the constraints. But their are two foils to
this ideal. First, we often wish to operate in a relatively low-data
regime. Second, it is important to understand that <i>not all of the
inertial parameters will be identifiable.</i> For instance, if you
imagine a robotic arm bolted to a table where the first link is a
rotation about the $z$ axis. Then no amount of data obtained from this
robot will given any information about the moments of inertia of the
first link about the $x$ and $y$ axes. Naively including these parameters
in the estimation problem could not only lead to poor estimates of the
unidentifiable parameters, but can even render the physical-validity
constraints on the inertial parameters vacuous.</p>

<todo>example of non-identifiable parameters making the physical-validity
constraints vacuous.</todo>

<p>There are two general approaches to dealing with the unidentifiable
parameters. First, we can exclude them from the optimization. There are a
number of algorithms for symbolically extracting the identifiable
parameters, but it is more common (and likely just as effective) to
simply evaluate the null-space of a sufficiently rich data matrix, $W$,
to find the unidentifiable parameters by using, e.g., $\bR_1\alpha_l$
from the QR factorization: $${\bf W} = \begin{bmatrix} \bQ_1 & \bQ_2
\end{bmatrix} \begin{bmatrix} \bR_1 \\ 0 \end{bmatrix}, \quad \Rightarrow
\quad {\bf W}\balpha_B = \bQ_1 (\bR_1 \balpha_B) = \bQ_1
\bar{\balpha}_{B},$$ where $\bar{\balpha}_B$ are the identifiable
parameters. To implement the physical-validity constraints, we would need
to use estimates of the unidentifiable parameters. Alternatively, one
can formulate the optimization directly on the full parameters and add a
regularization term which (ideally) operates in the nullspace of the
primary objective to drive the unidentifiable parameters towards a prior
estimate. <elib>Lee19</elib> gives a thorough treatment of this approach
and provides a convex formulation of a geometrically consistent
regularizer, which they call the <i>entropic divergence.</i></p>

<todo>I sent this note to Pat Wensing: "I feel that there is some
subtlety in combining the regularizer with the constrained optimization
formulation. Apologies if you addressed this and I missed it! The
intuitive hope for the regularizer is roughly that it's operating in the
null space of the primary objective to handle the unidentifiable
parameters. But having the constraints complicates that story quite a
bit, I think? For instance, what's to stop the optimization from making
the triangle inequalities almost vacuous in the non-identifiable
parameters by paying only the small cost of the regularization
term?"</todo>

<p>The least-squares inverse-dynamics formulation also allows one to
simultaneously fit a number of other relevant dynamic parameters which
enter linearly into the equations. These include reflected inertial (as a
diagonal matrix times $\ddot{\bq}$), joint friction (as a constant torque
multiplied by the sign of $\dot{\bq}$), and damping (as a linear multiple
of $\dot{\bq}$).</p>

<p>Allow me to mention one subtle point about numerical scaling. All of
the equations in the least-squares inverse-dynamics formulation
(\ref{eq:least_squares}) have units of torque. But the torques used on
the base link of a robotic arm might be numerically much larger than the
torques corresponding to the more distal joints. <elib>Lee19</elib>
addresses this by solving the unconstrained least-squares problem once,
then taking the error covariance of this solution into the objective
function for the constrained least-squares optimization. This serves to
make the least-squares objective coordinate independent.</p>

<example><h1>Inertial identification for a Kuka iiwa</h1>

</example>

</subsection>

<subsection><h1>Simultaneous kinematic and inertial identification via
lumped parameters.</h1>

<p>Although it is highly recommended to separate kinematic an inertial
estimation, some amount of joint estimation is possible when needed. We
know that the inverse dynamics are affine in the inertial parameters. But
it turns out the can also be written as affine in the "lumped
parameters": nonlinear combinations of inertial and kinematic which do
not depend on the data. I feel this is best explained with a few simple
examples.</p>

<example><h1>Lumped parameters for the simple pendulum</h1>

Expand All @@ -335,18 +470,12 @@ <h1><a href="index.html" style="text-decoration:none;">Underactuated Robotics</a
For instance, if I had terms in the equations of the form, $\sin(m
\theta)$, then I would <i>not</i> be able to produce an affine
decomposition separating $m$ from $\theta$. Fortunately, that doesn't
happen in our mechanical systems <elib>Khalil04</elib>. Furthermore,
this structure is particular to the <i>inverse dynamics</i>, as we have
happen in our mechanical systems <elib>Khalil04</elib>. Again, this
structure is particular to the <i>inverse dynamics</i>, as we have
written here. If you were to write the forward dynamics, multiplying by
$\bM^{-1}$ in order to solve for $\ddot{\bq}$, then once again you would
destroy this affine structure.</p>

<p>This is super interesting! It is tempting to thing about parameter
estimation for general dynamical systems in our standard state-space
form: $\bx[n+1] = f_\balpha(\bx[n], \bu[n]).$ But for multibody systems,
it seems that this would be the wrong thing to do, as it destroys this
beautiful affine structure.</p>

<example><h1>Multibody parameters in <drake></drake></h1>

<p>Very few robotics simulators have any way for you to access the
Expand Down Expand Up @@ -383,29 +512,12 @@ <h1><a href="index.html" style="text-decoration:none;">Underactuated Robotics</a

<p>The existence of the lumped-parameter decomposition reveals that the
<i>equation error</i> for lumped-parameter estimation, with the error
taken in the torque coordinates, can be solved using least squares. As
such, we can leverage all of the strong results and variants from linear
estimation. For instance, we can add terms to regularize the estimate
(e.g. to stay close to an initial guess), and we can write efficient
recursive estimators for optimal online estimation of the parameters
using recursive least-squares. My favorite recursive least-squares
algorithm uses incremental QR factorization<elib>Kaess08</elib>.</p>

<p>Importantly, because we have reduced this to a least-squares problem,
we can also understand when it will <i>not</i> work. In particular, it
is quite possible that some parameters cannot be estimated from any
amount of joint data taken on the robot. As a simple example, consider a
robotic arm bolted to a table; the inertial parameters of the first link
of the robot will not be identifiable from any amount of joint data. Even
on the second link, only the inertia relative to the first joint axis
will be identifiable; the inertial parameters corresponding to the other
dimensions will not. In our least-squares formulation, this is quite
easy to understand: we simply check the (column) rank of the data matrix,
${\bf W}$. In particular, we can extract the <b>identifiable lumped
parameters</b> by using, e.g., $\bR_1\alpha_l$ from the QR factorization:
$${\bf W} = \begin{bmatrix} \bQ_1 & \bQ_2 \end{bmatrix} \begin{bmatrix}
\bR_1 \\ 0 \end{bmatrix}, \quad \Rightarrow \quad {\bf W}\balpha_l =
\bQ_1 (\bR_1 \alpha_l).$$</p>
taken in the torque coordinates, can again be solved using least squares.
Importantly, because we have reduced this to a least-squares problem, we
can also understand when it will <i>not</i> work. Once again, not all of
the lumped-parameters are identifiable, so we will typically need a
second step to operate on the identifiable lumped parameters or to add a
regularization term.</p>

<example><h1>Parameter estimation for the Cart-Pole</h1>

Expand Down Expand Up @@ -438,25 +550,9 @@ <h1><a href="index.html" style="text-decoration:none;">Underactuated Robotics</a
parameters are precisely the subset of the lumped parameters that we
need.</p>

<p>For practical reasons, it might be convenient to take your estimates
of the lumped parameters, and try to back out the original parameters
(for instance, if you want to write them back out into a URDF file). For
this, I would recommend a final post-processing step that e.g. finds the
parameters $\hat{\balpha}$ that are as close as possible (e.g. in the
least-squares sense) to your original guess for the parameters, subject
to the nonlinear constraint that $\bR_1 \balpha_l(\hat{\balpha})$ matches
the estimated identifiable lumped parameters.
</p>

<p>There are still a few subtleties worth considering, such as how we
parameterize the inertial matrices. Direct estimation of the naive
parameterization, the six entries of a symmetric 3x3 matrix, can lead to
non-physical inertial matrices. <elib>Wensing17a</elib>
describes a parameter estimation formulation that includes a convex
formulation of the physicality constraints between these parameters.</p>

</subsection>


<subsection><h1>Identification using energy instead of inverse
dynamics.</h1>

Expand All @@ -481,13 +577,17 @@ <h1><a href="index.html" style="text-decoration:none;">Underactuated Robotics</a
velocities, but that joint accelerations, often obtained by numerically
differentiating <i>twice</i>, tend to be much more noisy. In some cases,
it might be numerically better to apply finite differences to the total
energy of the system rather than to the individual joints <sup>&dagger;</sup>.
<sidenote><sup>&dagger;</sup> Sometimes these methods are written as the numerical
integration of the power input, rather than the differentiation of the
total energy, but the numerics should be no different.</sidenote>
energy of the system rather than to the individual joints
<sup>&dagger;</sup>.
<sidenote><sup>&dagger;</sup> Sometimes these methods are written as the
numerical integration of the power input, rather than the differentiation
of the total energy, but the numerics should be no different.</sidenote>
<elib>Gautier96</elib> provides a study of various filtering formulations
which leads to a recommendation in <elib>Gautier97</elib>
that the energy formulation tends to be numerically superior.</p>
that the energy formulation tends to be numerically superior. This
distinction is probably more important in online parameter estimation,
where the filters must be causal, as opposed to offline parameter
estimation where acausal filters can be used.</p>

</subsection>

Expand Down

0 comments on commit a930b2c

Please sign in to comment.