2 Multi-scale Procrustes shape control

In this section we present a shape control strategy that, for each gripper \(g=1,...,G\), defines a 6-DoF gripper control action expressed as a rigid transform \(\mathbf{U}_g\in SE(3)\) (composed of a translation action \(\mathbf{u}_g\in\mathbb{R}^3\) and a rotation action expressed in Euler angles \(\omega\in\mathbb{R}^3\)).

2.1 Procrustes operator and shape error metric

Functional maps (Melzi et al., 2019), based on the spectral analysis of the mesh through the Laplace-Beltrami operator, allow us to obtain a point-to-point match between the mesh nodes of the current shape and those of the target shape. We denote the current shape mesh node positions by \(\mathbf{x}_m\in \mathbb{R}^3, m=1,...,M\). These vectors are stacked column-wise in matrix \(\mathbf{X}\in\mathbb{R}^{3\times M}\). Each current shape point \(\mathbf{x}_m\) has an associated (matched) target point \(\mathbf{y}_m\in \mathbb{R}^3\). These target points are stacked column-wise in \(\mathbf{Y}\in \mathbb{R}^{3\times M}\).

For simplicity of notation, we define the Procrustes operator \((\mathbf{T},d_{\mathcal{P}})=\mathcal{P}(\mathbf{X},\mathbf{Y})\). This operator encapsulates the orthogonal Procrustes problem: it takes the two column-to-column matched sets of point coordinates \(\mathbf{X}, \mathbf{Y}\) and returns their Procrustes distance \(d_{\mathcal{P}}\) together with the rigid transform \(\mathbf{T}(\mathbf{X},\mathbf{Y})\in SE(3)\) that attains it: \[ d_{\mathcal{P}}(\mathbf{X},\mathbf{Y})= \min_{\mathbf{R}\in \text{SO}(3)} \left \| \mathbf{R}(\mathbf{X}-\bar{\mathbf{X}})-(\mathbf{Y}-\bar{\mathbf{Y}}) \right \|_F. \tag{1} \] Matrices \(\bar{\mathbf{X}}\in\mathbb{R}^{3\times M}\) and \(\bar{\mathbf{Y}}\in\mathbb{R}^{3\times M}\) repeat in each of their \(M\) columns the mean \(\bar{\mathbf{x}},\bar{\mathbf{y}}\) of the columns (i.e. the centroid) of matrices \({\mathbf{X}},{\mathbf{Y}}\), respectively. The minimising matrix \(\mathbf{R}\in \text{SO}(3)\) is the rotation component of \(\mathbf{T}\) and \(\mathbf{t}=\bar{\mathbf{y}}-\mathbf{R}\bar{\mathbf{x}}\) is its translation component.
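As an illustration of the operator \(\mathcal{P}\), the following minimal NumPy sketch solves (1) with the standard SVD-based (Kabsch) solution of the orthogonal Procrustes problem. The function name `procrustes` and the choice of returning \(\mathbf{T}\) as a \(4\times 4\) homogeneous matrix are assumptions made for this example and are not prescribed by the text; degenerate (e.g. collinear) point sets are not handled.

```python
import numpy as np


def procrustes(X, Y):
    """Sketch of the operator P(X, Y) for matched 3xM point sets.

    Returns (T, d): a 4x4 homogeneous rigid transform and the
    Procrustes distance of Eq. (1). Names are illustrative only.
    """
    x_bar = X.mean(axis=1, keepdims=True)  # centroid of X, shape (3, 1)
    y_bar = Y.mean(axis=1, keepdims=True)  # centroid of Y, shape (3, 1)
    Xc, Yc = X - x_bar, Y - y_bar          # centred point sets

    # Kabsch solution: SVD of the 3x3 cross-covariance matrix.
    U, _, Vt = np.linalg.svd(Yc @ Xc.T)
    # Flip the last singular direction if needed so that det(R) = +1.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    R = U @ D @ Vt

    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = (y_bar - R @ x_bar).ravel()  # t = y_bar - R x_bar
    d = np.linalg.norm(R @ Xc - Yc)         # Frobenius norm of the residual
    return T, d
```

When \(\mathbf{Y}\) is an exact rigid copy of \(\mathbf{X}\), the returned distance is zero up to floating-point error and \(\mathbf{T}\) recovers the applied motion.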

We can apply the Procrustes operator \(\mathcal{P}(\mathbf{X}(t),\mathbf{Y})\) to obtain the shape error \[ \tag{2} e(t)=d_{\mathcal{P}}(\mathbf{X}(t),\mathbf{Y}), \] which measures how dissimilar the two shapes are (\(e(t)=0\) when the two shapes are identical). The goal of our control strategy is to reduce the error metric \(e(t)\). Before initiating our control strategy, we apply \(\mathbf{T}^{-1}(t_0)\), obtained from \(\mathcal{P}(\mathbf{X}(t_0),\mathbf{Y})\), to our target shape so as to bring it closer to our current shape in the 3D embedding.

2.2 Local-rigidity behaviour (LRB) hypothesis

Suppose we focus only on the surface points that lie within a topological distance \(r\) from a gripper \(g\), i.e. a set of \(M_g\) object points \(\mathbf{X}_g(t,r)\in\mathbb{R}^{3\times M_g}\) defined by points \(\mathbf{x}_m(t,r)\in \Omega_g(t,r)\), where \(\Omega_g(t,r)\) is the open domain of the object's surface defined by a geodesic ball of radius \(r\) centred at gripper \(g\). Suppose also that the object's rigidity (unknown to us) allows points \(\mathbf{X}_g(t,r)\) to move in a rigid manner under small gripper transforms \(\mathbf{H}_g\), i.e. \(\mathbf{H}_g\approx \mathbf{I}_{4}\) (with \( \mathbf{I}_{4}\) being the \(4\times 4\) identity matrix), and that the rest of the points in \(\mathbf{X}(t)\), i.e. \(\mathbf{x}_m(t) \notin \Omega_g(t,r)\), remain unaffected by \(\mathbf{H}_g\). We refer to this rigid behaviour as local-rigidity behaviour (LRB from now on). In this scenario, one could benefit from a Procrustes analysis \(\mathcal{P}(\mathbf{X}_g(t,r),\mathbf{Y}_g(r))\), where \(\mathbf{Y}_g(r)\subset \mathbf{Y}\) are the points matched to those of \(\mathbf{X}_g(t,r)\). The transform \(\mathbf{T}_g(t,r)\) from the Procrustes analysis can be used to define an incremental transform \[ \mathbf{H}_g(t,r)=\exp{(\Delta t \log{(\mathbf{T}_g(t,r))})}, \tag{3} \] where \(\mathbf{H}_g(t,r)\) belongs to the geodesic path (in \(\text{SE}(3)\)) that goes from \(\mathbf{I}_{4}\) towards \(\mathbf{T}_g(t,r)\). This path is parameterised by the time step \(\Delta t \in[0,1]\), which, for low values, generates transforms \(\mathbf{H}_g(t,r)\) that meet the small-action requirement for LRB. Therefore, the rigid error of the subset \(\mathbf{X}_g(t,r)\) can be reduced with actions \(\mathbf{U}_g(t)=\mathbf{H}_g(t,r)\) and thus the global Procrustes residual of the whole set \(\mathbf{X}(t)\) with respect to \(\mathbf{Y}\), i.e. \(e(t)\), can be reduced too.
Note that, as \(\Delta t \rightarrow 0\), (3) defines the state equation of \(\mathbf{X}_g(t,r)\): \[ \tag{4} \mathbf{X}^h_g(t,r)=\mathcal{T}\{ \prod_{t_0}^{t}e^{\log(\mathbf{T}_g(t,r))\mathrm{d}t}\}\mathbf{X}^h_g(t_0,r), \] where \(h\) denotes homogeneous coordinates and \(\mathcal{T}\) operates on the product integral generating a time-ordered product \(\mathcal{T}\{ \prod_{t_0}^{t}f(t)\}=f(t)f(t-\mathrm{d}t)\cdots f(t_0+\mathrm{d}t)f(t_0)\). Equation (4) constitutes a solution for: \[ \tag{5} \frac{\mathrm{d}\mathbf{X}^h_g(t,r)}{\mathrm{d}t}=\log(\mathbf{T}_g(t,r))\mathbf{X}^h_g(t,r), \] which defines the time derivative of points when they are only affected by gripper \(g\).
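The incremental transform in (3) can be sketched numerically as follows. This is a minimal NumPy example that uses the closed-form logarithm and exponential maps of \(\text{SE}(3)\); the helper names (`se3_log`, `se3_exp`, `incremental_transform`) are assumptions introduced for illustration, and rotation angles close to \(\pi\) are not handled.

```python
import numpy as np


def _left_jacobian(W, theta):
    """Matrix V linking the translation of exp(xi) to the twist (W, v)."""
    if theta < 1e-8:
        return np.eye(3) + 0.5 * W
    a = (1.0 - np.cos(theta)) / theta**2
    b = (theta - np.sin(theta)) / theta**3
    return np.eye(3) + a * W + b * (W @ W)


def se3_log(T):
    """Matrix logarithm of a 4x4 rigid transform (angle assumed < pi)."""
    R, t = T[:3, :3], T[:3, 3]
    theta = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    if theta < 1e-8:
        W = 0.5 * (R - R.T)
    else:
        W = theta / (2.0 * np.sin(theta)) * (R - R.T)
    xi = np.zeros((4, 4))
    xi[:3, :3] = W                                             # skew part
    xi[:3, 3] = np.linalg.solve(_left_jacobian(W, theta), t)   # v = V^-1 t
    return xi


def se3_exp(xi):
    """Matrix exponential of a 4x4 twist (Rodrigues' formula)."""
    W, v = xi[:3, :3], xi[:3, 3]
    theta = np.sqrt(W[2, 1] ** 2 + W[0, 2] ** 2 + W[1, 0] ** 2)
    if theta < 1e-8:
        R = np.eye(3) + W
    else:
        R = (np.eye(3) + np.sin(theta) / theta * W
             + (1.0 - np.cos(theta)) / theta**2 * (W @ W))
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = _left_jacobian(W, theta) @ v
    return T


def incremental_transform(T, dt):
    """Eq. (3): H = exp(dt * log(T)), a point on the geodesic from I to T."""
    return se3_exp(dt * se3_log(T))
```

With `dt = 0` the result is the identity and with `dt = 1` it is `T` itself. Composing the transform obtained for `dt = 1/n` with itself `n` times also recovers `T`, which is the discrete counterpart of the time-ordered product in (4) for a constant \(\mathbf{T}_g\).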

2.3 Relaxed local-rigidity behaviour (LRB) analysis

In the diminishing rigidity concept introduced in (Berenson, 2013), an exponential decay of the material's rigidity with respect to gripper positions is assumed. However, an object may present diverse and time-varying behaviours depending on its shape and/or deformation state (e.g., a discontinuous rigidity function as in a mechanism). Our method does not assume any particular rigidity decay function on the object. Rather, our proposed relaxed LRB assumption allows us to evaluate and quantify on which scale (or topological distance) gripper actions are more effective in reducing the shape error.

Local-rigidity behaviour (LRB) is certainly met by the points grabbed by the grippers (assuming grasping stability). However, the rest of the object points will most likely undergo deformations and thus not present LRB. For this reason, we base our control strategy on a relaxed assumption of LRB, i.e., we make use of a multi-scale analysis that quantifies how close the object behaviour is to LRB at each analysed scale. In order to perform the multi-scale analysis, we establish the scale \(r \in [ r_0 ,R(t)]\), where \(r_0\) is the gripper's size and \(R(t)\) is the largest topological distance that can be found in the object. Our analysis quantifies the extent to which sets \(\mathbf{X}_g(t,r)\) behave rigidly under any action \(\mathbf{U}_g(t)\). If actions \(\mathbf{U}_g(t)\) affected \(\mathbf{X}(t)\) at scale \(r\) with ideal LRB, and assuming linear superposition of actions (given the object's isotropy and homogeneity), we could estimate the resulting shape points \(\hat{\mathbf{X}}(t,r)\) as: \[ \hat{\mathbf{x}}^h_m(t,r)=\frac{1}{G}\sum_{g=1}^{G}(\mathbf{U}_g(t))^{\delta_g(m,r)}\mathbf{x}^h_m(t,r) \tag{6} \] where function \(\delta_g(m,r)\) allows us to disregard actions for points \(\mathbf{x}_m(t,r) \notin \Omega_g(t,r)\): \[ \delta_g(m,r):=\begin{cases} 1 \, & \text{if } {\mathbf{x}}_m(t,r) \in \Omega_g(t,r) \\ 0\, & \text{otherwise.} \end{cases} \] Using the Procrustes analysis \(\mathcal{P}(\mathbf{X}(t),\hat{\mathbf{X}}(t,r))\), we can obtain a measure \(w(t,r)\) of how much the object presents LRB at each scale \(r\) (i.e. at each topological distance \(r\) from the grippers) when undergoing gripper actions \(\mathbf{U}_g(t)\): \[ w(t,r)=1/\exp(\beta \, d_{\mathcal{P}}(\mathbf{X}(t),\hat{\mathbf{X}}(t,r))). \tag{7} \] Measure \(w(t,r)\in (0, 1]\), with \(w(t,r)=1\) when the LRB is fully met (i.e. \(d_{\mathcal{P}}(\mathbf{X}(t),\hat{\mathbf{X}}(t,r))=0\)). Parameter \(\beta>0\) allows us to tune the relaxation of the LRB assumption. 
Lower values of \(\beta\) imply a more relaxed rigidity assumption (e.g., if \(\beta=0.1\), almost every \(w(t,r)\approx 1\) and thus we consider every set \(\mathbf{X}_g(t,r), \forall r\in [ r_0 ,R(t)]\) to move rigidly, even if they do not). We propose using \(\beta \approx 1\times10^4\), which implies a conservative assumption of LRB.
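To make (6) and (7) concrete, the sketch below predicts the shape under ideal LRB and turns the resulting Procrustes distance into the weight \(w\). It is a minimal NumPy example under stated assumptions: the boolean array `masks` is assumed to already encode \(\delta_g(m,r)\) (how geodesic balls are computed on the mesh is outside the scope of the example), and all function names are hypothetical.

```python
import numpy as np


def procrustes_distance(X, Y):
    """Procrustes distance of Eq. (1) between matched 3xM point sets."""
    Xc = X - X.mean(axis=1, keepdims=True)
    Yc = Y - Y.mean(axis=1, keepdims=True)
    U, _, Vt = np.linalg.svd(Yc @ Xc.T)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    return np.linalg.norm(U @ D @ Vt @ Xc - Yc)


def predict_ideal_lrb(X, actions, masks):
    """Eq. (6): average over grippers of U_g^{delta_g(m, r)} x_m.

    X: (3, M) points; actions: list of G 4x4 transforms U_g;
    masks: (G, M) booleans with masks[g, m] = delta_g(m, r) (assumed given).
    """
    M = X.shape[1]
    Xh = np.vstack([X, np.ones((1, M))])  # homogeneous coordinates
    acc = np.zeros_like(Xh)
    for U_g, mask in zip(actions, masks):
        # U_g^1 = U_g inside the geodesic ball, U_g^0 = identity outside it.
        acc += np.where(mask, U_g @ Xh, Xh)
    return (acc / len(actions))[:3]


def lrb_weight(X, X_hat, beta):
    """Eq. (7): w = 1 / exp(beta * d_P(X, X_hat)), which lies in (0, 1]."""
    return np.exp(-beta * procrustes_distance(X, X_hat))
```

For fixed point sets with a nonzero distance, increasing `beta` lowers the weight, which matches the interpretation of larger \(\beta\) as a more conservative LRB assumption.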

2.4 Procrustes-based locally optimal scale estimation

Note that \(w(t,r)\) also quantifies the effectiveness with which the rigid error of \(\mathbf{X}_g(t,r)\) can be reduced by means of incremental transforms \(\mathbf{H}_g(t,r)\) as defined in (3) (larger \(w(t,r)\) implies more effectiveness). With this information, we seek to define gripper actions \(\mathbf{U}_g(t)\) that best contribute to reducing the global error \(e(t)\). In order to define \(\mathbf{U}_g(t)\), we propose analysing scenarios \((r, r') \in \mathbb{R}\times\mathbb{R} : r,r' \in [ r_0 ,R(t)]\). Each scenario constitutes an estimation of the object evolution if it were affected by actions \(\mathbf{U}_g(t)=\mathbf{H}_g(t,r)\) (defined at scale \(r\)) but presented ideal LRB at scale \(r'\). Each estimation \(\hat{\mathbf{X}}(t,r,r')\) is defined as: \[ \hat{\mathbf{x}}^h_m(t,r,r')=\frac{1}{G}\sum_{g=1}^{G}\mathcal{T}\{\prod_{t}^{t+\mathrm{d}t}(\mathbf{H}_g(t,r))^{\delta_g(m,r')}\}\mathbf{x}^h_m(t,r'). \tag{8} \] In order to perform our analysis, we define an error increment estimation surface \[ \hat{e}(t,r,r')= d_{\mathcal{P}}(\hat{\mathbf{X}}(t,r,r'),\mathbf{Y})-e(t). \tag{9} \]

Surface \(\hat{e}(t,r,r')\) is a continuous surface that provides insight into the effectiveness and risks of reducing error \(e(t)\) by means of actions \(\mathbf{U}_g(t)=\mathbf{H}_g(t,r)\) (generated under the ideal LRB assumption). We can incorporate our knowledge of how much the LRB is met at each scale \(r'\) (i.e. \(w(t,r')\)) and define an error-reduction effectiveness \(q(t,r)\in \mathbb{R}\) which, for each estimated transform \(\mathbf{H}_g(t,r)\), yields: \[ q(t,r)=\int_{r_0}^{R(t)}\hat{e}(t,r,r') w(t,r') \mathrm{d}r'. \tag{10} \] In (10), the ideal error increment estimations \(\hat{e}(t,r,r')\) are weighted by the error reduction effectiveness \(w(t,r')\). In particular, \(q(t,r)\) estimates the global error increment that each \(\mathbf{H}_g(t,r)\) in (3) can generate. The logical choice of \(r\) when defining gripper actions \(\mathbf{U}_g(t)=\mathbf{H}_g(t,r)\) would be the one that ensures the largest error reduction, i.e. \(\mathbf{U}_g(t)=\mathbf{H}_g(t,r^*(t)),\, r^*(t)={\arg \min}_r(q(t,r)) \). However, \(w(t,r)\) (present in the definition of \(q(t,r)\)) is computed based on the effects of previous actions defined according to \(\mathbf{U}_g(t)=\mathbf{H}_g(t,r^*(t))\). For \(q(t,r)\) to be reliable, updates of \(r^*(t)\) should remain within the neighbourhood of its current value. For this reason we define \(r^*(t)\) as: \[ r^*(t)=-\int_{t_0}^{t}\frac{\partial q(t,r^*(t))}{\partial r}\mathrm{d}t+r^*(t_0), \tag{11} \] which updates \(r^*(t)\) in the direction of the negative \(\partial/\partial r\) component of the gradient at each time instant \(t\) and thus generates a locally optimal \(r^*(t)\).
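A discrete counterpart of (10) and (11) can be sketched as follows. The example assumes that \(\hat{e}(t,r,r')\) and \(w(t,r')\) have already been sampled on a common grid of scales, approximates the integral in (10) with the trapezoidal rule, and replaces the time integral in (11) by a single explicit Euler step of size `dt`. The function names, the grid and the clipping of \(r^*\) to \([r_0, R(t)]\) are assumptions made for illustration.

```python
import numpy as np


def effectiveness(E_hat, w, r_grid):
    """Eq. (10) via the trapezoidal rule over r'.

    E_hat[i, j] = e_hat(t, r_i, r'_j) and w[j] = w(t, r'_j), both sampled
    on the same (assumed) grid r_grid of scales in [r_0, R(t)].
    """
    f = E_hat * w                     # weight each column r'_j by w(t, r'_j)
    dr = np.diff(r_grid)
    return np.sum(0.5 * (f[:, 1:] + f[:, :-1]) * dr, axis=1)


def update_scale(r_star, q, r_grid, dt):
    """One Euler step of Eq. (11): r* <- r* - dt * dq/dr evaluated at r*."""
    dq_dr = np.gradient(q, r_grid)               # finite-difference slope
    slope = np.interp(r_star, r_grid, dq_dr)     # slope at the current r*
    # Keep the scale inside the admissible interval [r_0, R(t)].
    return float(np.clip(r_star - dt * slope, r_grid[0], r_grid[-1]))
```

In this sketch the whole curve \(q(t,r)\) is computed for clarity, although only the slope at the current \(r^*(t)\) enters the update.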

Note that (11) requires \(q(t,r^*(t))\) to be continuously differentiable with respect to \(r\). This implies that \(e(t), w(t,r')\) and \(\hat{e}(t,r,r')\), and thus the Procrustes transforms \(\mathbf{T}_g\) and distances \(d_{\mathcal{P}}\), should be continuously differentiable with respect to \(r\). The residual \(d_{\mathcal{P}}\) is continuous and differentiable as it constitutes a metric in shape space (Al-Aifari et al., 2013). On the other hand, showing that the Procrustes optimisation result \(\mathbf{T}_g\) is continuous and differentiable with respect to \(r\) requires further development, which can be found in Appendix A.

We now provide some intuition on the impact of \(\beta\) in (7) on (11). When setting a low value for \(\beta\) (e.g., \(\beta=0.1\)), we assume that our LRB hypothesis holds even for actions and scales that lie far from our current actions \(\mathbf{U}_g(t)\) and scale \(r^*(t)\). Consequently, surface \(q(t,r)\) allows for a more unrestricted evolution of \(r^*(t)\), which could potentially lead to undesired system behaviours such as slower performance or, in the case of extremely low values of \(\beta\), to a larger final error. On the other hand, a high \(\beta\) (e.g., \(\beta=1\times10^4\)) leads to a conservative \(q(t,r)\) surface that disregards estimates that lie far from the current actions and locally optimal scale. This conservative approach generates resistance to large changes in \(r^*(t)\), confining its movement to regions where the LRB hypothesis has been validated through measurements across iterations. Very large \(\beta\) values will not compromise the system's effectiveness, but they will lead to slower convergence as \(r^*(t)\) evolves more conservatively.

2.5 Control strategy

Our control strategy makes use of the Procrustes action defined at scale \(r^*(t)\), i.e. \(\mathbf{T}_g(t,r^*(t))\). We define our control law as the new term in the time-ordered product of (4): \[ \mathbf{U}_g(t)=\exp{(\log{(\mathbf{T}_g(t,r^*(t))})\mathrm{d}t)}. \tag{12} \] This results in the state equation of \(\mathbf{X}(t)\) (for each individual \(\mathbf{x}_m(t) \in \mathbf{X}(t)\)) \[ {\mathbf{x}}^h_m(t)=\frac{1}{G}\sum_{g=1}^{G}\mathcal{T}\{\prod_{t_0}^{t}(\mathbf{U}_g(t))^{\delta_g(m,r^*(t))}\}\mathbf{x}^h_m(t_0,r^*(t_0)), \tag{13} \] with \(r^*(t)\) updated as in (11). Note that the update rule in (11) needs initial values of \(r^*(t_0)\) and \(w(t_0,r')\). We propose \(w(t_0,r')=1\,\forall r'\), which is equivalent to assuming equal LRB at all scales \(r'\). Our initial estimation of \(r^*\) takes the minimiser of \(q\) at the initial time instant, that is \(r^*(t_0)={\arg \min}_r(q(t_0,r))\). Note that, for the update of \(r^*(t)\), the partial derivative of \(q(t,r^*(t))\) in (11) needs to be evaluated only at \(r^*(t)\). Consequently, \(q(t,r)\) in (10) only needs to be computed in the neighbourhood of \(r^*(t)\) rather than for all \(r\), which contributes to the cost-effectiveness of our method. Furthermore, the rest of the method relies on matrix operations and the SVD, which further keeps its computational cost low, as will be illustrated in the experiments section.