2 Multi-scale Procrustes shape control
2.1 Procrustes operator and shape error metric
Functional maps (Melzi et al., 2019), based on the spectral analysis of the mesh through the Laplace-Beltrami operator, allow us to obtain a point-to-point match between the mesh nodes of the current and target shapes. We denote the positions of the current shape's mesh nodes by \(\mathbf{x}_m\in \mathbb{R}^3, m=1,...,M\), and stack these vectors column-wise in the matrix \(\mathbf{X}\in\mathbb{R}^{3\times M}\). Each current shape point \(\mathbf{x}_m\) has an associated (matched) target point \(\mathbf{y}_m\in \mathbb{R}^3\); these target points are likewise stacked column-wise in \(\mathbf{Y}\in \mathbb{R}^{3\times M}\).
For simplicity of notation, we define the Procrustes operator \((\mathbf{T},d_{\mathcal{P}})=\mathcal{P}(\mathbf{X},\mathbf{Y})\). This operator encapsulates the orthogonal Procrustes problem: it takes the two column-to-column matched sets of point coordinates \(\mathbf{X}, \mathbf{Y}\) and returns their Procrustes distance \(d_{\mathcal{P}}\) and the rigid transform \(\mathbf{T}(\mathbf{X},\mathbf{Y})\in SE(3)\) that minimises this distance: \[ d_{\mathcal{P}}(\mathbf{X},\mathbf{Y})= \min_{\mathbf{R}} \left \| \mathbf{R}(\mathbf{X}-\bar{\mathbf{X}})-(\mathbf{Y}-\bar{\mathbf{Y}}) \right \|_F \quad \text{s.t.} \; \mathbf{R}\in \text{SO}(3). \tag{1} \] Matrices \(\bar{\mathbf{X}}\in\mathbb{R}^{3\times M}\) and \(\bar{\mathbf{Y}}\in\mathbb{R}^{3\times M}\) stack \(M\) copies of the column-wise means \(\bar{\mathbf{x}},\bar{\mathbf{y}}\) (i.e., the centroids) of matrices \({\mathbf{X}},{\mathbf{Y}}\). Matrix \(\mathbf{R}\in \text{SO}(3)\) is the rotation component of \(\mathbf{T}\), and \(\mathbf{t}=\bar{\mathbf{y}}-\mathbf{R}\bar{\mathbf{x}}\) is the translation component.
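As a concrete reference, the operator \(\mathcal{P}\) of (1) can be sketched with the SVD-based (Kabsch) solution of the orthogonal Procrustes problem. This is a minimal NumPy sketch; the function name `procrustes` and the return convention \((\mathbf{R},\mathbf{t},d_{\mathcal{P}})\) are illustrative choices, not part of the formal definition:

```python
import numpy as np

def procrustes(X, Y):
    """Procrustes operator P(X, Y) of Eq. (1): rigid alignment of two
    column-matched 3xM point sets. Returns (R, t, d) with R in SO(3),
    translation t, and Procrustes distance d (Frobenius residual)."""
    xc = X.mean(axis=1, keepdims=True)   # centroid of current shape
    yc = Y.mean(axis=1, keepdims=True)   # centroid of target shape
    Xc, Yc = X - xc, Y - yc
    # Kabsch/SVD solution of the orthogonal Procrustes problem
    U, _, Vt = np.linalg.svd(Yc @ Xc.T)
    # enforce det(R) = +1 so that R is a rotation, not a reflection
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    R = U @ D @ Vt
    t = yc - R @ xc                      # translation component of T
    d = np.linalg.norm(R @ Xc - Yc)      # Procrustes distance d_P
    return R, t.ravel(), d
```

The sign correction on the singular-vector product guarantees \(\det(\mathbf{R})=+1\), i.e., a proper rotation rather than a reflection.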
We can apply the Procrustes operator \(\mathcal{P}(\mathbf{X}(t),\mathbf{Y})\) to obtain the shape error \[ \tag{2} e(t)=d_{\mathcal{P}}(\mathbf{X}(t),\mathbf{Y}), \] which measures how similar the two shapes are (\(e(t)=0\) when they coincide up to a rigid transform). The goal of our control strategy is to reduce this error metric \(e(t)\). Before initiating the control strategy, we apply \(\mathbf{T}^{-1}(t_0)\), obtained from \(\mathcal{P}(\mathbf{X}(t_0),\mathbf{Y})\), to the target shape so as to bring it closer to the current shape in the 3D embedding.
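The pre-alignment step can be sketched as follows, assuming a minimal Kabsch-style helper (`rigid_fit` and `prealign_target` are hypothetical names). A useful property is that, after applying \(\mathbf{T}^{-1}(t_0)\) to the target, the plain pointwise residual between the two point sets equals the Procrustes distance \(d_{\mathcal{P}}\):

```python
import numpy as np

def rigid_fit(X, Y):
    """Kabsch solution of Eq. (1): rotation R and translation t
    minimising ||R(X - Xbar) - (Y - Ybar)||_F with R in SO(3)."""
    xc, yc = X.mean(1, keepdims=True), Y.mean(1, keepdims=True)
    U, _, Vt = np.linalg.svd((Y - yc) @ (X - xc).T)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    R = U @ D @ Vt
    return R, yc - R @ xc        # t is returned as a 3x1 column

def prealign_target(X0, Y):
    """Apply T^{-1}(t0), obtained from P(X(t0), Y), to the target Y
    so it is rigidly aligned with the initial shape X(t0)."""
    R, t = rigid_fit(X0, Y)
    return R.T @ (Y - t)         # inverse rigid transform of the target
```

After this step, `np.linalg.norm(X0 - prealign_target(X0, Y))` equals \(d_{\mathcal{P}}(\mathbf{X}(t_0),\mathbf{Y})\): the remaining discrepancy is purely non-rigid.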
2.2 Local-rigidity behaviour (LRB) hypothesis
2.3 Relaxed local-rigidity behaviour (LRB) analysis
In the diminishing rigidity concept introduced in (Berenson, 2013), an exponential decay of the material's rigidity with respect to gripper positions is assumed. However, an object may present diverse and time-varying behaviours depending on its shape and/or deformation state (e.g., a discontinuous rigidity function, as in a mechanism). Our method does not assume any particular rigidity decay function for the object. Rather, our proposed relaxed LRB assumption allows us to evaluate and quantify at which scale (or topological distance) gripper actions are most effective in reducing the shape error.
Local-rigidity behaviour (LRB) is certainly met by the points grasped by the grippers (assuming grasping stability). However, the rest of the object points will most likely undergo deformations and thus not present LRB. For this reason, we base our control strategy on a relaxed LRB assumption, i.e., we use a multi-scale analysis that quantifies, for each analysed scale, how close the object's behaviour is to LRB. To perform this multi-scale analysis, we define the scale \(r \in [ r_0 ,R(t)]\), where \(r_0\) is the gripper's size and \(R(t)\) is the largest topological distance found in the object. Our analysis quantifies the extent to which the sets \(\mathbf{X}_g(t,r)\) behave rigidly under any action \(\mathbf{U}_g(t)\). If the actions \(\mathbf{U}_g(t)\) affected \(\mathbf{X}(t)\) at scale \(r\) with ideal LRB, and assuming linear action superposition (given the object's isotropy and homogeneity), we could estimate the resulting shape points \(\hat{\mathbf{X}}(t,r)\) as: \[ \hat{\mathbf{x}}^h_m(t,r)=\frac{1}{G}\sum_{g=1}^{G}(\mathbf{U}_g(t))^{\delta_g(m,r)}\mathbf{x}^h_m(t,r) \tag{6} \] where the function \(\delta_g(m,r)\) disregards actions for points \(\mathbf{x}_m(t,r) \notin \Omega_g(t,r)\): \[ \delta_g(m,r):=\begin{cases} 1 \, & \text{if } {\mathbf{x}}_m(t,r) \in \Omega_g(t,r) \\ 0\, & \text{otherwise.} \end{cases} \] Using the Procrustes analysis \(\mathcal{P}(\mathbf{X}(t),\hat{\mathbf{X}}(t,r))\), we obtain a measure \(w(t,r)\) of how closely the object presents LRB at each scale \(r\) (i.e., at each topological distance \(r\) from the grippers) when undergoing the gripper actions \(\mathbf{U}_g(t)\): \[ w(t,r)=\exp(-\beta \, d_{\mathcal{P}}(\mathbf{X}(t),\hat{\mathbf{X}}(t,r))). \tag{7} \] The measure satisfies \(w(t,r)\in (0, 1]\), with \(w(t,r)=1\) when LRB is fully met (i.e., \(d_{\mathcal{P}}(\mathbf{X}(t),\hat{\mathbf{X}}(t,r))=0\)). The parameter \(\beta>0\) adjusts the relaxation of the LRB assumption.
Lower values of \(\beta\) imply a more relaxed rigidity assumption (e.g., for \(\beta=0.1\) we obtain \(w(t,r)\approx 1\) for almost every \(r\), and thus consider every set \(\mathbf{X}_g(t,r), \forall r\in [ r_0 ,R(t)]\), to move rigidly, even when it does not). We propose using \(\beta \approx 1\times10^4\), which implies a conservative LRB assumption.
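Once the per-scale Procrustes distances \(d_{\mathcal{P}}(\mathbf{X}(t),\hat{\mathbf{X}}(t,r))\) and the topological (geodesic) distances from each gripper are available, (7) and the indicator \(\delta_g\) reduce to a few lines. The sketch below assumes both are precomputed arrays; `d_proc` and `geodesic_dist` are illustrative names:

```python
import numpy as np

def lrb_measure(d_proc, beta=1e4):
    """Relaxed LRB measure of Eq. (7): w = exp(-beta * d_P), so that
    w = 1 when a scale behaves exactly rigidly (d_P = 0) and
    w -> 0 as the residual deformation grows."""
    return np.exp(-beta * np.asarray(d_proc))

def lrb_indicator(geodesic_dist, r):
    """delta_g(m, r) of Eq. (6): 1 for points within topological
    distance r of gripper g (i.e. x_m in Omega_g(t, r)), 0 otherwise."""
    return (np.asarray(geodesic_dist) <= r).astype(float)
```

With \(\beta=10^4\), a residual deformation of only \(10^{-4}\) already reduces the weight to \(e^{-1}\), which is what makes this choice a conservative LRB assumption.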
2.4 Procrustes-based locally optimal scale estimation
Note that \(w(t,r)\) also quantifies the effectiveness with which the rigid error of \(\mathbf{X}_g(t,r)\) can be reduced by means of the incremental transforms \(\mathbf{H}_g(t,r)\) defined in (3) (a larger \(w(t,r)\) implies greater effectiveness). With this information, we seek to define gripper actions \(\mathbf{U}_g(t)\) that best contribute to reducing the global error \(e(t)\). To define \(\mathbf{U}_g(t)\), we propose analysing scenarios over the scale pairs \((r,r') \in [ r_0 ,R(t)]^2\). These scenarios estimate the object's evolution if it were affected by actions \(\mathbf{U}_g(t)=\mathbf{H}_g(t,r)\) (defined at scale \(r\)) but presented ideal LRB at scale \(r'\). Each estimation \(\hat{\mathbf{X}}(t,r,r')\) is defined as: \[ \hat{\mathbf{x}}^h_m(t,r,r')=\frac{1}{G}\sum_{g=1}^{G}\mathcal{T}\{\prod_{t}^{t+\mathrm{d}t}(\mathbf{H}_g(t,r))^{\delta_g(m,r')}\}\mathbf{x}^h_m(t,r') \tag{8} \] To perform our analysis, we define an error-increment estimation surface \[ \hat{e}(t,r,r')= d_{\mathcal{P}}(\hat{\mathbf{X}}(t,r,r'),\mathbf{Y})-e(t). \tag{9} \]
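Assuming the estimated shapes \(\hat{\mathbf{X}}(t,r,r')\) have already been evaluated on a discrete grid of scale pairs, the surface (9) can be sketched as below; the nested-list grid layout and the helper names are implementation choices, not part of the method:

```python
import numpy as np

def procrustes_distance(X, Y):
    """d_P of Eq. (1): residual of the optimal rigid alignment
    (Kabsch/SVD solution with reflection correction)."""
    Xc = X - X.mean(1, keepdims=True)
    Yc = Y - Y.mean(1, keepdims=True)
    U, _, Vt = np.linalg.svd(Yc @ Xc.T)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    return np.linalg.norm((U @ D @ Vt) @ Xc - Yc)

def error_increment_surface(X_hat, Y, e_t):
    """Eq. (9) on a grid: e_hat[i, j] = d_P(X_hat[i][j], Y) - e(t),
    with rows indexing the action scale r and columns the LRB
    scale r'."""
    n_r, n_rp = len(X_hat), len(X_hat[0])
    e_hat = np.empty((n_r, n_rp))
    for i in range(n_r):
        for j in range(n_rp):
            e_hat[i, j] = procrustes_distance(X_hat[i][j], Y) - e_t
    return e_hat
```

Negative entries of the surface mark scale pairs for which the estimated action reduces the shape error below its current value \(e(t)\).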
The surface \(\hat{e}(t,r,r')\) is continuous and provides insight into the effectiveness of, and the risks involved in, reducing the error \(e(t)\) by means of actions \(\mathbf{U}_g(t)=\mathbf{H}_g(t,r)\) (generated under the ideal LRB assumption). We can incorporate our knowledge of how well LRB is met at each scale \(r'\) (i.e., \(w(t,r')\)) and define an error-reduction effectiveness \(q(t,r)\in \mathbb{R}\) which, for each estimated transform \(\mathbf{H}_g(t,r)\), yields: \[ q(t,r)=\int_{r_0}^{R}\hat{e}(t,r,r') w(t,r') \mathrm{d}r'. \tag{10} \] In (10), the ideal error-increment estimations \(\hat{e}(t,r,r')\) are weighted by the LRB measure \(w(t,r')\). In particular, \(q(t,r)\) estimates the global error increment that each \(\mathbf{H}_g(t,r)\) in (3) can generate. The logical choice of \(r\) when defining gripper actions \(\mathbf{U}_g(t)=\mathbf{H}_g(t,r)\) would be the one that ensures the largest error reduction, i.e., \(\mathbf{U}_g(t)=\mathbf{H}_g(t,r^*(t))\) with \(r^*(t)={\arg \min}_r\, q(t,r)\). However, \(w(t,r)\) (present in the definition of \(q(t,r)\)) is computed from the effects of previous actions, themselves defined according to \(\mathbf{U}_g(t)=\mathbf{H}_g(t,r^*(t))\). For \(q(t,r)\) to be reliable, updates of \(r^*(t)\) should therefore take place in the locality of the current \(r^*(t)\). For this reason, we define \(r^*(t)\) as: \[ r^*(t)=r^*(t_0)-\int_{t_0}^{t}\frac{\partial q(\tau,r^*(\tau))}{\partial r}\,\mathrm{d}\tau, \tag{11} \] which updates \(r^*(t)\) in the direction of the negative \(\partial q/\partial r\) gradient component at each time instant and thus yields a locally optimal \(r^*(t)\).
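On a discrete grid of scales, (10) and (11) can be approximated with a trapezoidal quadrature and a finite-difference gradient step. This is a sketch under those discretisation assumptions; the step size `dt` and the nearest-node gradient evaluation are implementation choices, not part of the formal definitions:

```python
import numpy as np

def effectiveness(e_hat, w, scales):
    """Eq. (10): q(t, r_i) = int e_hat(t, r_i, r') w(t, r') dr',
    approximated by the trapezoidal rule over the scale grid.
    e_hat is (n_r x n_r') with rows indexing r, columns r'."""
    integrand = e_hat * w[None, :]
    dx = np.diff(scales)
    return np.sum(0.5 * (integrand[:, 1:] + integrand[:, :-1]) * dx, axis=1)

def update_scale(r_star, q, scales, dt=1.0):
    """One step of Eq. (11): move r* against the dq/dr gradient,
    evaluated by finite differences at the grid node nearest to the
    current r*, keeping r* inside [r_0, R(t)]."""
    dq_dr = np.gradient(q, scales)          # dq/dr on the scale grid
    i = np.argmin(np.abs(scales - r_star))  # nearest grid node to r*
    r_new = r_star - dt * dq_dr[i]          # descend along dq/dr
    return float(np.clip(r_new, scales[0], scales[-1]))
```

The clipping step keeps the locally optimal scale inside the admissible interval \([r_0, R(t)]\), mirroring the bounds of the multi-scale analysis.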
Note that (11) requires \(q(t,r^*(t))\) to be continuously differentiable with respect to \(r\). This implies that \(e(t)\), \(w(t,r')\) and \(\hat{e}(t,r,r')\), and thus the Procrustes transforms \(\mathbf{T}_g\) and distances \(d_{\mathcal{P}}\), must be continuously differentiable with respect to \(r\). The residual \(d_{\mathcal{P}}\) is continuous and differentiable, as it constitutes a metric in shape space (Al-Aifari et al., 2013). Showing that the Procrustes optimisation result \(\mathbf{T}_g\) is continuous and differentiable with respect to \(r\) requires further development, which can be found in Appendix A.
We now provide some intuition on the impact of \(\beta\) in (7) on (11). When setting a low value of \(\beta\) (e.g., \(\beta=0.1\)), we assume that our LRB hypothesis holds even for actions and scales that lie far from the current actions \(\mathbf{U}_g(t)\) and scale \(r^*(t)\). Consequently, the surface \(q(t,r)\) allows a more unrestricted evolution of \(r^*(t)\), which could lead to undesired system behaviours such as slower performance or, for extremely low values of \(\beta\), a larger final error. On the other hand, a high \(\beta\) (e.g., \(\beta=1\times10^4\)) leads to a conservative \(q(t,r)\) surface that disregards estimates lying far from the current actions and locally optimal scale. This conservative approach generates resistance to large changes in \(r^*(t)\), confining its movement to regions where the LRB hypothesis has been validated through measurements across iterations. Very large values of \(\beta\) will not compromise the system's effectiveness, but they will lead to slower convergence, as \(r^*(t)\) evolves more conservatively.