2 Geometry-based deformation energy

2.1 Proposed multi-scale normal (MSN) energy: $E_{\mathrm{MSN}}$

In this section we present a geometry-based deformation energy for OCSC. In texture-less objects, the only visual indicators of objects undergoing deformations are the local changes in length and curvature of their contour. We therefore define our analysis using these two key elements.

Our proposed energy relies on a shape representation analogous to the discrete multi-scale Laplacian descriptors presented in (Cuiral-Zueco and López-Nicolás, 2021) for elastic contour mapping. We define a continuous multi-scale normal (MSN) shape representation that provides a notion of multi-scale mean curvature. We make use of geodesic balls \(B(s,r)\subseteq \gamma(s)\) centred at \(s\) with radius \(r\in(0,r_{\rm{max}}]\subset\mathbb{R}\) (Fig. 2), where \(r_{\rm{max}}=l(t)/2\) and \(l(t)\in\mathbb{R}\) is the total contour length. In this paper we refer to the radius \(r\) as the scale of analysis (or simply the scale). A scale defines the neighbouring region (i.e., the scope of analysis) at a given point \(s\). Our proposed shape representation \(\mathbf{l}(s,r)\in\mathbb{R}^2\) is obtained as \[ \mathbf{l}(s,r)=-\frac{1}{{r}}\int_{B(s,r)}\mathbf{x}^{\Re(s)}(s')\mathrm{d}l, \tag{1} \] where \(\mathbf{x}^{\Re(s)}(s')=(x^{\Re(s)}(s'),y^{\Re(s)}(s'))^\intercal\in\mathbb{R}^2\) are the extrinsic position vectors of the points corresponding to parameter values \(s'\in B(s,r)\), expressed in the reference frame \(\Re(s)\). Element \(\mathrm{d}l\) in (1) is the differential element of contour length \(l(t)\). As \(r\rightarrow 0\), (1) leads to the curvature-weighted normal vectors (referred to \(\Re(s)\)). The coordinates of points within the integration domain are still expressed in frame \(\Re(s)\not\equiv \Re(s')\) and, therefore, descriptors \(\mathbf{l}(s,r)\) are both intrinsic and extrinsic descriptors (Bronstein et al., 2007).
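As an illustration of how (1) might be discretised, the following minimal Python sketch evaluates the descriptor on a closed polyline. It is not the authors' implementation. It assumes uniform arc-length sampling, and it assumes that the local frame \(\Re(s)\) is the tangent/normal frame at \(s\), estimated by central differences. The function name `msn_descriptors` and its arguments are hypothetical.

```python
import numpy as np

def msn_descriptors(points, radii):
    """Discrete version of (1) for a closed contour (illustrative sketch).

    points: (N, 2) array, assumed uniformly sampled in arc length.
    radii:  iterable of scales r, each in (0, l/2].
    Returns an (N, len(radii), 2) array holding, for every sample and scale,
    the (tangential, normal) components of the descriptor in the local frame.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    nxt, prv = np.roll(pts, -1, axis=0), np.roll(pts, 1, axis=0)
    ds = np.linalg.norm(nxt - pts, axis=1).mean()  # arc-length step
    # Assumed local frame: unit tangent (central differences) and its normal.
    tangent = nxt - prv
    tangent /= np.linalg.norm(tangent, axis=1, keepdims=True)
    normal = np.stack([-tangent[:, 1], tangent[:, 0]], axis=1)

    out = np.zeros((n, len(radii), 2))
    for j, r in enumerate(radii):
        # Number of samples on each side of s that fall inside B(s, r).
        k = min(max(1, int(round(r / ds))), (n - 1) // 2)
        idx = (np.arange(n)[:, None] + np.arange(-k, k + 1)[None, :]) % n
        rel = pts[idx] - pts[:, None, :]   # positions relative to x(s)
        integral = rel.sum(axis=1) * ds    # approximates the integral over B(s, r)
        out[:, j, 0] = -(integral * tangent).sum(axis=1) / r
        out[:, j, 1] = -(integral * normal).sum(axis=1) / r
    return out
```

Because every position is taken relative to \(\mathbf{x}(s)\) and projected onto the local frame, rotating or translating `points` leaves the output of this sketch unchanged.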

Consider \(\bar{s}=\Pi(s,t)\) where \(\Pi(s,t):\gamma(t)\rightarrow \bar{\gamma}\) is an elastic map (diffeomorphism) that maximises multi-scale curvature similarity between contours. Using the shape representation (1), we define shape error \(\mathbf{e}_{\mathrm{MSN}}\) as \[ \mathbf{e}_{\mathrm{MSN}}(s,r,t)=\mathbf{l}(s,r,t)-\bar{\mathbf{l}}(\Pi(s,t),r), \tag{2} \] where \(\bar{\mathbf{l}}(\Pi(s,t),r)\) constitutes the multi-scale normal (MSN) descriptor of the target shape at point \(\Pi(s,t)=\bar{s}\). Using error (2) we introduce our proposed geometry-based deformation energy as: \[ E_{\mathrm{MSN}}(\gamma(t),\bar{\gamma})=\min_{\Pi} \int_{0}^{l(t)}\int_{0}^{r_{\rm{max}}}\mathbf{e}_{\mathrm{MSN}}^\intercal\mathbf{e}_{\mathrm{MSN}}\,\mathrm{d}r\,\mathrm{d}s, \tag{3} \] where \(l(t)\) is the length of curve \(\gamma(t)\). The optimisation process for obtaining \(\Pi\) in 1D domains can be performed by means of the Fast Marching Method (as proposed in (Cuiral-Zueco and López-Nicolás, 2021)).
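The double integral in (3) can be approximated by a sum once the descriptors are sampled in \(s\) and \(r\). The sketch below evaluates that sum for a fixed correspondence and, as a deliberately crude stand-in for the minimisation over \(\Pi\), searches only over cyclic shifts of the sample index. It does not implement the Fast Marching optimisation. It assumes both contours have the same number of samples, and all function and parameter names are hypothetical.

```python
import numpy as np

def msn_energy_fixed_map(desc, desc_target, pi_idx, ds, dr):
    """Riemann-sum approximation of (3) for a FIXED map (no minimisation).

    desc:        (N, R, 2) descriptors of the current contour.
    desc_target: (M, R, 2) descriptors of the target contour.
    pi_idx:      (N,) integer array; pi_idx[i] is the target sample matched to i.
    ds, dr:      sampling steps in arc length and in scale.
    """
    err = desc - desc_target[pi_idx]   # error (2) at every (s, r) sample
    return float(np.sum(err * err) * ds * dr)

def msn_energy_cyclic(desc, desc_target, ds, dr):
    """Stand-in for the minimisation over maps: best cyclic shift only.

    Assumes both contours have the same number of samples.
    """
    n = len(desc)
    base = np.arange(n)
    energies = [
        msn_energy_fixed_map(desc, desc_target, (base + k) % n, ds, dr)
        for k in range(n)
    ]
    best = int(np.argmin(energies))
    return energies[best], best
```

A cyclic shift only accounts for the choice of starting point on a closed contour. An elastic map additionally allows non-uniform stretching of the parametrisation, which this sketch does not model.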

2.2 Characteristics of $E_{\mathrm{MSN}}$ as a shape metric

Energy \(E_{\mathrm{MSN}}\) quantifies differences between shapes through their geometric features as it constitutes a metric in shape space as defined in (Al-Aifari et al., 2013). That is, considering three shapes \(\gamma_1, \gamma_2\) and \(\gamma_3\):

\[ \begin{split} E_{\mathrm{MSN}}(\gamma_1,\gamma_2)=E_{\mathrm{MSN}}(\gamma_2,\gamma_1), \\ E_{\mathrm{MSN}}(\gamma_1,\gamma_2)=0\,\Rightarrow \gamma_1\equiv \gamma_2, \\ E_{\mathrm{MSN}}(\gamma_1,\gamma_3)\leq E_{\mathrm{MSN}}(\gamma_1,\gamma_2)+E_{\mathrm{MSN}}(\gamma_2,\gamma_3). \end{split} \]

Some relevant features of metric \(E_{\mathrm{MSN}}\) for OCSC are:

2.2.1 Invariance to SE(2)

\(E_{\mathrm{MSN}}\) decouples rigid motions from shape-defining characteristics. Our analysis is performed from local frames \(\Re(s)\) (and \(\bar{\Re}(\bar{s})\)), which makes errors (2) locally invariant to SE(2) and energy (3) globally invariant to SE(2). In the literature, a common approach for ensuring SE(2) invariance involves using the Procrustes rigid transform (López-Nicolás et al., 2020; Al-Aifari et al., 2013; Jermyn et al., 2012). However, a global rigid adjustment of the target shape may increase local errors for which no corrective action can be taken (deformable objects can be highly under-actuated).

2.2.2 Multi-scale scope

Given an unfeasible target shape (conditioned by both the deformation properties of the object and the configuration of the grippers), a multi-scale analysis allows the error reduction of geometric features appearing at larger scales, while simultaneously preserving local geometric features.

2.2.3 Use of elastic mapping

Geometry-preserving elastic maps favour shape evolution paths of lower deformation. Conventional approaches such as (López-Nicolás et al., 2020) or (Navarro-Alarcon and Liu, 2017) use homogeneous mappings between contours and thus assume isometric deformations (i.e., stretching/compressing processes are assumed to be negligible). When dealing with non-isometric deformations, homogeneous mappings define errors that generate larger deformation paths (i.e., deviations that may even affect the feasibility of the task).

2.2.4 Joint intrinsic-extrinsic metric

\(E_{\mathrm{MSN}}\) is intrinsic, since it is defined by geodesic distances and local coordinates, and extrinsic, as it is based on the multi-scale mean normal curvature obtained from the extrinsic analysis of domains \(B(s,r)\). In shape control, a conventional approach is to minimise extrinsic energies that disregard the object's topology. The joint intrinsic-extrinsic nature of \(E_{\mathrm{MSN}}\) makes it aware of topology changes and non-isometric deformations (Bronstein et al., 2007) leading to gentler deformation paths.

Figure 3. Two experiments involving the manipulation of a sweater. Both solve the same shape control problem by reducing the shapes' Fourier-based spectral energy. Experiment 1.1 makes use of homogeneous contour mapping whereas Exp. 1.2 uses an elastic mapping that considers resemblance of local geometric features (i.e., \(E_{\mathrm{FT}}\) in (6)). The results are analysed through \(E_{\mathrm{MSN}}\) to validate it as a deformation measure. On the top, the initial shape state \(\gamma(s,t_0)\) (red dashed line), the shape evolution \(\gamma(t)\) (red line) and the target shape \(\bar{\gamma}\) (blue dashed line) are shown with the contour maps: thin lines linking the contours in red for the homogeneous map (Exp. 1.1) and in gray for the elastic map \(\Pi(s,t)\) (Exp. 1.2). Three relevant frames per experiment are shown (bottom left), corresponding to the time instants framed in black rectangles on the top sequences. The first plot at the bottom (centre) shows the evolution of the shape error expressed in terms of \(E_{\mathrm{MSN}}(\gamma(t),\bar{\gamma})\). The second plot (bottom right) shows the evolution of the deformation cost \(E_{\mathrm{MSN}}(\gamma(t_0),\gamma(t))\) with respect to the shape error \(E_{\mathrm{MSN}}(\gamma(t),\bar{\gamma})\). In both plots, the minimum achieved error values are plotted with dashed lines. Metric \(E_{\mathrm{MSN}}\) properly represents the shape error evolution and the deformation path followed by each process. The homogeneous map deviates the shape evolution in Exp. 1.1 towards a higher deformation path (the left sleeve folds inconveniently), thus hindering convergence to the desired shape. Cost \(E_{\mathrm{MSN}}(\gamma(t_0),\gamma(t))\) reflects this fact: the red line lies above the green line throughout the whole error evolution (second plot).

2.3 Extrinsically-defined shape energies

In this section, we revisit general definitions of the conventional extrinsic shape energies \(E_{\mathrm{P2P}}\), used in (López-Nicolás et al., 2020), (Berenson, 2013), (Cuiral-Zueco and López-Nicolás, 2021), (Shetab-Bushehri et al., 2022) or (Aranda et al., 2020), and \(E_{\mathrm{FT}}\), used in (Navarro-Alarcon and Liu, 2017). We endow them with elastic maps in order to fairly compare their performance with our proposed energy \(E_{\mathrm{MSN}}\) in upcoming sections.

The point-to-point (P2P) error, endowed with an elastic map \(\Pi\), can be formulated as: \[ \mathbf{e}_{\mathrm{P2P}}(s,t)=\mathbf{x}(s,t)-\left(\mathbf{R}^*\bar{\mathbf{x}}(\Pi(s,t))+\mathbf{t}^*\right), \tag{4} \] where \(\mathbf{x}(s,t),\bar{\mathbf{x}}(\Pi(s,t))\in\mathbb{R}^2\) are the extrinsic coordinates of the current and target contour points respectively. Transform \((\mathbf{R}^*, \mathbf{t}^*)\) is the orthogonal Procrustes rigid transform (Al-Aifari et al., 2013) that, by removing the rigid translation and rotation component from the point-to-point error, minimises the energy: \[ E_{\mathrm{P2P}}(\gamma(t),\bar{\gamma})=\min_{\mathbf{R}^*,\mathbf{t}^*,\Pi} \int_{0}^{l(t)}\mathbf{e}_{\mathrm{P2P}}^\intercal\mathbf{e}_{\mathrm{P2P}}\,\mathrm{d}s. \tag{5} \] Alternatively, energy \(E_{\mathrm{FT}}\) is defined in the frequency domain through the Fourier Transform. Consider the complex function \(z(s)=x(s)+y(s)i\), where the \(x(s)\) coordinates of \(\gamma(s)\) constitute the real term and the \(y(s)\) coordinates the imaginary term. The complex Fourier coefficients \(c_n,\bar{c}_n\) obtained from \(z(s),\bar{z}(\Pi(s))\) lead to the Fourier-based energy: \[ E_{\mathrm{FT}}(\gamma(t),\bar{\gamma})=\min_{\Pi}\sum_{n=-\infty}^{\infty}\left | c_n-\bar{c}_n \right |^2. \tag{6} \] Note that the summation bounds in (6) can be truncated, which filters out high-frequency (noisy) components.
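For a fixed correspondence, the rigid transform in (4) has a closed-form solution. The sketch below uses the standard SVD-based (Kabsch) solution of the orthogonal Procrustes problem restricted to rotations, then evaluates the integrand of (5) as a Riemann sum. It assumes the map \(\Pi\) has already been applied, so that row `i` of `x_bar` is the target point matched to row `i` of `x`. The names `procrustes_p2p`, `x`, `x_bar` and `ds` are hypothetical.

```python
import numpy as np

def procrustes_p2p(x, x_bar, ds=1.0):
    """P2P energy for matched samples after removing the best rigid transform.

    x, x_bar: (N, 2) arrays of matched points (map already applied).
    ds:       arc-length step used to approximate the integral in (5).
    Returns (energy, rotation, translation).
    """
    x = np.asarray(x, dtype=float)
    x_bar = np.asarray(x_bar, dtype=float)
    mu_x, mu_y = x.mean(axis=0), x_bar.mean(axis=0)
    xc, yc = x - mu_x, x_bar - mu_y
    # Cross-covariance between centred target and current points, and its SVD.
    u, _, vt = np.linalg.svd(yc.T @ xc)
    # Force det(rotation) = +1 so that reflections are excluded.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, d]) @ u.T
    trans = mu_x - rot @ mu_y
    err = x - (x_bar @ rot.T + trans)   # error (4) at every sample
    energy = float(np.sum(err * err) * ds)
    return energy, rot, trans
```

If `x` is an exact rigid copy of `x_bar`, the returned energy is zero up to rounding, which is the sense in which (5) ignores rigid motion.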

Shape energies \(E_{\mathrm{P2P}}\) and \(E_{\mathrm{FT}}\), defined in (5) and (6) respectively, are equivalent (up to a Procrustes transform).

The proof follows from the direct application of Parseval's Theorem (Parseval, 1806) to (6): \[ \begin{split} \sum_{n=-\infty}^{\infty}\left | c_n-\bar{c}_n \right |^2 &=\int_{0}^{l(t)}\left | z(s)-\bar{z}(\Pi(s)) \right |^2 \mathrm{d}s \\ &=\int_{0}^{l(t)}\| \mathbf{x}(s)-\mathbf{\bar{x}}(\Pi(s)) \|^2 \mathrm{d}s= \int_{0}^{l(t)}\mathbf{e}_{\mathrm{P2P}}^\intercal\mathbf{e}_{\mathrm{P2P}}\,\mathrm{d}s. \end{split} \tag{7} \]
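The identity in (7) can be checked numerically on sampled contours. The sketch below computes discrete Fourier coefficients with a \(1/N\) normalisation (an assumption of this example), under which the discrete Parseval identity equates the full spectral sum with the mean squared point-to-point distance, i.e. the integral in (7) divided by the contour length for uniform sampling. The Procrustes transform is omitted, the correspondence is taken as given, and the function names are hypothetical.

```python
import numpy as np

def fourier_coeffs(points):
    """Complex Fourier coefficients of z = x + i*y, normalised by N."""
    pts = np.asarray(points, dtype=float)
    z = pts[:, 0] + 1j * pts[:, 1]
    return np.fft.fft(z) / len(z)

def fourier_energy(points, points_bar, n_max=None):
    """Sum in (6) for matched samples; optionally truncated to |n| <= n_max."""
    diff = fourier_coeffs(points) - fourier_coeffs(points_bar)
    if n_max is not None:
        # Integer harmonic index n associated with each FFT bin.
        harmonics = np.fft.fftfreq(len(diff), d=1.0 / len(diff))
        diff = diff[np.abs(harmonics) <= n_max]
    return float(np.sum(np.abs(diff) ** 2))

def mean_squared_p2p(points, points_bar):
    """Mean squared point-to-point distance between matched samples."""
    d = np.asarray(points, dtype=float) - np.asarray(points_bar, dtype=float)
    return float(np.mean(np.sum(d * d, axis=1)))
```

With `n_max=None` the two quantities agree up to rounding. With a finite `n_max` the spectral energy can only decrease, since the truncation drops non-negative terms.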

2.4 Validation of $E_{\mathrm{MSN}}$ for OCSC

In this section we validate \(E_{\mathrm{MSN}}\), through experiments and simulations, as a metric that quantifies deformation and as a shape error that produces lower deformation paths in shape control.

The experiments presented throughout this paper are performed on the ABB Yumi dual-arm robot using colour-based object segmentation (in CIELAB colour space) and \(\alpha\)-shape contour extraction. The continuous elastic contour map \(\Pi(s,t)\) is computed as in (Cuiral-Zueco and López-Nicolás, 2021) and then sampled to interpolate values of \(\bar{\mathbf{l}}(\bar{s},r)\) in (2) with sub-pixel resolution. We avoid using any specific Jacobian update rules (as in (Zhu et al., 2021) or (Navarro-Alarcon and Liu, 2017)) in order to compare the different energies as consistently as possible: Jacobians are experimentally estimated throughout the whole process in all the experiments.

To illustrate the performance of \(E_{\mathrm{MSN}}\) as a deformation measure, we use it to analyse the results of experiments 1.1 and 1.2 (Fig. 3). Both experiments involve the reduction of the shapes' Fourier-based spectral energy. However, Exp. 1.1 uses a homogeneous contour mapping, whereas Exp. 1.2 uses elastic mapping (i.e., \(E_{\mathrm{FT}}\) in (6)). As expected, the homogeneous mapping in Exp. 1.1 generates a larger deformation process. The proposed \(E_{\mathrm{MSN}}\), used to analyse the results, properly identifies such deviation.

Given the equivalence of \(E_{\mathrm{P2P}}\) and \(E_{\mathrm{FT}}\) shown in (7), either \(E_{\mathrm{P2P}}\) in (5) or \(E_{\mathrm{FT}}\) in (6) serves to illustrate the importance of \(E_{\mathrm{MSN}}\) being an intrinsic and extrinsic metric for OCSC. In order to provide better insight into the characteristics of \(E_{\mathrm{MSN}}\) as an object-compliant shape metric, a comparison between two simulations is illustrated in Fig. 4. The first simulation seeks to reduce the extrinsic energy \(E_{\mathrm{P2P}}(\gamma(t),\bar{\gamma})\) whereas the second simulation reduces \(E_{\mathrm{MSN}}(\gamma(t),\bar{\gamma})\). The control strategy based on the reduction of \(E_{\mathrm{MSN}}\) leads to lower measured deformation, as illustrated in the plots of Fig. 4. Moreover, the deformation cost expressed in terms of \(E_{\mathrm{MSN}}(\gamma(t_0),\gamma(t))\) is representative of the deformation values obtained from the simulation (unlike the deformation cost expressed as \(E_{\mathrm{P2P}}(\gamma(t_0),\gamma(t))\)).

Figure 4. Two simulations illustrate the performance of \(E_{\mathrm{MSN}}\) as shape error and deformation cost. A shape control problem is solved using two strategies: one reduces \(E_{\mathrm{P2P}}(\gamma(t),\bar{\gamma})\), the other reduces \(E_{\mathrm{MSN}}(\gamma(t),\bar{\gamma})\). On the top sequences, the simulation deformation mesh (red triangles) and the gripper (black square) are shown along with other elements introduced in the description of Fig. 3. On the bottom, three plots illustrate the performance of \(E_{\mathrm{MSN}}\) as deformation energy by comparing it to the purely extrinsic energy \(E_{\mathrm{P2P}}\). The first plot shows, for both control strategies, the deformation path that shapes follow according to \(E_{\mathrm{P2P}}\), i.e., it plots deformation cost \(E_{\mathrm{P2P}}(\gamma(t_0),{\gamma}(t))\) with respect to shape error \(E_{\mathrm{P2P}}(\gamma(t),\bar{\gamma})\) (the \(E_{\mathrm{MSN}}\)-based control, in green, leads to a larger cost path). The plot in the middle shows the deformation path in terms of \(E_{\mathrm{MSN}}\) (the \(E_{\mathrm{MSN}}\)-based control, in green, leads to a significantly lower deformation path). The third plot validates the information provided by the \(E_{\mathrm{MSN}}\)-based analysis by showing the evolution of the actual deformation with respect to time. That is, the stretch measured in the simulation (computed as the sum of the absolute values of the length variations of the mesh's segments) is significantly larger for the strategy that reduces \(E_{\mathrm{P2P}}\).
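The stretch measure described in the caption of Fig. 4 (sum of the absolute values of the length variations of the mesh segments) can be sketched as follows. The mesh representation used here, a vertex array plus an array of index pairs, and the function name `mesh_stretch` are assumptions made for illustration. The simulator's actual data structures are not specified in the text.

```python
import numpy as np

def mesh_stretch(verts_ref, verts_now, edges):
    """Sum over mesh segments of |current length - reference length|.

    verts_ref, verts_now: (V, 2) vertex positions at t0 and at time t.
    edges:                (E, 2) integer array of vertex index pairs.
    """
    verts_ref = np.asarray(verts_ref, dtype=float)
    verts_now = np.asarray(verts_now, dtype=float)
    e = np.asarray(edges, dtype=int)
    len_ref = np.linalg.norm(verts_ref[e[:, 0]] - verts_ref[e[:, 1]], axis=1)
    len_now = np.linalg.norm(verts_now[e[:, 0]] - verts_now[e[:, 1]], axis=1)
    return float(np.sum(np.abs(len_now - len_ref)))
```

By construction this quantity is zero for any rigid motion of the mesh and grows with both elongation and compression of individual segments.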

To further illustrate how our shape metric \(E_{\mathrm{MSN}}(\gamma(t),\bar{\gamma})\) inherently reduces not only the compression/extension strain but also the bending processes, in Fig. 5 we present Exp. 2.1 and Exp. 2.2. They involve the manipulation of an Ethernet cable that cannot be stretched or compressed (pure isometric deformations). Strategies based on the reduction of \(E_{\mathrm{P2P}}(\gamma(t),\bar{\gamma})\) and \(E_{\mathrm{MSN}}(\gamma(t),\bar{\gamma})\) are compared again in Exp. 2.1 and Exp. 2.2 respectively. This particular shape control problem constitutes a clear example of how seeking a local minimum in extrinsic energies such as \(E_{\mathrm{P2P}}(\gamma(t),\bar{\gamma})\) can lead to very large deformation processes that may even result in object self-intersections: see the high deformation and self-intersection of the cable in Exp. 2.1. On the other hand, the joint nature of \(E_{\mathrm{MSN}}(\gamma(t),\bar{\gamma})\) allows the cable to be untangled and a proper solution to be achieved in Exp. 2.2. In Exp. 2.2 the intrinsic, SE(2)-invariant nature of \(E_{\mathrm{MSN}}\) seeks to match the target contour's curvature regardless of its rigid configuration (position and orientation). That is, the strategy in Exp. 2.2 seeks object shape control rather than object position control, as Exp. 2.1 does.

Figure 5. Two experiments illustrate the importance of the joint nature (both intrinsic and extrinsic) of \(E_{\mathrm{MSN}}(\gamma(t),\bar{\gamma})\) with respect to extrinsic energies such as \(E_{\mathrm{P2P}}(\gamma(t),\bar{\gamma})\). An Ethernet cable that can only be deformed isometrically needs to be untangled in order to achieve the shape control task (the gripper on the left is fixed). Control laws in experiments 2.1 and 2.2 seek to reduce \(E_{\mathrm{P2P}}(\gamma(t),\bar{\gamma})\) and \(E_{\mathrm{MSN}}(\gamma(t),\bar{\gamma})\), respectively. Elements of the shape sequences (and the video frames) are introduced in the description of Fig. 3. For both experiments, the first plot (centre) shows the evolution of \(E_{\mathrm{MSN}}(\gamma(t),\bar{\gamma})\) and the second plot (right) shows the evolution of \(E_{\mathrm{P2P}}(\gamma(t),\bar{\gamma})\). Reduction of \(E_{\mathrm{P2P}}\) (extrinsic) in Exp. 2.1 pursues a local minimum that twists the cable, leading to large deformations, oscillations and self-occlusions that hinder the control process. Reducing \(E_{\mathrm{MSN}}\) in Exp. 2.2 allows the cable to be untangled and a proper final shape to be achieved. The evolution of the \(E_{\mathrm{P2P}}\) energy of Exp. 2.2 (green plot on the right) shows the need to escape a local minimum, i.e., to temporarily increase energy \(E_{\mathrm{P2P}}\).