1. Introduction — Contour based object-compliant shape control

Shape control is required in multiple applications such as industrial processes or domestic robotics (Jiménez, 2012)(Herguedas et al., 2019), where preserving the object's integrity can be critical for the success of the manipulation task. Explicitly considering and analysing the extent to which an object is deformed during a shape control task constitutes an important challenge. The amount of deformation the object undergoes may be disregarded when the object tolerates a large range of deformations. However, we believe that avoiding unnecessary deformations is a more reliable approach. One may suggest the use of mechanical sensors (e.g. strain gauges, torque sensors) in order to avoid reaching the mechanical limits of the object (e.g. its elastic limit). However, mechanical sensors rely on costly and/or object-invasive setups and do not provide full coverage of the object, as positioning a large number of mechanical sensors may be inconvenient in most applications. Therefore, we propose a vision-based object-compliant shape control (OCSC) framework to reduce the amount of deformation the object needs to undergo to acquire the target shape (see Fig. 1).


**Figure 1.** Comparison of two different solutions for the same shape control problem. The first one only considers extrinsic error between contour points whereas the second one defines the shape error through our proposed shape metric, resulting in a more object-compliant process (elements of the figure introduced in section 1.3).

**Figure 2.** Illustration of the problem setup and general shape control scheme. Object surfaces constitute the current and target shapes through their parameterised contour curves \(\gamma(s,t),\bar{\gamma}(\bar{s})\), with extrinsic representation through 2D coordinates \(\mathbf{x}(s)\), \(\bar{\mathbf{x}}(\bar{s})\). A contour map between \(\gamma(s,t)\) and \(\bar{\gamma}(\bar{s})\) is computed, serving as input for the shape control strategy that defines the robot actions \(\mathbf{u}_g\) to be applied to the object. In this paper, we use Jacobian-based shape energy minimisation as the shape control strategy. At each contour point \(s\), a local reference \(\Re(s)\) is defined through the tangent and normal vectors at \(s\) (see the detail on the right, where a geodesic ball \(B(s,r)\) of radius \(r\) is shown).

Some approaches in the vision-based shape control literature, without a focus on OCSC, use a reduced number of shape features (such as feature points or segments) to perform the shape control task (e.g., (Navarro-Alarcon et al., 2016; Navarro-Alarcón et al., 2013)). This can be useful in certain applications. However, in these frameworks, features are blind to object parts that are not within their range of description, and thus they are not suitable for OCSC, where a holistic analysis considers all of the object parts. Methods such as those presented in (Zhu et al., 2021; López-Nicolás et al., 2020) or (Navarro-Alarcon and Liu, 2017) analyse the object's shape in a global manner through its 2D visible silhouette and the use of homogeneous contour mappings; they are purely based on extrinsic errors^[1]. However, despite globally sampling the object's geometry, some control methods require filtering the shape's information (e.g., (Zhu et al., 2021), based on Principal Component Analysis). Through filtering, such methods might disregard local yet critical deformation processes. We therefore focus on analyses like those proposed in (López-Nicolás et al., 2020), based on point-to-point error, or (Navarro-Alarcon and Liu, 2017), based on frequency domain error, as they provide a holistic error definition where filtering is not necessarily required from a theoretical standpoint.
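To make the two extrinsic error definitions mentioned above concrete, the following is a minimal sketch and not the formulation of the cited works. It assumes both contours are sampled with the same number of points and that a point correspondence is already given by the sample index. The function names and the `n_harmonics` parameter are hypothetical, introduced only for illustration.

```python
import numpy as np

def p2p_energy(x, x_bar):
    """Point-to-point energy: half the sum of squared distances between
    corresponding contour samples. x, x_bar: arrays of shape (N, 2)."""
    return 0.5 * float(np.sum((x - x_bar) ** 2))

def fourier_energy(x, x_bar, n_harmonics):
    """Frequency-domain energy: half the squared difference between the
    low-order Fourier coefficients of the two contours (assumed form).
    Requires N > 2 * n_harmonics + 1 samples per contour."""
    # Represent each 2D contour as a complex signal z = x + i*y.
    z = x[:, 0] + 1j * x[:, 1]
    z_bar = x_bar[:, 0] + 1j * x_bar[:, 1]
    c = np.fft.fft(z) / len(z)
    c_bar = np.fft.fft(z_bar) / len(z_bar)
    # Keep harmonics -n_harmonics..n_harmonics (negative ones sit at the end).
    k = n_harmonics
    keep = np.concatenate([np.arange(0, k + 1), np.arange(-k, 0)])
    return 0.5 * float(np.sum(np.abs(c[keep] - c_bar[keep]) ** 2))
```

Both energies depend only on where the contour points lie in the image plane. Neither carries information about how much the object itself has to deform to move from one shape to the other.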

Some characteristic examples of point-to-point (P2P) and frequency domain defined shape errors (or shape energies) are presented in (López-Nicolás et al., 2020) and (Navarro-Alarcon and Liu, 2017). Other methods tackle the reduction of the shape error defined by such energies while also incorporating additional terms that consider deformation cost. In particular, (Berenson, 2013), (Ruan et al., 2018) and (Hu et al., 2018) aim at reducing the P2P energy along with deformation costs associated with the object's strain. Deformation costs in (Berenson, 2013) and (Ruan et al., 2018) are based on the comparison of Euclidean point distances with the geodesic distances between the at-rest object points. This deformation cost can be calculated directly in simulated processes where point correspondences between object states are known. However, in a real setup, such a cost requires a mapping between object states. Rather than computing a mapping, the method in (Ruan et al., 2018) approximates the derivative of the strain cost with the misalignment (i.e., cosine value) between the grippers' velocity vectors and the tangent vectors (at the grippers' positions) to the geodesic paths defined between pairs of grippers. This approximation assumes pure tensile stress along geodesics between grippers, and assumes that deformations around the grippers are representative of deformations on the whole object. Another geodesic-distance based deformation cost is proposed in (Hu et al., 2018), where the variations in local distances between landmark points are considered. Similarly to (Berenson, 2013), (Hu et al., 2018) relies on tracking specific object points and disregards bending stress and deformations that do not significantly change geodesic distances.
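The idea of comparing Euclidean distances with at-rest geodesic distances can be sketched as follows. This is an illustration under strong assumptions, not the cost used in the cited works: the object is taken to be a polyline whose points are tracked (correspondences known), and the squared-excess penalty and both function names are hypothetical choices.

```python
import numpy as np

def rest_geodesic_distances(rest_points):
    """Pairwise distances measured along the polyline in its at-rest state.
    rest_points: array of shape (N, 2), ordered along the object."""
    seg = np.linalg.norm(np.diff(rest_points, axis=0), axis=1)
    arc = np.concatenate(([0.0], np.cumsum(seg)))
    return np.abs(arc[:, None] - arc[None, :])

def stretch_cost(points, rest_geodesic):
    """Penalise point pairs whose current Euclidean distance exceeds their
    at-rest geodesic distance (assumed squared-excess penalty)."""
    diff = points[:, None, :] - points[None, :, :]
    euclid = np.linalg.norm(diff, axis=-1)
    excess = np.maximum(euclid - rest_geodesic, 0.0)
    # The pairwise matrix counts every pair twice, hence the factor 0.5.
    return 0.5 * float(np.sum(excess ** 2))
```

With this form, stretching a straight strip yields a positive cost, whereas bending it without changing its segment lengths yields zero cost. This matches the limitation noted above: a cost based only on geodesic distances is blind to bending.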

A challenging problem in vision-based OCSC is the definition of a holistic shape metric that, without depending on the object's texture, generates low deformations between object states (rather than relying on additional deformation costs and/or constraints). Deformation costs and constraints can be used to support such a metric. However, neither the shape metric nor the additional constraints/costs should disregard changes in curvature: pronounced changes in curvature imply large bending stress while not producing significant changes in the object's geodesic distances.
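The point about curvature can be checked numerically with a small sketch. It is not part of the proposed method. It builds a straight polyline and a bent one from segments of identical length, so that distances along the object are equal by construction, and then compares a simple turning-angle estimate of curvature. All function names are hypothetical.

```python
import numpy as np

def polyline_from_headings(headings, h):
    """Chain segments of length h, each pointing along the given heading angle."""
    steps = h * np.stack([np.cos(headings), np.sin(headings)], axis=1)
    return np.vstack([[0.0, 0.0], np.cumsum(steps, axis=0)])

def arc_length(points):
    """Total length measured along the polyline."""
    return float(np.sum(np.linalg.norm(np.diff(points, axis=0), axis=1)))

def discrete_curvature(points):
    """Turning angle at each interior vertex divided by the mean length of
    the two adjacent segments (a simple discrete curvature estimate)."""
    d = np.diff(points, axis=0)
    seg = np.linalg.norm(d, axis=1)
    angle = np.arctan2(d[:, 1], d[:, 0])
    turn = np.diff(angle)
    turn = (turn + np.pi) % (2.0 * np.pi) - np.pi  # wrap to [-pi, pi)
    return turn / (0.5 * (seg[:-1] + seg[1:]))

# Same segment length and count, different headings: straight vs. circular arc.
h, n, radius = 0.02, 50, 0.5
straight = polyline_from_headings(np.zeros(n), h)
bent = polyline_from_headings(np.arange(n) * h / radius, h)
```

Here `arc_length` is the same for `straight` and `bent`, while `discrete_curvature` is zero on the first and close to `1 / radius` on the second. A length-based deformation measure therefore cannot tell the two states apart.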

In this paper we present a vision-based OCSC framework for deformable objects that lack visual texture (i.e., objects whose surface points cannot be tracked). Other approaches that tackle OCSC either rely on objects with rich texture (e.g., (Berenson, 2013; Hu et al., 2018)) or confine their analysis to the object regions that are close to the grippers (e.g., (Ruan et al., 2018)). Our proposed method analyses the visible contour of the object in order to quantify the amount of deformation it undergoes. We focus on shape control processes involving slow and isotropic deformations, which allows inertia to be disregarded. The main contributions of this paper are:

- A geometry-based deformation energy that constitutes a shape metric. This metric considers the object's geometric features at multiple scales and thus quantifies deformations in a holistic manner. Furthermore, when directly used as the shape control error, the proposed metric inherently leads to more object-compliant behaviours than other conventional metrics (e.g., metrics as in (López-Nicolás et al., 2020) or (Navarro-Alarcon and Liu, 2017)). We validate our shape metric as a deformation measure and as an object-compliant energy through comparisons in simulations and experiments.
- An OCSC framework that extends the use of the proposed shape metric and allows the introduction of deformation costs and constraints that consider changes in both length (i.e., tensile and compression stress) and curvature (i.e., bending stress) of texture-less objects. In the literature, approaches such as (Berenson, 2013), (Ruan et al., 2018) or (Hu et al., 2018) consider excessive stretching but disregard deformations induced by pure compression or bending.

The proposed object-compliant metric is introduced in section 2. We develop the OCSC framework in section 3. The performance of the framework is illustrated in section 4, where we present comparative experiments using a dual-arm manipulator (see the attached video).

Consider 2D visual contour data extracted from video frames (see Fig. 2), that is, a curve \(\gamma(s,t)\), parameterised by \(s\in \mathbb{R}\), obtained from the texture-less object being manipulated. Similarly, one can define the fixed target shape through a curve \(\bar{\gamma}(\bar{s})\), parameterised by \(\bar{s}\in \mathbb{R}\). Contour points have extrinsic global coordinates \(\mathbf{x}(s),\bar{\mathbf{x}}(\bar{s})\in\mathbb{R}^2\). We define local frames of reference \(\Re(s), \bar{\Re}(\bar{s})\) on \(\gamma(s,t), \bar{\gamma}(\bar{s})\) with orthogonal axes in the tangent and the normal space. The object is grasped by robotic grippers, for which we assume proper grasping stability. Grippers \(g\) do not need to be visible or placed along the contour. They are modelled with single integrator dynamics and perform 3 DoF actions \(\mathbf{u}_g\in \mathbb{R}^2\times S^1\) (2 translations and 1 rotation).
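The setup above can be sketched in discrete form. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the contour is assumed closed and given as ordered samples of \(\mathbf{x}(s)\), tangents are estimated by central differences, and the single integrator is advanced with an explicit Euler step. The function names and the time step `dt` are hypothetical.

```python
import numpy as np

def local_frames(x):
    """Tangent and normal unit vectors at each sample of a closed contour.
    x: array of shape (N, 2), ordered along the contour."""
    # Central differences with wrap-around, since the contour is closed.
    dx = np.roll(x, -1, axis=0) - np.roll(x, 1, axis=0)
    tangent = dx / np.linalg.norm(dx, axis=1, keepdims=True)
    # Normal obtained by rotating the tangent by +90 degrees.
    normal = np.stack([-tangent[:, 1], tangent[:, 0]], axis=1)
    return tangent, normal

def integrate_gripper(pose, u, dt):
    """One Euler step of single integrator dynamics for a planar gripper.
    pose = (x, y, theta), u = (v_x, v_y, omega)."""
    new_pose = pose + dt * u
    # Keep the orientation on the circle S^1, represented in [-pi, pi).
    new_pose[2] = (new_pose[2] + np.pi) % (2.0 * np.pi) - np.pi
    return new_pose
```

For a contour sampled counter-clockwise, the normal defined this way points towards the interior. The opposite sign convention is equally valid as long as it is used consistently for the current and target contours.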


1.1 Related work

1.2 Object-compliant method overview

1.3 Problem setup