Siamese Object Tracking for Vision-Based UAM Approaching with Pairwise Scale-Channel Attention


Visually approaching the target object is crucial to the subsequent manipulation performed by an unmanned aerial manipulator (UAM). Although manipulation methods have been widely studied, vision-based UAM approaching generally lacks an efficient design. The key to visual UAM approaching lies in object tracking, yet current approaching generally relies on costly model-based methods. Moreover, UAM approaching often faces more severe object scale variation, which makes it inappropriate to directly employ state-of-the-art model-free Siamese-based methods from the object tracking field. To address these problems, this work proposes a novel Siamese network with pairwise scale-channel attention (SiamPSA) for vision-based UAM approaching. Specifically, SiamPSA consists of a scale attention network (SAN) and a scale-aware anchor proposal network (SA-APN). SAN acquires valuable scale information for feature processing, while SA-APN mainly attaches scale awareness to anchor proposing. In addition, a new tracking benchmark for UAM approaching, namely UAMT100, is recorded with 35K frames on a flying UAM platform for evaluation. Exhaustive experiments on the benchmark and real-world tests validate the efficiency and practicality of SiamPSA with a promising speed. Both the code and the UAMT100 benchmark are now available at

In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Kyoto, Japan, pp. 1-8, 2022.

Figure (SiamPSA workflow): An overview of the proposed Siamese tracking with pairwise scale-channel attention (SiamPSA) for UAM approaching.

Guangze Zheng
PhD Candidate in Computer Science at HKU, China