Comparison and usage of multiple algorithms for Single Object Tracking in ROS


Topics: Open OS, Open software libraries, Edge AI

Object detection and tracking are among the most popular computer vision tasks. Detection consists of identifying and locating objects of a certain class in digital images and is widely used in traffic control, autonomous shops, autonomous vehicles, etc. Tracking is the process of establishing the location of a moving object over time using consecutive frames from camera input.

Although a detector can be used for tracking, doing so can be computationally expensive and is limited to objects that the detector can classify. Object tracking, on the other hand, is class-agnostic: it can follow any object that was marked for tracking, focusing only on that object by looking for its distinctive features and matching them in consecutive frames.

Object tracking can be applied in a variety of areas, e.g. augmented reality, video editing, traffic control, security and surveillance, or robotics. In robotics, it can help a robot keep track of objects of interest while the viewpoint changes due to the robot’s or the target’s movement. Such algorithms must be fast enough to detect sudden changes in the observed environment, so that the system can respond quickly and does not lose track of fast-moving objects.

One example of AI-assisted tracking applied to a practical industrial problem is Antmicro’s industrial 3D camera platform, which can identify and follow valuable ore in a rock sorting system deployed as part of the X-MINE project.

While there are several frameworks that aim to unify the training and evaluation of trackers, it is hard to find a tool that acts as an interface between a larger, complex system and a particular tracker implementation, enables switching between algorithms at runtime and lets the user compare the results in a real-world deployment.

Robot Operating System (ROS)


A very popular framework for coupling together various data inputs and outputs (with processing in between) is the somewhat misleadingly named Robot Operating System (ROS). Rather than an operating system like Linux or Zephyr, ROS is an open source framework originally designed to run on top of Linux (and, more recently, also on RTOSes). It provides a set of libraries and tools that simplify the creation of robotics and vision systems.

ROS allows both industrial and academic users to create large projects made up of multiple subprograms (called nodes) and provides an API for communication between them, including distribution across many devices. Its wide adoption and useful tooling integrations make it an interesting interoperability layer for software originating from multiple sources.

ROS tracker manager

There are various tracking algorithms with different architectures, trained on different datasets and with different properties, e.g. robustness to different object types and varying backgrounds, or a different trade-off between speed and quality.

Such diversity calls for enabling the easy deployment of various trackers, with a possibility to seamlessly switch between various algorithms and compare their applicability for a given task. What is more, combining various trackers and analyzing their output together could lead to more robust tracking.

This is one of the scenarios where ROS’ distributed nature comes in especially useful. Since we build edge AI systems which sometimes span many processing units and/or devices, Antmicro’s AI team routinely uses ROS for data fusion and for combining various processing nodes in a more abstract, modular fashion.

For the object tracking use case, we decided to create a convenient and easy-to-use tracker manager tool to switch between different trackers and their configurations during runtime, with a handy API that would enable easy deployment and comparison of various trackers, as well as a scoring system that can compute final bounding boxes based on multiple trackers. Regardless of the original programming language of the implementation, adding a new tracker is simplified to implementing functions for initializing the tracker and processing consecutive frames.
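The two-function contract described above could look roughly like the sketch below. This is only an illustration: the `Tracker` base class, the method names `init` and `update`, the `(x, y, width, height)` box layout and the toy `StaticTracker` are all assumptions, not the tracker manager's actual API.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional, Tuple

# (x, y, width, height) in pixels; this box layout is an assumption.
BBox = Tuple[int, int, int, int]


class Tracker(ABC):
    """Hypothetical two-method contract a tracker wrapper would implement."""

    @abstractmethod
    def init(self, frame: Any, bbox: BBox) -> None:
        """Start tracking the object marked by bbox in the first frame."""

    @abstractmethod
    def update(self, frame: Any) -> Optional[BBox]:
        """Return the object's box in the next frame, or None if it is lost."""


class StaticTracker(Tracker):
    """Toy tracker that always reports the initial box (illustration only)."""

    def __init__(self) -> None:
        self._bbox: Optional[BBox] = None

    def init(self, frame: Any, bbox: BBox) -> None:
        self._bbox = bbox

    def update(self, frame: Any) -> Optional[BBox]:
        return self._bbox
```

A wrapper around a real tracker would follow the same shape: store whatever state the algorithm needs in `init`, then run one tracking step per call to `update`.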

Inference based on the results of many trackers

Initially, this tool was created just to switch between different tracking algorithms and compare their results, but it was later redesigned to run several trackers simultaneously and combine their results to improve tracking quality. The final architecture of the tracker manager is shown in the diagram below.

ROS tracking diagram

In this solution, communication between the manager and the tracking nodes is based on so-called topics (a ROS concept). The tracker manager publishes consecutive frames on one topic, then each tracker processes the frame and returns a response to its separate topic as fast as it can. Finally, the tracker manager gathers all responses from topics and combines them to return the final bounding box.
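The message flow described above can be mimicked with a tiny in-process publish/subscribe bus. The sketch deliberately does not use the ROS API: `TopicBus`, `attach_tracker`, `Manager` and the topic names are hypothetical stand-ins, and delivery here is synchronous, whereas real ROS nodes respond asynchronously and at their own pace.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List


class TopicBus:
    """Tiny in-process stand-in for ROS topics (not the ROS API)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[Any], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: Any) -> None:
        # Deliver the message to every callback registered on this topic.
        for callback in list(self._subscribers[topic]):
            callback(message)


def attach_tracker(bus: TopicBus, name: str, update: Callable[[Any], Any]) -> None:
    """Subscribe a tracker to the shared frame topic; it answers on its own topic."""
    def on_frame(frame: Any) -> None:
        bus.publish(f"/trackers/{name}/bbox", update(frame))
    bus.subscribe("/manager/frame", on_frame)


class Manager:
    """Publishes frames on one topic and gathers one response per tracker topic."""

    def __init__(self, bus: TopicBus, tracker_names: List[str]) -> None:
        self._bus = bus
        self.responses: Dict[str, Any] = {}
        for name in tracker_names:
            bus.subscribe(f"/trackers/{name}/bbox", self._make_handler(name))

    def _make_handler(self, name: str) -> Callable[[Any], None]:
        def handler(bbox: Any) -> None:
            self.responses[name] = bbox
        return handler

    def process(self, frame: Any) -> Dict[str, Any]:
        self.responses = {}  # forget answers from the previous frame
        self._bus.publish("/manager/frame", frame)
        return dict(self.responses)
```

Because every tracker answers on a separate topic, trackers can be added or removed without touching the manager's publishing side, which is what makes runtime switching straightforward.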

To combine the results from all trackers, the tracker manager uses a scoring mechanism that rewards fast and reliable trackers and penalizes slow and inaccurate ones. The overall score is built from multiple criteria, such as response time, certainty and credibility.
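One plausible way to turn such criteria into a final bounding box is a score-weighted average of the individual boxes. The sketch below is an assumption made for illustration, not the tracker manager's actual formula: the `Response` fields, the multiplication of the three criteria and the `max_time` cutoff are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

BBox = Tuple[float, float, float, float]  # (x, y, width, height), assumed layout


@dataclass
class Response:
    bbox: BBox
    response_time: float  # seconds the tracker needed for this frame
    certainty: float      # tracker's own confidence, assumed to be in [0, 1]
    credibility: float    # manager's trust in the tracker, assumed in [0, 1]


def score(r: Response, max_time: float = 0.1) -> float:
    """Hypothetical score: faster answers and higher confidence weigh more."""
    speed = max(0.0, 1.0 - r.response_time / max_time)  # 0 once max_time is reached
    return speed * r.certainty * r.credibility


def combine(responses: List[Response], max_time: float = 0.1) -> Optional[BBox]:
    """Score-weighted average of all boxes; None if no tracker scored above 0."""
    weights = [score(r, max_time) for r in responses]
    total = sum(weights)
    if total <= 0.0:
        return None
    x, y, w, h = (
        sum(wt * r.bbox[i] for wt, r in zip(weights, responses)) / total
        for i in range(4)
    )
    return (x, y, w, h)
```

With this choice, a tracker that answers too slowly or reports zero certainty simply drops out of the average rather than skewing the result.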

Further work

The modular and easily extensible nature of ROS makes it convenient for researching the practical implications of combining various building blocks and approaches within a standard toolkit and API framework. One development currently in progress in this project, which would probably take much longer to bear fruit with a different approach, is the use of semantic segmentation to further improve the tracking results. Segmentation could help recognize and subsequently ignore changes in the background that may mislead some trackers, and trackers that lose track of the object against such a background would be penalized. The plug-and-play nature of ROS makes this possible at a faster pace, with better comprehension and less uncertainty.

Modularity and interoperability are among the reasons we rely on open source standards in our day-to-day work. If you’d like to build a complex industrial system which has to integrate many components into one easy-to-use and highly adaptable whole, and you think ROS might be the solution to some of your problems, be sure to reach out to us at
