Detection and tracking dataset generator


Topics: Edge AI, Open machine vision

Antmicro offers comprehensive system design services comprising software and hardware development for AI-capable vision systems which perform complex tasks in industrial, security, smart city and other scenarios. To build them, we use open source, permissively licensed solutions that give our customers full control over the final product, such as our open hardware baseboards for the most popular AI modules, e.g. the Jetson Nano / Xavier NX Baseboard, Apalis Smart Vision Baseboard, UltraScale+ Processing Module, Google Coral Baseboard or TX2 Deep Learning Platform, as well as a range of expansion boards and accessories, all of which can be customized to meet specific needs of particular projects.
As the selection of platforms above suggests, most of the custom products we help build involve some form of AI. A recurring need when building concrete AI applications is managing the datasets that enable specific use cases, which often leads us to create new tools for repetitive tasks. One such example is the open source detection and tracking dataset generator that we’ve developed to significantly simplify the process of annotating datasets for training deep neural networks.

Figure: A dog in a bounding box tracked by the AI algorithm

Automating dataset generation

Datasets for common objects are often limited to non-commercial use, while those for non-standard objects are few and far between. You may therefore need to create your own dataset, which, apart from recording footage, involves the tedious process of manual annotation. We decided to automate this process by developing an easy-to-use object tracking and detection dataset generator based on GOTURN, a deep learning based object tracking algorithm that is an interesting project in its own right. Originally implemented in Caffe and subsequently ported to the OpenCV Tracking API, GOTURN, unlike most real-time trackers, doesn’t rely on online learning; instead, it is trained offline on thousands of video sequences to learn how objects move.
In terms of directory structure and file format, the datasets produced by our generator follow the Amsterdam Library of Ordinary Videos (ALOV) dataset, an extensive library of videos that aims to cover a wide and diverse range of scenarios and circumstances, such as changing illumination, transparency, possible confusion with other objects, low contrast, zoom, etc.
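
To give a feel for what an ALOV-style annotation looks like, here is a minimal sketch of a parser for a single annotation line. It assumes the layout commonly used when reading ALOV annotation files: a frame number followed by four (x, y) corner points. The names `Box` and `parse_alov_line` are hypothetical, and the generator's actual output may differ in detail.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box for one annotated frame (illustrative type)."""
    frame: int
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def parse_alov_line(line: str) -> Box:
    """Parse one ALOV-style annotation line into an axis-aligned box.

    Assumed layout: "<frame> x1 y1 x2 y2 x3 y3 x4 y4", i.e. a frame
    number followed by the four corners of the annotated rectangle.
    """
    fields = line.split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}: {line!r}")
    frame = int(fields[0])
    coords = [float(value) for value in fields[1:]]
    xs, ys = coords[0::2], coords[1::2]  # even positions are x, odd are y
    # Taking min/max makes the result independent of the corner order.
    return Box(frame, min(xs), min(ys), max(xs), max(ys))
```

Reducing the four corners to a min/max box keeps the sketch independent of the order in which the corners are listed.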

Ease of use and high accuracy

The dataset generator splits the video sequence into .jpg frames and launches a GUI showing the first frame, which contains the object to be tracked. Selecting the object initializes the GOTURN tracker, which then moves on to the next frame and suggests a bounding box for the object’s new location; you can easily adjust the box if it is not correct. The tracker is highly accurate, which radically shortens the whole annotation process, with very few manual corrections needed.
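
The suggest-and-correct workflow described above can be sketched roughly as follows. This is not the generator's actual code: `annotate_sequence` and the `review` callback are hypothetical names, and the tracker is assumed to be any object exposing `init(frame, box)` and `update(frame)` in the style of the OpenCV Tracking API (for example, one created with `cv2.TrackerGOTURN_create()`, which additionally needs the GOTURN model files).

```python
from typing import Callable, Iterable, List, Optional, Tuple

BBox = Tuple[int, int, int, int]  # (x, y, width, height), OpenCV tracker style


def annotate_sequence(
    frames: Iterable[object],
    initial_box: BBox,
    tracker,
    review: Callable[[int, object, Optional[BBox]], BBox],
) -> List[BBox]:
    """Propagate a box through a sequence, letting the user fix each proposal.

    `review` stands in for the GUI step: it receives the frame index, the
    frame and the tracker's proposal (None if tracking failed) and returns
    the box to store for that frame.
    """
    iterator = iter(frames)
    try:
        first_frame = next(iterator)
    except StopIteration:
        raise ValueError("the frame sequence is empty") from None

    # The user-selected box on the first frame initializes the tracker.
    tracker.init(first_frame, initial_box)
    boxes: List[BBox] = [initial_box]

    for index, frame in enumerate(iterator, start=1):
        ok, suggested = tracker.update(frame)
        proposal = tuple(int(v) for v in suggested) if ok else None
        # The reviewer either accepts the proposal or returns a corrected box.
        boxes.append(review(index, frame, proposal))
    return boxes
```

A real tool might also re-initialize the tracker after a manual correction; the sketch leaves that out to keep the control flow visible.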


As part of the same effort, we have created an alov-to-yolo script that converts object tracking datasets into object detection datasets; the images and annotations that make up the detection dataset can also be used to train a classifier.
If you are building an AI-capable device to streamline and automate operations in your business, or to offer new capabilities to your customers and users, don’t hesitate to reach out to us to find out how we can help you achieve your goals. Antmicro offers flexible hardware, software and AI engineering services and builds tools to help businesses move faster in the edge AI space.
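
The core of such a conversion is turning a pixel-space bounding box into a YOLO label line, where a class index is followed by the box center and size, each normalized by the image dimensions. The sketch below illustrates that arithmetic only; `to_yolo_line` is a hypothetical helper and does not claim to mirror the alov-to-yolo script itself.

```python
def to_yolo_line(
    class_id: int,
    x_min: float,
    y_min: float,
    x_max: float,
    y_max: float,
    img_w: int,
    img_h: int,
) -> str:
    """Convert a pixel-space box into a YOLO label line.

    YOLO labels use "<class> <x_center> <y_center> <width> <height>",
    with all four geometric values expressed as fractions of the image size.
    """
    if img_w <= 0 or img_h <= 0:
        raise ValueError("image dimensions must be positive")
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Example: a 100x200 px box with top-left corner (10, 20) in a 200x400 image.
print(to_yolo_line(0, 10, 20, 110, 220, 200, 400))
# prints "0 0.300000 0.300000 0.500000 0.500000"
```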
