Efficient OTA updates with delta packages in Remote Device Fleet Manager

Published:

Topics: Open cloud systems, Open source tools

Antmicro’s engineering services cover the entire cycle of product development – from choosing suitable hardware, through preparing BSPs (board support packages), to in-field deployment and fleet management. We help build complete AI-enabled products such as inspection drones, satellite on-board computers, portable measurement devices or multi-camera arrays. Linux is the predominant OS used for the more industrially-focused and headless devices (supplemented by RTOSs like Zephyr on supporting MCU nodes), but in many user-facing applications – smart TV, automotive, infotainment or vending machines – Android is also a popular choice. Regardless of the operating system used, in most cases robust Over-The-Air and remote control mechanisms using available means of data transfer (e.g. wireless connection, LTE Internet access) are needed to carry the device throughout its entire lifecycle – be it for system updates or updating ML models for edge AI devices.

Some time ago, we introduced the Remote Device Fleet Manager (RDFM) project for deploying OTA updates to edge devices, including platforms based on ARM and RISC-V, running Linux, Android and/or Zephyr. In this article, we will discuss how RDFM keeps such updates as small as possible thanks to delta updates, and how it copes with unsuccessful updates, e.g. due to a power loss during the update. We will also provide instructions for deployment and testing on a target platform or in Antmicro’s open source Renode simulation framework.

Illustration depicting delta updates

Embedded Linux and Android software updates

When it comes to Android, system partitions are by default immutable (read-only) and user data is stored on separate partitions.

Regular Linux distributions, however, are mutable, so the system is installed once and is subsequently modified by system package upgrades. This approach does not guarantee consistency between devices running the distro, especially if it was installed on one device recently and on another many years earlier.

Immutable images can be created by using reproducible build systems, such as Yocto or Buildroot, which can be run on Continuous Integration (CI) servers automatically during a software release cycle. This ensures that the devices are, provably, all running the same system. But modifying a running system from the inside is no longer possible, since we do not replace individual files and directories, but rather the image as a whole.

Fitting another system onto the device

As we described in detail in a previous blog note, there are several approaches to reproducible OTA system updates. Instead of using an external updater, we can put together an internal update system. Such a system would reside on a separate partition and be responsible for downloading a full OS image and streaming it directly to the main partition to save storage space and avoid copying data around. When the main system decides to update, it reboots into a helper system, passing the URL of the system image to be installed. Once successful, the new system is launched and, in case of a failure, the recovery environment is invoked. This is called an asymmetric partition layout.

Another option is to fit two equivalent partitions into the storage device, creating a symmetric partition layout. One partition is active (running) while the other is passive (inactive) and unused. Such a system can download an update into the inactive partition during normal operation and then reboot, reducing downtime to the necessary minimum. Naturally, this approach requires twice as much storage space, but this is arguably a small sacrifice for an efficient update methodology.
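The active/passive switching described above can be sketched as a small state machine. This is only an illustration under stated assumptions: the names `BootState`, `install_update`, `bootloader_pick` and `commit` are hypothetical, and the boot-counter fallback is a generic technique, not a description of RDFM's actual bootloader logic.

```python
from dataclasses import dataclass


@dataclass
class BootState:
    """Hypothetical persistent state shared by the bootloader and the OS."""
    active: str = "A"      # slot the bootloader will try to boot
    pending: bool = False  # True until a freshly installed system confirms itself
    bootcount: int = 0     # boot attempts made since the update was installed
    bootlimit: int = 1     # attempts allowed before rolling back


def other(slot: str) -> str:
    return "B" if slot == "A" else "A"


def install_update(state: BootState) -> None:
    """The image has been written to the passive slot; switch to it tentatively."""
    state.active = other(state.active)
    state.pending = True
    state.bootcount = 0


def bootloader_pick(state: BootState) -> str:
    """Runs on every boot; rolls back if the new slot never confirmed itself."""
    if state.pending:
        if state.bootcount >= state.bootlimit:
            # The new system did not commit in time: return to the old slot.
            state.active = other(state.active)
            state.pending = False
            state.bootcount = 0
        else:
            state.bootcount += 1
    return state.active


def commit(state: BootState) -> None:
    """Called by the new system once it has booted successfully."""
    state.pending = False
    state.bootcount = 0
```

In this sketch, a device that loses power or crashes before calling `commit` simply boots the previous slot on the next attempt, which is the property that makes the extra storage worthwhile.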

Incremental updates with RDFM

For the most part, Android supports both delta and full packages out of the box. RDFM can therefore be used to perform delta updates on embedded Android devices using the stock Android OTA packages by integrating them with RDFM’s management tools. In this post, however, we will focus on our Linux implementation, which required additional work.

The easiest and most common approach in embedded solutions when it comes to full system updates is to receive a compressed image of an entire system. In most cases it is not too troublesome since systems built with Yocto are minimal and tailored to their specific use cases.

However, with the growing number of features, also in edge AI solutions where the models included in the system can be quite large, updating an entire system image may become extremely slow and error-prone due to limited connectivity or low Internet speeds.

The good news is that in most system updates, the vast majority of the data contained in the new image is already present on the device’s main partition. Therefore, with access to the old image (e.g. identified by its checksum) both on the device and the server, we can prepare a delta update for the device, which will instruct it to take some parts of the new image out of its currently active image, and others from the delta update itself.

RDFM uses HTTPS polling for secure deployment, which means that no ports are open on the device during the update process. This helps ensure security and integrity of the update process. With support for delta updates, RDFM allows users to download just the changes that occurred between firmware versions, rather than entire firmware images.

Delta updates implementation

For each required pair of a base image and a new image, the server creates an on-demand delta update using the librsync-go library. To get an idea of how the library works internally, let us look into its flow and binary format, using a tool called rdiff (which ships with many distros).

The base image is converted into a ‘delta signature’ which is basically a series of sector checksums, where a sector is each 4 KiB portion of the file (this can be adjusted, but 4 KiB works fine for the demonstration). The delta signature is used in conjunction with the new image to create a delta update, which, along with the old image, is used to recreate the new partition image.

For this example, we will create two nearly identical ext4 partitions, each with a file named some-file. The contents of that file differ between the old partition and the new one.

+ truncate -s 2M partition.old
+ mkfs.ext4 -O ^64bit -O ^has_journal partition.old
+ cp -a partition.old partition.new
+ echo write some-file1 some-file | debugfs -w partition.old
debugfs 1.46.2 (28-Feb-2021)
Allocated inode: 12
+ echo write some-file2 some-file | debugfs -w partition.new
debugfs 1.46.2 (28-Feb-2021)
Allocated inode: 12

Creating a signature is as simple as calculating a ‘weak’ and a ‘strong’ checksum of each sector of the base image, where a weak 32-bit sum is computed with a simple sequence of multiplications and additions, and a strong sum is Blake2B-256, optionally cropped to its prefix, giving a maximum length of 32 bytes. The file begins with a header consisting of a 32-bit magic number followed by a 32-bit sector size and a 32-bit strong sum length. The rest of the file is a sequence of (weak,strong) pairs.

A hexdump of delta sig
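The signature layout described above can be sketched in a few lines of Python. This is a minimal illustration, not librsync's implementation: the weak sum below is a simple Adler-style stand-in for the real rolling checksum, and the big-endian field encoding and the `SIG_MAGIC` value are assumptions made for the example.

```python
import hashlib
import struct

BLOCK_SIZE = 4096       # sector size used in the demonstration
STRONG_LEN = 32         # full BLAKE2b-256 digest; may be cropped to a prefix
SIG_MAGIC = 0x72730137  # assumption: treat as a placeholder magic number


def weak_sum(block: bytes) -> int:
    """Adler-style 32-bit checksum built from additions and multiplications."""
    a = sum(block) & 0xFFFF
    b = sum((len(block) - i) * byte for i, byte in enumerate(block)) & 0xFFFF
    return (b << 16) | a


def strong_sum(block: bytes, length: int = STRONG_LEN) -> bytes:
    """BLAKE2b-256 digest of the block, cropped to its first `length` bytes."""
    return hashlib.blake2b(block, digest_size=32).digest()[:length]


def make_signature(base: bytes, block_size: int = BLOCK_SIZE,
                   strong_len: int = STRONG_LEN) -> bytes:
    """Header (magic, block size, strong sum length) followed by
    one (weak, strong) pair per block of the base image."""
    out = [struct.pack(">III", SIG_MAGIC, block_size, strong_len)]
    for offset in range(0, len(base), block_size):
        block = base[offset:offset + block_size]
        out.append(struct.pack(">I", weak_sum(block)))
        out.append(strong_sum(block, strong_len))
    return b"".join(out)
```

For a 2 MiB image and 4 KiB sectors this yields 512 pairs of 36 bytes each plus a 12-byte header, so the signature is far smaller than the image it describes.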

Generating a delta is the most complicated stage. First, the delta signature is parsed to form a mapping from weak sums to (strong,offset) pairs. Then a rolling weak checksum is kept (e.g. the first loop iteration knows the checksum of bytes 0..4095, the second of 1..4096, the third of 2..4097 etc.). If a matching sector is found at `<offset>` in the base image, a `COPY <offset> <length>` command is emitted. If it is not, a `LITERAL 1 b` command is emitted (where `b` is the first byte of the current sector) and the byte is skipped. Adjacent commands are grouped before being written to the delta file. The file starts with a 32-bit magic number, then has a series of `COPY` or `LITERAL` commands and ends with a termination command (a null byte).

A hexdump of delta
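The matching loop can be sketched as follows. This is a simplified illustration under stated assumptions: commands are kept as Python tuples rather than in the binary delta encoding, the weak sum is an Adler-style stand-in for librsync's rolling checksum, and the function names are hypothetical.

```python
import hashlib

BLOCK = 4096


def weak_sum(block):
    """Return the two 16-bit halves (a, b) of an Adler-style checksum."""
    a = sum(block) & 0xFFFF
    b = sum((len(block) - i) * x for i, x in enumerate(block)) & 0xFFFF
    return a, b


def strong_sum(block):
    return hashlib.blake2b(block, digest_size=32).digest()


def index_base(base, block=BLOCK):
    """Map weak sum -> list of (strong sum, offset) for each full base block."""
    table = {}
    for off in range(0, len(base) - block + 1, block):
        chunk = base[off:off + block]
        table.setdefault(weak_sum(chunk), []).append((strong_sum(chunk), off))
    return table


def make_delta(base, new, block=BLOCK):
    """Return a list of ("COPY", offset, length) / ("LITERAL", data) commands."""
    table = index_base(base, block)
    cmds, literal = [], bytearray()

    def flush_literal():
        if literal:
            cmds.append(("LITERAL", bytes(literal)))
            literal.clear()

    pos, a, b = 0, None, None
    while pos + block <= len(new):
        window = new[pos:pos + block]
        if a is None:
            a, b = weak_sum(window)  # (re)compute after a jump
        match = None
        for strong, off in table.get((a, b), ()):
            if strong == strong_sum(window):  # confirm the weak match
                match = off
                break
        if match is not None:
            flush_literal()
            prev = cmds[-1] if cmds else None
            if prev and prev[0] == "COPY" and prev[1] + prev[2] == match:
                cmds[-1] = ("COPY", prev[1], prev[2] + block)  # group adjacent
            else:
                cmds.append(("COPY", match, block))
            pos += block
            a = b = None
        else:
            out_byte = new[pos]
            literal.append(out_byte)  # one literal byte, grouped on flush
            pos += 1
            if pos + block <= len(new):
                # Roll the window by one byte instead of recomputing the sum.
                a = (a - out_byte + new[pos + block - 1]) & 0xFFFF
                b = (b - block * out_byte + a) & 0xFFFF
    literal.extend(new[pos:])  # trailing bytes shorter than one block
    flush_literal()
    return cmds
```

Inserting a few bytes in the middle of an image shifts all later data off the sector grid, and the byte-by-byte rolling window is what lets the sketch find those shifted sectors again.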

Reconstructing the new image from a delta is straightforward as well – it involves processing the commands in sequence and streaming their concatenated ‘meaning’ into the new image: a `COPY <offset> <length>` command decompresses to `<length>` bytes from `<offset>` in the base image, and a `LITERAL <length>` command decompresses to the following `<length>` bytes from the delta file itself. As you can see in the following listing, rdiff reconstructs the new image successfully.

+ /usr/bin/rdiff patch partition.old partition.new.delta partition.new.reconstr
+ sha1sum partition.new*
db2925156b1987fb652d20226621b1480c8d0516  partition.new
db2925156b1987fb652d20226621b1480c8d0516  partition.new.reconstr
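The reconstruction step is simple enough to sketch directly. The example below assumes the delta has already been parsed into Python tuples, `("COPY", offset, length)` and `("LITERAL", data)`; this representation and the function names are illustrative assumptions, not the binary format read by rdiff.

```python
import hashlib
import io


def apply_delta(base_file, commands, out_file):
    """Stream the result of each command into out_file, in order."""
    for cmd in commands:
        if cmd[0] == "COPY":
            _, offset, length = cmd
            base_file.seek(offset)                  # bytes come from the old image
            out_file.write(base_file.read(length))
        elif cmd[0] == "LITERAL":
            out_file.write(cmd[1])                  # bytes come from the delta
        else:
            raise ValueError(f"unknown command: {cmd[0]!r}")


def reconstruct(base: bytes, commands) -> bytes:
    """Convenience wrapper operating on in-memory images."""
    out = io.BytesIO()
    apply_delta(io.BytesIO(base), commands, out)
    return out.getvalue()


# Tiny demonstration: compare checksums, as the sha1sum listing does.
old_image = b"first content, shared tail"
new_image = b"other content, shared tail"
delta = [("LITERAL", b"other"), ("COPY", 5, len(old_image) - 5)]
rebuilt = reconstruct(old_image, delta)
print(hashlib.sha1(rebuilt).hexdigest() == hashlib.sha1(new_image).hexdigest())
```

Because the commands are processed strictly in order and only ever read from the base image, the output can be streamed straight to the target without buffering the whole image in memory.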

RDFM in meta-antmicro

Yocto is an open source build system for creating custom Linux distributions for a wide range of devices. As described in a previous blog post, we maintain our own Yocto layer called meta-antmicro, which contains configuration files for building reproducible embedded Linux system images along with various recipes for software and libraries. The meta-antmicro layer includes sublayers such as meta-rdfm and meta-rdfm-tegra, which provide support for the RDFM OTA tool for various platforms, including ARM-based NVIDIA Jetson ones.

Together, the meta-antmicro Yocto layer and its sublayers provide a powerful and flexible way to build custom BSPs that support OTA updates with RDFM and can be used on various platforms.

meta-antmicro
├── meta-rdfm
│   ├── classes
│   ├── conf
│   ├── README.md
│   ├── recipes-bsp
│   ├── recipes-core
│   ├── recipes-devtools
│   ├── recipes-rdfm
│   └── scripts
│
├── meta-rdfm-tegra
│   ├── classes
│   ├── conf
│   ├── README.md
│   ├── recipes-bsp
│   └── recipes-rdfm
│
└── meta-rdfm-...

Integrating RDFM with new target platforms

RDFM manages dual partition switching at the bootloader level and can be easily integrated with the U-Boot or GRUB bootloaders on multiple architectures such as ARM, x86 or RISC-V. Integrating RDFM into a new machine can typically be divided into the following steps:

  • Integrating RDFM code into the bootloader
  • Configuring the partition layout
  • Defining extended machine configurations.

The meta-antmicro layer and its dependencies come with most of the configuration files and patches needed for general board integration, but there may still be some specific configurations that need to be performed. For example, the RISC-V-based HiFive Unmatched U-Boot code requires a few features to be integrated into the include/configs/sifive-unleashed.h file in order for RDFM to function correctly:

#define CONFIG_BOOTCOUNT_LIMIT

/* This will store the U-Boot environment file on the memory card, 
 * before the first partition start */
#define CONFIG_ENV_IS_IN_MMC

These changes can later be formatted as a patch and applied in Yocto during the build routine.
In meta-rdfm, rdfm-part-images.bbclass is responsible for generating a proper kickstart (.wks) file with the following partition layout:

Device                                       Start      End  Sectors   Size Type
rdfm-image-minimal-unmatched.flash.sdimg1       34     2081     2048     1M HiFive FSBL
rdfm-image-minimal-unmatched.flash.sdimg2     2082    10273     8192     4M HiFive BBL
rdfm-image-minimal-unmatched.flash.sdimg3    16384    17407     1024   512K Microsoft basic data
rdfm-image-minimal-unmatched.flash.sdimg4    32768    33791     1024   512K Microsoft basic data
rdfm-image-minimal-unmatched.flash.sdimg5    49152   730725   681574 332,8M Microsoft basic data
rdfm-image-minimal-unmatched.flash.sdimg6   737280 11223039 10485760     5G Linux filesystem
rdfm-image-minimal-unmatched.flash.sdimg7 11223040 21708799 10485760     5G Linux filesystem
rdfm-image-minimal-unmatched.flash.sdimg8 21708800 21970943   262144   128M Linux filesystem

On HiFive Unleashed and HiFive Unmatched, the first two partitions are reserved by FSBL and BBL to start up the payload. The U-Boot environment is stored on the third partition and the fourth is used as a redundant one.
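As a quick sanity check, the sizes in the listing can be recomputed from the start and end sectors. The snippet below assumes 512-byte logical sectors, which is not stated in the listing itself; the `PARTITIONS` table simply repeats the values shown above.

```python
SECTOR = 512  # assumption: 512-byte logical sectors

# partition number -> (start sector, end sector), copied from the listing
PARTITIONS = {
    1: (34, 2081),
    2: (2082, 10273),
    3: (16384, 17407),
    4: (32768, 33791),
    5: (49152, 730725),
    6: (737280, 11223039),
    7: (11223040, 21708799),
    8: (21708800, 21970943),
}


def size_bytes(start: int, end: int) -> int:
    """Both the start and end sectors are inclusive."""
    return (end - start + 1) * SECTOR


for number, (start, end) in PARTITIONS.items():
    mib = size_bytes(start, end) / (1024 * 1024)
    print(f"partition {number}: {mib:.1f} MiB")
```

Under that assumption the two root filesystem partitions (6 and 7) are the same size, as the symmetric layout requires, and each U-Boot environment copy (3 and 4) occupies 512 KiB.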

The last partition is a persistent data partition reserved for user-defined files and RDFM temporary files. When using delta updates, the root filesystem is read-only to ensure it remains intact, while the data partition is mutable and shared between both root filesystems.

When building an RDFM Yocto image, custom files can be included in the data partition simply by installing them in the /data directory under the root partition. Note that files added to the /data directory are not included in .rdfm artifacts, since these artifacts do not contain a data partition. Only complete partitioned images (.flash.sdimg) will contain the files.

do_install() {
    install -d ${D}/data
    install -m 0644 <file> ${D}/data/
}

Additional Yocto configurations or features unique to a particular machine should be added into separate configuration files in the conf/machine/ directory and can later be used in the local.conf file via the require statement.

RDFM/meta-antmicro support for HiFive Unmatched

Several OTA update frameworks are available for different architectures, but few provide support for RISC-V-based devices. Nevertheless, SiFive’s HiFive family boards are widely supported in Yocto by the meta-riscv and meta-sifive layers, and recently we have added support for RDFM Over-The-Air updates for HiFive Unmatched in the kirkstone release in the meta-rdfm Yocto layer.

In the system-releases directory in the kirkstone Yocto release, we have added a demonstration release called rdfm-unmatched-demo for the RISC-V HiFive Unmatched board, which can be used to build BSPs, deploy them and perform OTA updates. More details regarding board-specific instructions can be found in the project’s README.md.

Delta package updates on HiFive Unmatched in Renode simulation

You can run an OTA update demo on the HiFive Unmatched board by following the instructions in the README.md.

To run the demo on a board simulated in Renode, refer to the README.md in the /renode subdirectory.

In the clip below, you can see a simulated HiFive Unmatched board updated with a delta package:

Deploy secure and efficient OTA updates with RDFM

With a diverse portfolio of open source tools, including, next to RDFM and Renode, the interactive System Designer and the Kenning ML framework, Antmicro offers not only comprehensive fleet management services with OTA delta updates for a multitude of edge AI platforms, but also CI integration, AI deployment and optimization, and more.

If you’re interested in adopting our OTA update methodology for your existing device fleet, or would like to benefit from professional end-to-end design and implementation services for AI-enabled devices, don’t hesitate to contact us at contact@antmicro.com.
