Scalerunner: open source compute cluster

Published:

Topics: Open hardware, Open cloud systems

With Moore’s law no longer to be taken for granted, there’s a real need to find new ways to scale compute capability to keep up with ever-increasing demand. Antmicro is helping customers tackle this problem on multiple levels: developing distributed edge computing systems that bring processing closer to the data sources, and building new collaborative methodologies and open source building blocks for ASIC and FPGA as part of RISC-V and CHIPS Alliance. There is also the most obvious method - scaling horizontally by throwing more machines at the problem.

Antmicro’s projects often require running complex parallelized CI workloads which, on top of significant computational resources, also need a great deal of flexibility in terms of management, deployment and availability. In order to address this, Antmicro’s cloud infrastructure and hardware teams have been working on scalable clusters built with COTS components and custom open source hardware, such as the Scalenode baseboard or the FPGA-based BMC. Creating open source hardware and software building blocks allows Antmicro’s teams to build setups which can be freely reproduced both internally and for customers with the customizations required for their specific use case. Those building blocks jointly constitute Scalerunner, Antmicro’s open source compute cluster project, which will be described in this note.

Scalerunner hardware overview

The hardware architecture of the Scalerunner cluster is composed of a coordinator unit which orchestrates the processing between multiple Scalenodes. A single Scalenode is a compute node executing specific tasks such as software compilation or testing, and exists in many variants spanning the ARM, x86 and RISC-V architectures in Antmicro’s server rooms. Within one cluster, some nodes will play the role of coordinator, and nodes of different types can be used together - in one of the setups photographed below, the Scalenode is an ARM-based node, while the coordinator is an x86-64 machine.
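The coordinator/Scalenode split described above can be illustrated with a minimal sketch. It is not Antmicro’s orchestration software: the `Node` and `Coordinator` classes, the architecture labels and the node names are all hypothetical, and the only idea taken from the text is that one coordinator hands work to nodes of mixed architectures.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """A compute node (hypothetical model of a Scalenode)."""
    name: str
    arch: str  # assumed labels, e.g. "arm64", "x86_64", "riscv64"


@dataclass
class Coordinator:
    """Hypothetical coordinator tracking idle nodes per architecture."""
    idle: dict = field(default_factory=lambda: defaultdict(deque))

    def register(self, node: Node) -> None:
        # A node that has booted (or finished a job) becomes available.
        self.idle[node.arch].append(node)

    def dispatch(self, arch: str) -> Optional[Node]:
        # Hand out the longest-idle node of the requested architecture,
        # or None when no such node is currently free.
        queue = self.idle.get(arch)
        if not queue:
            return None
        return queue.popleft()
```

A job that needs an ARM environment would call `dispatch("arm64")`; when the job finishes, the node is passed back to `register` so it can take the next one.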

This concept can be easily implemented with multiple hardware platforms. Since a very common use case in Antmicro’s work involves massive parallel builds of ARM-based software, one of the recently deployed variants uses the Raspberry Pi 4 platform. The boards are packed into custom-designed, passively cooled enclosures which can be easily stacked into multiple rows.

In this version of the Scalerunner cluster, each enclosure includes a Raspberry Pi 4 with a PoE HAT and a custom-designed LED indicator which can represent the current load and/or temperature of the node. The passively cooled, stackable enclosures for Raspberry Pi 4 were recently released on GitHub.
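How load and temperature might be folded into a single indicator value can be sketched as follows. The note does not describe how the LED is driven, so everything here is an assumption: the 8-segment scale, the 40 °C and 80 °C thresholds, and the choice to show whichever of the two readings is higher. `read_node_state` relies on `os.getloadavg()` and the standard Linux thermal sysfs file, which reports millidegrees Celsius.

```python
import math
import os


def clamp01(x: float) -> float:
    """Limit a value to the range 0.0..1.0."""
    return min(max(x, 0.0), 1.0)


def load_fraction(load_1min: float, cpu_count: int) -> float:
    """1-minute load average relative to the number of cores."""
    return clamp01(load_1min / cpu_count)


def temperature_fraction(temp_c: float, idle_c: float = 40.0,
                         max_c: float = 80.0) -> float:
    """Position of the temperature between assumed idle and maximum values."""
    return clamp01((temp_c - idle_c) / (max_c - idle_c))


def led_segments(load_frac: float, temp_frac: float, segments: int = 8) -> int:
    """Number of LED segments to light: the higher of the two readings wins."""
    return math.ceil(max(load_frac, temp_frac) * segments)


def read_node_state():
    """Read (load, temperature in Celsius) on a Linux node."""
    load_1min = os.getloadavg()[0]
    with open("/sys/class/thermal/thermal_zone0/temp") as f:
        temp_c = int(f.read().strip()) / 1000.0
    return load_1min, temp_c
```

With these assumed thresholds, a four-core node with a load average of 2.0 at 50 °C lights 4 of 8 segments, because the load reading (0.5) is higher than the temperature reading (0.25).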

The enclosure occupies 2U in a rack-mounted setup. It uses 35 mm DIN rails for easy snap-on mounting, which makes it compatible with a variety of rack-mounted equipment. A single row in a 19” rack holds 14 compute nodes. Rows can be stacked one behind another to get a complete “farm” of 42 units.
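The capacity figures above work out as simple arithmetic. The sketch below takes the 14-nodes-per-row figure from the text; that a 42-unit farm is three rows deep is an inference from 42 / 14, and the function names are illustrative only.

```python
import math

NODES_PER_ROW = 14  # from the text: one row of a 19" rack


def farm_capacity(rows: int, per_row: int = NODES_PER_ROW) -> int:
    """Total nodes when `rows` rows are stacked one behind another."""
    return rows * per_row


def rows_needed(nodes: int, per_row: int = NODES_PER_ROW) -> int:
    """Smallest number of rows that can hold `nodes` nodes."""
    return math.ceil(nodes / per_row)
```

Here `farm_capacity(3)` gives the 42 units quoted above, and `rows_needed(15)` shows that a fifteenth node already requires a second row.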

Scalerunner running on a cluster of ARM-based RPi Scalenodes

To squeeze in even more ARM-based compute nodes, Antmicro’s hardware team designed the Scalenode CM4 baseboard for another configuration of the Scalerunner cluster. It is a dedicated PCB targeting a height of 1U, described in more detail in a previous blog note. The baseboard is currently used predominantly with ARM-based modules (in this particular setup, a Raspberry Pi Compute Module 4), but it can also work with other SoMs in this format, like the RISC-V based ARVSOM.

The Scalenode baseboard features an expansion connector which can be used to interface with external development kits in order to create an automated hardware-in-the-loop setup. For dogfooding the project, Antmicro’s FPGA tooling team created such a setup for f4pga-examples, part of the landmark F4PGA project. By tapping into physical hardware, the team is able to reliably verify FPGA bitstreams produced in CI pipelines.

Both of the compute nodes mentioned above use PoE as the main power source. This simplifies installation and maintenance, since a single node requires only one cable for power and data.

To manage network traffic, measure energy consumption and power-cycle nodes individually, the clusters are deployed together with a fully managed PoE network switch. This lets the infrastructure team monitor the power consumption of specific nodes and selectively power-cycle them.
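As a rough illustration of what per-port power data makes possible, the sketch below flags ports whose draw is suspiciously low, which could indicate a node that is off or hung and is a candidate for power-cycling. It does not talk to any real switch: the port names, the wattage readings and the threshold are hypothetical, and how the readings are obtained from the switch is left out entirely.

```python
from typing import Dict, List


def total_power_w(readings_w: Dict[str, float]) -> float:
    """Sum of the per-port power draw, in watts."""
    return sum(readings_w.values())


def ports_to_power_cycle(readings_w: Dict[str, float],
                         min_expected_w: float) -> List[str]:
    """Ports drawing less than a running node is expected to draw."""
    return sorted(port for port, watts in readings_w.items()
                  if watts < min_expected_w)


# Hypothetical readings, for illustration only.
example_readings = {"port1": 4.5, "port2": 0.5, "port3": 5.0}
```

With the example readings and an assumed threshold of 1.0 W, only `port2` would be flagged.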

Scalerunner running on a cluster of x86-based NUC Scalenodes

Software architecture overview

Much like other custom systems Antmicro implements in commercial projects, the Scalerunner cluster uses open source software, starting with the operating system. At the core of the clusters there is a tailored Linux distribution built from source which boots in seconds, serving both as the environment for the coordinator and as a hypervisor on each of the compute nodes within a single system. In the case of Scalenodes, the operating system is fetched from the coordinator and loaded into RAM via the Preboot Execution Environment (PXE). This lets the cluster automatically provision and configure a large number of nodes by simply updating the operating system image on the coordinator machine.
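The provisioning model described above, where every node picks up whatever image the coordinator currently serves, can be simulated in a few lines. This is a toy model rather than a PXE implementation: `ImageServer` and `NetbootNode` are hypothetical names, the image is a placeholder byte string, and the SHA-256 digest is only used here to identify which image a node is running.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class ImageServer:
    """Stands in for the coordinator serving the OS image over the network."""
    image: bytes

    def publish(self, image: bytes) -> None:
        # Updating this single image is the only provisioning step.
        self.image = image

    def fetch(self) -> bytes:
        return self.image


@dataclass
class NetbootNode:
    """Stands in for a node that keeps its OS in RAM only."""
    name: str
    running_digest: str = ""

    def boot(self, server: ImageServer) -> None:
        # Nothing persists across boots: the node always runs the image
        # the server offers at boot time.
        image = server.fetch()
        self.running_digest = hashlib.sha256(image).hexdigest()
```

In this model, calling `publish` with a new image changes nothing on nodes that are already running; each node switches to the new image the next time it boots.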

Diagram depicting Scalerunner’s architecture

Thanks to its ephemeral nature, the environment helps to ensure hermetic builds and removes the burden of managing each node individually. All the tools required to perform the computation are instantly available and are tightly coupled to the single-file operating system image.

Furthermore, the Scalerunner cluster also features custom orchestration software enabling Antmicro to harness the compute power of the clusters to execute highly isolated and flexible software CI pipelines within various Continuous Integration systems. This includes not only Antmicro’s internal cloud CI and version control system (of course based on open source), but also external solutions such as GitHub Actions.

Each job is executed in a container on the Linux kernel running in a KVM-backed virtual machine. Apart from ensuring proper resource isolation, this also provides a great deal of flexibility in terms of choosing the userspace - it’s just a matter of picking the right container image. This intuitive and predictable environment makes for a pleasant development experience and enables fast iteration and collaboration.
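The point that the userspace is chosen purely through the container image can be shown with a small sketch. The note does not name the container runtime, so the `docker run --rm` command line below is an assumption, as are the `Job` class and the example image names; in the setup described above, such a command would run inside the KVM-backed virtual machine rather than directly on the node.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Job:
    """Hypothetical CI job: a command plus the image providing its userspace."""
    image: str
    command: Tuple[str, ...]


def container_argv(job: Job, runtime: str = "docker") -> List[str]:
    """Command line that runs the job in a throwaway container."""
    return [runtime, "run", "--rm", job.image, *job.command]


# The same command in two different userspaces: only the image changes.
debian_job = Job("debian:bookworm", ("make", "test"))
alpine_job = Job("alpine:3.19", ("make", "test"))
```

Both jobs produce identical command lines apart from the image argument, which is all that needs to change to run the same pipeline step in a different distribution.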

Comprehensive server solutions

Built with scalability and openness in mind, Scalerunner extends Antmicro’s portfolio of hybrid datacenter solutions. By connecting various open source pieces of software and adding a sprinkle of custom hardware, we ensure that the products we build are future-proof and, thanks to their modularity, can adapt to changing requirements. With software, hardware, tooling, infrastructure and AI teams working in close collaboration, Antmicro offers end-to-end services ranging from software development to massive hybrid CI systems, including highly automated hardware-in-the-loop setups.

If you are interested in full-stack open source solutions for your next generation cloud infrastructure, don’t hesitate to reach out at contact@antmicro.com.
