Automated and standardized software benchmarking with Benchalot
Topics: Open source tools
The growing complexity of the hardware on which Antmicro helps its customers build and deploy software workloads requires continuous benchmarking and optimization to track, understand and fix performance bottlenecks. In our work with tools such as Verilator or OpenROAD, we are on a never-ending quest to reduce memory usage, decrease execution time and, ultimately, improve the productivity of our customers’ silicon and software teams.
Benchmarking software often involves comparing multiple commits under varying input parameters, which is typically done manually or by writing dedicated scripts. To automate this time-consuming task and shorten optimization and debugging turnaround, we created Benchalot, a configurable, universal CLI tool for running and analyzing benchmarks. Benchalot lets developers specify a matrix of parameters they wish to iterate over, uses this data to create a set of benchmarks, then runs them and visualizes the results.
In the article below, we go into detail about Benchalot’s features, describe how to configure and use the tool, and cover the output formats it offers for aggregating, visualizing and analyzing results.
Customizable benchmarks with Benchalot
Benchalot allows the user to specify a matrix of parameters that are then used to automatically create multiple benchmarks. Each benchmark is then executed, producing results which can be aggregated and visualized.
Benchalot is configured using YAML files, such as the one shown below:
matrix:
  version: [v1.0, v1.1, v1.2]
  input: [data1, data2, data3]
prepare:
  - build {{version}}
benchmark:
  - run {{input}}
This configuration file will result in 9 benchmarks, and Benchalot will measure the execution time of each run command:
build v1.0
run data1
build v1.0
run data2
build v1.0
run data3
build v1.1
run data1
build v1.1
run data2
build v1.1
run data3
build v1.2
run data1
build v1.2
run data2
build v1.2
run data3
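Conceptually, the matrix expansion is a Cartesian product of the parameter lists, so the schedule above is equivalent to the nested shell loop below (build and run stand in for the placeholder commands from the example config):

for version in v1.0 v1.1 v1.2; do
  for input in data1 data2 data3; do
    build "$version"   # prepare step, executed before each benchmark
    run "$input"       # benchmark step, whose execution time is measured
  done
done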
Benchalot can also automatically create different output formats, such as Markdown and HTML tables, scatter plots, box plots, violin plots and bar charts. You can enable this by adding a results section to the configuration file, as in the example below showing how to create a bar chart:
results:
  plot:
    filename: "plot.png"
    format: "bar"
    x-axis: version
    facet: input
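Several outputs can be described in one configuration; below is a minimal sketch that combines the bar chart above with the Markdown table format used later in this article (assuming, as those examples suggest, that the results section accepts multiple entries):

results:
  plot:
    filename: "plot.png"
    format: "bar"
    x-axis: version
    facet: input
  table:
    format: "md"
    filename: "results.md"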
Benchalot also provides more advanced features, such as:
- system setup - Benchalot can change system options to reduce variance between time measurements
- custom metrics - metrics can be defined by any command
- compound variables - matrix variables can contain nested fields, allowing finer control over creating benchmarks
- stages - benchmarks can be divided into stages, with measurements for each stage gathered separately
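For instance, the stages feature turns the benchmark section into named sub-sections that are timed separately, as the Verilator configuration later in this article demonstrates; a minimal sketch with hypothetical build and test commands:

benchmark:
  build:
    - make
  test:
    - ./run_tests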
To learn more about Benchalot’s features, refer to the README.
Benchmarking Verilator
To illustrate Benchalot’s features, we created a demo showcasing how it can be used to benchmark Verilator, a popular open source RTL simulator which Antmicro is actively contributing to.
To reproduce the demo, first clone the Cores-VeeR-EL2 repository which we will use as a test subject:
git clone --recursive https://github.com/chipsalliance/Cores-VeeR-EL2.git
Next, clone the Verilator repository and build verilator:
git clone --recursive https://github.com/verilator/verilator.git
cd verilator
autoconf
./configure
make -j`nproc`
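Optionally, you can confirm the build succeeded by printing the version (--version is a standard Verilator flag; this check is not part of the original demo):

./bin/verilator --version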
Then create the config.yml file:
matrix:
  table: ["-fno-table", "-ftable"]
  const: ["-fno-const", "-fconst"]
  inline: ["-fno-inline", "-finline"]
  gate: ["-fno-gate", "-fgate"]
env:
  BUILD_PATH: snapshots/default
  RV_ROOT: $HOME/Cores-VeeR-EL2
  VERILATOR: $HOME/verilator/bin/verilator
  BUILD_DIR: build
cwd: $BUILD_DIR
setup:
  - $RV_ROOT/configs/veer.config -target=default -iccm_enable=1
prepare:
  - ccache -C
benchmark:
  verilation:
    - cat ../default_args | envsubst | xargs $VERILATOR {{table}} {{const}} {{inline}} {{gate}}
  compilation:
    - cp $RV_ROOT/testbench/test_tb_top.cpp obj_dir/
    - make -e -C obj_dir -f Vtb_top.mk VM_PARALLEL_BUILDS=1
  simulation:
    - cp $RV_ROOT/testbench/hex/user_mode0/cmark_iccm.hex program.hex
    - ./obj_dir/Vtb_top --test-halt
conclude:
  - rm -r console.log exec.log obj_dir program.hex trace_port.csv
results:
  table:
    format: "md"
    filename: "summary.md"
    pivot: "{{stage}} [s]"
To start the benchmarks, run:
mkdir build
benchalot config.yml
Benchalot will test Verilator’s performance by measuring how much time it takes to perform the verilation, compilation and simulation steps for VeeR-EL2 with different combinations of flags (table, const, inline and gate) which disable or enable different internal optimization stages. The results will then be aggregated in a Markdown table.
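Given the pivot setting, each row of the summary table corresponds to one flag combination and each stage gets its own timing column; the sketch below shows the layout we would expect (an assumption based on the configuration, with the measured values elided):

| table | const | inline | gate | verilation [s] | compilation [s] | simulation [s] |
|-------|-------|--------|------|----------------|-----------------|----------------|
| ...   | ...   | ...    | ...  | ...            | ...             | ...            |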
This run produced the results discussed below.
Based on the results we can assess that optimization flags have a significant impact on compilation and simulation times. Depending on the selected optimization configuration, simulation can take anywhere from around 8 seconds to over 100. According to the table, verilation took the least time when only the -fgate flag was enabled, as there was less optimization work to do. However, disabling all optimizations makes it slightly slower, likely due to operating on a more complex AST. When both the -fgate (gate elimination) and -finline flags were enabled, compilation and simulation were the fastest, likely due to -finline creating new opportunities for gate elimination. -fconst (constant propagation) showed no significant effect on any of the times when combined with -fgate.
Using Benchalot, we were able to run these 16 benchmarks automatically, and we can easily modify the configuration file to include different Verilator versions or flags.
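As one illustration (our own extension, not part of the demo), a thread-count axis could be added to the matrix using Verilator’s standard --threads option, with {{threads}} appended to the verilation command; this would double the sweep to 32 benchmarks:

matrix:
  # added axis (our illustration); --threads is a standard Verilator option
  threads: ["--threads 1", "--threads 2"]
  table: ["-fno-table", "-ftable"]
  const: ["-fno-const", "-fconst"]
  inline: ["-fno-inline", "-finline"]
  gate: ["-fno-gate", "-fgate"]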
Toolchain optimization with Antmicro
Tools like Benchalot help Antmicro and its customers standardize and automate the creation of reproducible benchmarking setups, which is especially important when dealing with large and time-consuming designs, e.g. for performance assessment in the context of scalability or for improving modeling accuracy and precision in tools like Verilator. Based on reliable data, we can tailor specific toolchains to the advanced use cases of our clients, be it hardware, ML or software related.
If you are developing a user-facing toolchain for your silicon device or using a complex toolchain in your own development which you believe could be improved using a data-driven approach, don’t hesitate to reach out to us at contact@antmicro.com, and visit our interactive offer portal to learn more about our engineering services.