The open source sv-bugpoint tool, which we described some time ago, facilitates SystemVerilog tool debugging by minimizing bug-inducing subsets of SystemVerilog designs. While sv-bugpoint is universal and can be used with any SystemVerilog tool, since its introduction it has been widely adopted by the Verilator community. There, it is used to automatically reduce SystemVerilog designs that trigger bugs in the Verilator RTL simulator, making those bugs much easier to identify and fix.
Recently, we’ve been optimizing sv-bugpoint by speculatively merging subsequent minimization attempts into larger groups verified in a single try. With this approach, we’re now able to decrease sv-bugpoint’s execution time by 2x to 5x on a representative suite of real-world test cases.
In this article we describe the methodology behind the optimizations and show how they can expedite and improve SystemVerilog tool debugging.
Speculative minimization
In a typical run, which we described in more detail in a previous article, sv-bugpoint spends most of its time checking whether a given minimization attempt can be applied or should be rolled back. To optimize this process, we take advantage of two of its properties:
- On average, it is much faster to recognize and reject (rollback) a failed reduction attempt than to fully confirm the correctness of a successful one.
- Successful reduction attempts (commits) often form longer runs (if the last few consecutive parts of the code did not contribute to bug reproduction, it is likely that the next one will not either).
Our optimization approach works by speculatively merging subsequent minimization attempts into a larger group verified in one try. When a speculation is correct, the number of costly commits decreases significantly. Incorrect speculations hurt performance by increasing the number of failed checks, but that cost is low compared to the gain. The diagram below shows the process of a step-by-step minimization flow:
Each capital letter corresponds to a syntax node that sv-bugpoint tries to reduce. The green rectangles show successful runs of the check script (commits), while the red ones mark failed attempts (rollbacks). The vertical lines visible in the bisection rows mark the bisection boundaries. The initial size of speculatively reduced groups is 4 nodes in every variant. All variants operate on the same set of nodes, and grouping doesn’t “unlock” new reduction opportunities in this case. Every node that wasn’t reduced by the baseline remains non-reducible by other variants.
In order to decide whether to combine nodes or not, a simple 1-bit prediction is used: if the last reduction attempt was successful, we “bet” that the next attempts will also succeed, and merge them. Otherwise, we perform minimizations one by one.
As can be seen in the step-by-step diagram, a simple 1-bit prediction can significantly decrease the count of successful invocations of the check script in the case of long runs of reducible nodes. However, the algorithm suffers when reducible and non-reducible nodes are interleaved or when the size of a reducible sequence is smaller than the size used in grouping (e.g. see nodes G, H, I, J). Additionally, the algorithm can detect long runs of failed attempts, in which case it gives up on grouping modifications together to prevent redundant rollbacks.
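The 1-bit prediction logic can be illustrated with a minimal Python sketch. The `check` callback below is a hypothetical stand-in for sv-bugpoint’s check script: it returns `True` when the bug still reproduces after the given group of nodes is removed. The actual tool is written in C++ and operates on syntax trees, so this only models the control flow:

```python
def minimize_with_prediction(nodes, check, group_size=4):
    """Remove as many nodes as possible, using a 1-bit predictor to decide
    whether to merge the next attempts into a single grouped check."""
    removed = []
    predict_success = False  # the 1-bit predictor state
    i = 0
    while i < len(nodes):
        # After a success, bet that the next attempts will also succeed
        # and verify a whole group in one try; otherwise go one-by-one.
        size = group_size if predict_success else 1
        group = nodes[i:i + size]
        if check(group):
            removed.extend(group)
            i += len(group)
            predict_success = True
        else:
            predict_success = False
            if len(group) == 1:
                i += 1  # a single irremovable node: skip over it
            # otherwise retry the same position one node at a time
    return removed
```

A long run of removable nodes is then confirmed with a single check-script invocation instead of one invocation per node, while failed group attempts fall back to one-by-one removal.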
To address 1-bit prediction issues, we use bisection: rather than falling back straight to one-by-one removal after a failed attempt, we gradually scale down the size of the merged group. Let’s take nodes H, I, J, K for example:
- Removal of the whole group fails, so we try to minimize the first two nodes.
- H, I removal succeeds. We know that the remaining part (J, K) contains at least one irremovable node, so it is pointless to try to remove them both at once.
- J removal succeeds; K can therefore be assumed irremovable and skipped without any extra attempt.
Bisection alone does relatively well in cases when a reducible sequence does not exactly match the initial grouping size. However, long runs of non-reducible nodes are not anticipated, which results in an increased number of check script invocations in such cases.
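The bisection strategy, including the two shortcuts walked through above (skipping the whole-group check for a half that is known to contain an irremovable node, and skipping the final single-node check), can be sketched as follows, again with a hypothetical `check` callback standing in for the check script:

```python
def bisect_group(nodes, check, removed, known_failing=False):
    """Try to remove `nodes` in one attempt; on failure, bisect.
    `known_failing` means the group is already known to contain at least
    one irremovable node, so its whole-group check can be skipped.
    Returns True when every node in `nodes` was removed."""
    if len(nodes) == 1:
        if known_failing:
            return False  # the sole candidate must be the irremovable one
        if check(nodes):
            removed.extend(nodes)
            return True
        return False
    if not known_failing and check(nodes):
        removed.extend(nodes)
        return True
    mid = len(nodes) // 2
    left_ok = bisect_group(nodes[:mid], check, removed)
    # If the left half was fully removed, the failure must come from the
    # right half, so its pointless whole-half check is skipped.
    right_ok = bisect_group(nodes[mid:], check, removed,
                            known_failing=left_ok)
    return left_ok and right_ok
```

On the H, I, J, K example (with K irremovable), this removes H, I and J in just three check invocations: the failed whole-group attempt, the successful H, I attempt and the successful J attempt.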
A hybrid variant combines most of the advantages of the other configurations, with prediction mitigating the poor behavior of plain bisection in the case of long non-reducible runs. For that reason, this is the approach we chose to implement in sv-bugpoint as the new default.
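Combining the prediction and bisection strategies described above yields the following self-contained Python sketch of the hybrid variant. The `check` callback is again a hypothetical stand-in for the check script, and the 32-node default group size mirrors the default mentioned below; the real implementation in sv-bugpoint is in C++:

```python
def hybrid_minimize(nodes, check, max_group=32):
    """Hybrid sketch: a 1-bit predictor chooses between single-node and
    grouped attempts, and failed groups are bisected instead of being
    retried one node at a time."""
    removed = []

    def bisect(group, known_failing=False):
        # Returns True when every node in `group` was removed.
        if len(group) == 1:
            if known_failing:
                return False  # the sole candidate must be irremovable
            if check(group):
                removed.extend(group)
                return True
            return False
        if not known_failing and check(group):
            removed.extend(group)
            return True
        mid = len(group) // 2
        left_ok = bisect(group[:mid])
        # A fully removed left half pins the failure on the right half,
        # so the right half's whole-group check can be skipped.
        right_ok = bisect(group[mid:], known_failing=left_ok)
        return left_ok and right_ok

    predict_success = False
    i = 0
    while i < len(nodes):
        # Only group attempts while the predictor bets on success,
        # which avoids redundant rollbacks on long non-reducible runs.
        size = max_group if predict_success else 1
        group = nodes[i:i + size]
        predict_success = bisect(group)
        i += len(group)
    return removed
```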
Identifying bugs in Verilator with optimized sv-bugpoint
To showcase the efficacy of the speculative minimization approach, we’ll use a few real world examples of bugs we identified with sv-bugpoint in Verilator (and later fixed), summarized in the table below:
| Bug | Source code | Fix |
|---|---|---|
| verilation_err | Caliptra RT | PR |
| sim_vcd_mismatch | Caliptra RT | PR |
| cpp_build_err | RISCV-DV | PR |
| sim_null_deref | UVM for Verilator | PR |
Using the hybrid speculative minimization configuration, which merges up to 32 reductions (the default in the optimized sv-bugpoint) into a single attempt based on the 1-bit prediction and bisection algorithms, we were able to speed up sv-bugpoint’s operation by 2x to 5x compared to the baseline.
Expanding the SystemVerilog tool ecosystem
The initial version of sv-bugpoint was created within the EU-funded TRISTAN project, which focuses on improving IP and tooling in the RISC-V ecosystem, and we have since used it in various customer and internal projects, such as optimizing LLVM for use with Verilator.
The optimized sv-bugpoint can improve debugging not only of Verilator, but also of other SystemVerilog tools such as OpenROAD or Synlig. As active members of the SV Tools Project within CHIPS Alliance, Antmicro continuously works on developing new and improving existing SystemVerilog tools to enable an open source-driven ASIC development workflow.
If you would like to learn more about Antmicro’s engineering services around developing, supporting and expanding SystemVerilog tools such as Verilator, reach out to us at contact@antmicro.com.
