Ryan Berger, Mitch Briles, Nader Bushehri, Nicholas Coughlin, Kait Lam, Nuno P. Lopes, Stefan Mada, Tanmay Tirpankar, John Regehr
LLVM's backends translate its intermediate representation (IR) to assembly or object code. Alongside register allocation and instruction selection, these backends contain many analogues of components traditionally associated with compiler middle ends: dataflow analyses, common subexpression elimination, loop invariant code motion, and a first-class IR—MIR, the "machine IR." In effect, this kind of compiler backend is a highly optimizing compiler in its own right, with all of the correctness hazards entailed by a million lines of intricate C++. As a step towards gaining confidence in the correctness of work done by LLVM backends, we have created ARM-TV, which formally verifies translations between LLVM IR and AArch64 (64-bit ARM) code. Ours is not the first translation validation work for LLVM, but we have advanced the state of the art along multiple fronts: ARM-TV is a checking validator that enforces numerous ABI rules; we have extended Alive2 (which we reuse as a verification backend) to deal with unstructured mixes of pointers and integers that are typical of assembly code; we investigate the tradeoffs between hand-written AArch64 semantics and those derived mechanically from ARM's published formal semantics; and, we have used ARM-TV to discover 46 previously unknown miscompilation bugs in this LLVM backend, most of which are now fixed in upstream LLVM.
R. Berger, M. Briles, N. Bushehri, N. Coughlin, K. Lam, N. P. Lopes, S. Mada, T. Tirpankar, J. Regehr. Translation Validation for LLVM's AArch64 Backend. Proc. of the ACM on Programming Languages, Oct. 2025.
@article{armtv-oopsla25, title = {Translation Validation for {LLVM}'s {AArch64} Backend}, author = {Ryan Berger and Mitch Briles and Nader Bushehri and Nicholas Coughlin and Kait Lam and Nuno P. Lopes and Stefan Mada and Tanmay Tirpankar and John Regehr}, journal = {Proc. of the ACM on Programming Languages}, number = {OOPSLA}, publisher = {ACM}, month = oct, year = 2025 }