Translation Validation for LLVM's AArch64 Backend

Ryan Berger, Mitch Briles, Nader Bushehri, Nicholas Coughlin, Kait Lam, Nuno P. Lopes, Stefan Mada, Tanmay Tirpankar, John Regehr

 

Abstract:

LLVM's backends translate its intermediate representation (IR) to assembly or object code. Alongside register allocation and instruction selection, these backends contain many analogues of components traditionally associated with compiler middle ends: dataflow analyses, common subexpression elimination, loop invariant code motion, and a first-class IR—MIR, the "machine IR." In effect, this kind of compiler backend is a highly optimizing compiler in its own right, with all of the correctness hazards entailed by a million lines of intricate C++. As a step towards gaining confidence in the correctness of work done by LLVM backends, we have created ARM-TV, which formally verifies translations between LLVM IR and AArch64 (64-bit ARM) code. Ours is not the first translation validation work for LLVM, but we have advanced the state of the art along multiple fronts: ARM-TV is a checking validator that enforces numerous ABI rules; we have extended Alive2 (which we reuse as a verification backend) to deal with unstructured mixes of pointers and integers that are typical of assembly code; we investigate the tradeoffs between hand-written AArch64 semantics and those derived mechanically from ARM's published formal semantics; and, we have used ARM-TV to discover 46 previously unknown miscompilation bugs in this LLVM backend, most of which are now fixed in upstream LLVM.

 

Published:

R. Berger, M. Briles, N. Bushehri, N. Coughlin, K. Lam, N. P. Lopes, S. Mada, T. Tirpankar, J. Regehr. Translation Validation for LLVM's AArch64 Backend. Proc. of the ACM on Programming Languages, Oct. 2025.

 

Download:

 

Bibtex:

@article{armtv-oopsla25,
  title =	{Translation Validation for {LLVM}'s {AArch64} Backend},
  author =	{Ryan Berger and Mitch Briles and Nader Bushehri and Nicholas Coughlin and Kait Lam and Nuno P. Lopes and Stefan Mada and Tanmay Tirpankar and John Regehr},
  journal =	{Proc. of the ACM on Programming Languages},
  number =	{OOPSLA},
  publisher =	{ACM},
  month =	oct,
  year =	2025
}

<-- Return