Taming Undefined Behavior in LLVM

Juneyoung Lee, Yoonseung Kim, Youngju Song, Chung-Kil Hur, Sanjoy Das, David Majnemer, John Regehr, Nuno P. Lopes

Abstract:

A central concern for an optimizing compiler is the design of its intermediate representation (IR) for code. The IR should make it easy to perform transformations, and should also afford efficient and precise static analysis.
In this paper we study an aspect of IR design that has received little attention: the role of undefined behavior. The IR for every optimizing compiler we have looked at, including GCC, LLVM, Intel's, and Microsoft's, supports one or more forms of undefined behavior (UB), not only to reflect the semantics of UB-heavy programming languages such as C and C++, but also to model inherently unsafe low-level operations such as memory stores and to avoid over-constraining IR semantics to the point that desirable transformations become illegal. The current semantics of LLVM's IR fails to justify some cases of loop unswitching, global value numbering, and other important "textbook" optimizations, causing long-standing bugs.
We present solutions to the problems we have identified in LLVM's IR and show that most optimizations currently in LLVM remain sound, and that some desirable new transformations become permissible. Our solutions do not degrade compile time or performance of generated code.

Published:

J. Lee, Y. Kim, Y. Song, C.-K. Hur, S. Das, D. Majnemer, J. Regehr, N. P. Lopes. Taming Undefined Behavior in LLVM. In Proc. of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), June 2017.

Bibtex:

@inproceedings{undef-pldi17,
  title =	{Taming Undefined Behavior in {LLVM}},
  author =	{Juneyoung Lee and Yoonseung Kim and Youngju Song and Chung-Kil Hur and Sanjoy Das and David Majnemer and John Regehr and Nuno P. Lopes},
  booktitle =	{Proc. of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI)},
  doi =		{10.1145/3062341.3062343},
  month =	jun,
  year =	2017
}
