LLVM debugging playbook
This page aims to collect notes on how to debug or reduce issues that appear to arise from within LLVM itself and how to generate useful LLVM bug reports.
This guide contains platform-independent notes applicable to both CPU and GPU compilation. Additional GPU-specific notes (such as how to perform binary substitutions in an AMD GPU context) are contained within the GPU debugging playbook.
Generating LLVM IR
When bisecting, reducing, or debugging an issue that might manifest within LLVM, it can be helpful to pass the --iree-hal-dump-executable-intermediates-to=[directory] flag (or the more general --iree-hal-dump-executable-files-to=[directory] flag) to iree-compile or iree-opt. These flags cause IREE to write the compiled LLVM module to the specified directory so you can operate on it directly.
Generally, there will be a .linked file, which contains the LLVM IR shortly after it was generated by MLIR (though after steps like bitcode library linking where applicable), and a .optimized file, which contains the IR after the opt passes have been run. The .optimized file may include reproduction instructions (if it doesn't, the relevant compiler plugin hasn't been updated to add them). Similarly, the final generated assembly (a .s file, a .rocmasm file, and so on) may include reproduction instructions. Where those are present, they should be helpful in manually recreating the LLVM compilation so that you no longer have to route any changes through IREE.
For more details and other related flags, see the documentation on dumping intermediate files.
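As a minimal sketch of the dump step described above, the helper below wraps an iree-compile invocation with the intermediates flag. The helper name, the IREE_COMPILE override variable, the output file name, and the choice of --iree-hal-target-backends=llvm-cpu are assumptions for illustration; substitute whatever target-selection flags your checkout and workload actually use.

```shell
# Hypothetical helper: compile INPUT and dump executable intermediates
# (including the .linked and .optimized LLVM IR) into DUMP_DIR.
# IREE_COMPILE may be overridden to point at a specific build.
IREE_COMPILE="${IREE_COMPILE:-iree-compile}"

dump_llvm_ir() {
  input="$1"
  dump_dir="$2"
  mkdir -p "$dump_dir"
  # The backend flag below is an assumed example target, not a requirement.
  "$IREE_COMPILE" "$input" \
    --iree-hal-target-backends=llvm-cpu \
    --iree-hal-dump-executable-intermediates-to="$dump_dir" \
    -o "$dump_dir/module.vmfb"
}

# Usage (file names are placeholders):
#   dump_llvm_ir model.mlir /tmp/iree-dump
```

Keeping the dump directory separate from your build directory makes it easier to compare dumps from a working and a broken commit side by side.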
Tip
While opt is "target-independent", many passes (such as vectorization) have substantial dependencies on target information. Ensure your LLVM IR contains a target triple or that you're passing -mtriple= to your opt invocations. There are fewer dependencies on -mcpu=, but it should also be preserved to reduce debug variability.
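To illustrate the tip above, here is a hedged sketch of an opt invocation that passes the target information explicitly. The helper name, the OPT override variable, the default<O2> pipeline, and the example triple and CPU in the usage comment are assumptions; copy the real values from your dumped IR or from the reproduction instructions when they are present.

```shell
# Hypothetical helper: re-run a middle-end pipeline on a dumped module
# while pinning the target triple and CPU explicitly.
OPT="${OPT:-opt}"

run_opt_with_target() {
  ir="$1"
  triple="$2"
  cpu="$3"
  # -S prints textual IR; "-o -" sends it to standard output.
  "$OPT" -S -passes='default<O2>' \
    -mtriple="$triple" -mcpu="$cpu" \
    "$ir" -o -
}

# Usage (file name, triple, and CPU are placeholders):
#   run_opt_with_target module.linked.ll x86_64-unknown-linux-gnu znver4 > out.ll
```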
LLVM binaries
To create LLVM binaries built from the same LLVM commit that your IREE checkout uses, run cmake --build [build-directory] --target opt llc to produce binaries in [build-directory]/llvm-project/bin. You can similarly build other utility binaries, such as llvm-reduce, which aren't built by default.
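A small sketch of building several of these tools in one step follows. The helper name and the CMAKE override variable are hypothetical, and the sketch assumes that llvm-reduce and llvm-diff are exposed as targets in your build tree and that your CMake is new enough to accept several names after --target.

```shell
# Hypothetical helper: build the LLVM tools used in this playbook from an
# existing IREE build directory.
CMAKE="${CMAKE:-cmake}"

build_llvm_tools() {
  build_dir="$1"
  # Target availability is an assumption about your build configuration.
  "$CMAKE" --build "$build_dir" --target opt llc llvm-reduce llvm-diff
}

# Usage (the build directory is a placeholder):
#   build_llvm_tools ../iree-build
#   ls ../iree-build/llvm-project/bin
```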
Reducing optimization levels in IREE
If you suspect an LLVM bug, try disabling (or reducing) one or both optimization levels. LLVM has two places where a -O[n] level is applied: the middle-end (opt) and the backend/codegen (llc). The backend optimization level is selected by values of the llvm::CodeGenOptLevel enum, which is passed to a createTargetMachine call in the compiler plugin. This level defaults to -O3. On the other hand, the generic/middle-end opt optimization level is controlled by llvm::OptimizationLevel and currently defaults to -O2.
Setting one or both of these values to the -O0 or -O1 equivalent and seeing the issue go away is an indicator that there may be an LLVM bug in play. It may, however, also indicate that there's a race condition or other correctness issue in the generated LLVM IR that is masked by a lack of compiler optimizations.
Useful flags for opt and llc
- -print-after-all and -print-before/after=[passname] can help locate places where suspect IR is introduced or where crashes occur, just as their MLIR equivalents can be used in IREE.
- -print-module-scope ensures IR dumps include attributes and metadata if those are relevant.
- The exact process for feeding a binary back into IREE after manually compiling it is target-specific, but will generally involve --iree-hal-substitute-executable-object=[executable]=[filename].
- -global-isel=1 (changing the instruction selection system in llc) can be helpful in localizing a bug to instruction selection. If it solves your problem (or turns it into a different bug), you've substantially narrowed down the code that needs to be searched.
- opt produces human-readable output when passed the -S flag, and often needs -o - to send its results to standard output.
- llc takes a --filetype={asm,obj} argument to control whether assembly or assembled objects are produced.
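The flags in the list above can be combined roughly as follows. This is a sketch rather than a fixed recipe: the helper names, the OPT and LLC override variables, and the default<O2> pipeline are assumptions, and the file names in the usage comments are placeholders.

```shell
# Hypothetical helpers combining the opt and llc flags listed above.
OPT="${OPT:-opt}"
LLC="${LLC:-llc}"

# Run the middle-end and dump the whole module after every pass.
# Pass dumps are written to standard error (captured in LOG);
# the final textual IR goes to standard output.
trace_opt_passes() {
  ir="$1"
  log="$2"
  "$OPT" -S -passes='default<O2>' \
    -print-after-all -print-module-scope \
    "$ir" -o - 2> "$log"
}

# Produce assembly with GlobalISel enabled, to check whether a bug
# follows the instruction selector.
compile_with_global_isel() {
  ir="$1"
  asm="$2"
  "$LLC" -global-isel=1 --filetype=asm "$ir" -o "$asm"
}

# Usage (file names are placeholders):
#   trace_opt_passes module.linked.ll passes.log > module.opt.ll
#   compile_with_global_isel module.opt.ll module.gisel.s
```

Redirecting the pass dumps to a log file keeps them apart from the final IR, which makes it easier to search the log for the first pass after which the suspect IR appears.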
llvm-diff
When adjusting an opt invocation to isolate misbehaving passes or when comparing LLVM IR from a working and a broken commit, you may be able to use the llvm-diff tool to compare two LLVM IR files without the noise that is induced by LLVM's IR numbering scheme. Note that llvm-diff output is written to standard error and should be redirected with 2>&1.
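A minimal sketch of the redirect mentioned above, assuming two IR files produced from a working and a broken configuration. The helper name and the LLVM_DIFF override variable are hypothetical, and the file names in the usage comment are placeholders.

```shell
# Hypothetical helper: compare two LLVM IR files and merge llvm-diff's
# standard error into standard output so it can be piped or saved.
LLVM_DIFF="${LLVM_DIFF:-llvm-diff}"

diff_ir() {
  "$LLVM_DIFF" "$1" "$2" 2>&1
}

# Usage (file names are placeholders):
#   diff_ir good.optimized.ll bad.optimized.ll | less
```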
llvm-reduce
In some cases, particularly compiler crashes, the llvm-reduce program (part of LLVM) may be useful. It takes an LLVM IR file and an "interestingness script", which returns 0 (success) if a proposed reduced input still exhibits the problem and a nonzero status otherwise.
When writing such a script, the not tool (especially its not --crash [program] [args] mode) and FileCheck from the LLVM test suite are often useful.
For example, the interestingness script used to reduce the crash in issue #22001 was
#!/usr/bin/env bash
[llvm-bin]/not --crash [llvm-bin]/opt -passes='amdgpu-lower-buffer-fat-pointers' -disable-output "$@" 2>/dev/null
which was used with llvm-reduce -test=interesting.sh pre-buffer-loads.ll on output created by adding --print-before=amdgpu-lower-buffer-fat-pointers --print-module-scope to the crashing llc invocation (the crashing pass was located through backtraces and --print-after-all).
There is a helpful LLVM slide deck on how to operate llvm-reduce. The slides include other useful flags and tips.
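For bugs that are not crashes, an interestingness script can lean on FileCheck instead of not --crash. The sketch below writes such a script to disk; it is an illustration under stated assumptions, not the script from any particular issue. The pass name instcombine, the LLVM_BIN variable, and the suspect.check file (which would hold CHECK lines matching the bad output) are all hypothetical placeholders.

```shell
# Write a hypothetical interestingness script for a non-crash bug.
# The quoted 'EOF' keeps $1 and $0 literal inside the generated script.
cat > interesting.sh <<'EOF'
#!/usr/bin/env bash
# Exit 0 (interesting) only while the suspect pattern is still produced.
# LLVM_BIN, the pass name, and suspect.check are placeholders to adapt.
LLVM_BIN="${LLVM_BIN:?set LLVM_BIN to your LLVM bin directory}"
"$LLVM_BIN/opt" -S -passes='instcombine' "$1" -o - 2>/dev/null \
  | "$LLVM_BIN/FileCheck" "$(dirname "$0")/suspect.check"
EOF
chmod +x interesting.sh

# Then, with input.ll as a placeholder for the IR to reduce:
#   LLVM_BIN=[llvm-bin] llvm-reduce -test=interesting.sh input.ll
```

The script's exit status is that of FileCheck, the last command in the pipeline, so a reduced input counts as interesting only while the CHECK lines still match the output of opt.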
Creating reproducers
If you're planning to file a bug against LLVM, it's helpful to create a small reproducer. In many cases, an input that demonstrates the behavior you've identified as a bug can be created either with llvm-reduce or by hand.
However, in some cases (such as incorrect dispatch results that aren't clearly attributable to a particular change) all you can do is create a reproduction harness. The exact process for creating these is target-specific, but such a harness should be a piece of standalone code that links against / loads different versions of the misbehaving input, calls the function at issue, and reports the results (likely checking against a naive implementation).
This wrapper program should be accompanied by a simple build process that doesn't depend on IREE and instructions on how to run it. The build should produce binaries from LLVM IR (ideally the post-opt IR) by calling llc (or, if needed, opt).
If the bug goes away at different optimization levels, you should build a working and a non-working binary.
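One way to produce such a pair is sketched below, under the assumption that the bug appears at -O3 and disappears at -O0 in the backend; swap the levels for whichever combination you observed. The helper name, the LLC override variable, and the output file names are hypothetical.

```shell
# Hypothetical helper: build one object file per backend optimization level
# from the same post-opt IR, for use in a standalone reproduction harness.
LLC="${LLC:-llc}"

build_repro_pair() {
  ir="$1"
  out_dir="$2"
  mkdir -p "$out_dir"
  # Assumed here: -O0 hides the bug and -O3 exposes it.
  "$LLC" -O0 --filetype=obj "$ir" -o "$out_dir/working.o"
  "$LLC" -O3 --filetype=obj "$ir" -o "$out_dir/failing.o"
}

# Usage (file names are placeholders):
#   build_repro_pair module.optimized.ll repro/
```

Linking the harness against each object in turn keeps everything except the optimization level identical between the two runs.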