LLVM debugging playbook

This page collects notes on how to debug or reduce issues that appear to arise within LLVM itself, and on how to write useful LLVM bug reports.

This guide contains platform-independent notes applicable to both CPU and GPU compilation. Additional GPU-specific notes (such as how to perform binary substitutions in an AMD GPU context) are contained within the GPU debugging playbook.

Generating LLVM IR

When bisecting, reducing, or debugging an issue that might manifest within LLVM, it can be helpful to pass the --iree-hal-dump-executable-intermediates-to=[directory] flag (or the more general --iree-hal-dump-executable-files-to=[directory]) to iree-compile or iree-opt. These flags cause IREE to write the compiled LLVM module to the specified directory so you can operate on it directly.

Generally, there will be a .linked file, which contains the LLVM IR shortly after it was generated by MLIR (though after steps like bitcode library linking where applicable) and a .optimized file, which contains the IR after the opt passes have been run. The .optimized file may include reproduction instructions (if it doesn't, the relevant compiler plugin hasn't been updated to add them).

Similarly, the final generated assembly (a .s, .rocmasm, or similar file) may include reproduction instructions. Where those are present, they should help you recreate the LLVM compilation manually so that you no longer have to route changes through IREE.

For more details and other related flags, see the documentation on dumping intermediate files.
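As a minimal sketch of the dump step described above, the helper below wraps an iree-compile call. The function name dump_intermediates, the input file, the dump directory, and the out.vmfb output name are all hypothetical placeholders; any target-selection flags your build needs are passed through unchanged rather than guessed at here.

```shell
# Assumption: iree-compile is on PATH; override IREE_COMPILE otherwise.
IREE_COMPILE="${IREE_COMPILE:-iree-compile}"

# dump_intermediates INPUT DUMP_DIR [extra iree-compile flags...]
# Compiles INPUT and asks IREE to write the LLVM intermediates to DUMP_DIR.
dump_intermediates() {
  input="$1"
  dump_dir="$2"
  shift 2
  mkdir -p "$dump_dir"
  # out.vmfb is a placeholder output name.
  "$IREE_COMPILE" "$input" \
    --iree-hal-dump-executable-intermediates-to="$dump_dir" \
    "$@" -o out.vmfb
}

# Example (hypothetical file names):
#   dump_intermediates model.mlir dumps [your usual target flags]
```

After a successful run, the dump directory should contain the .linked and .optimized files discussed above.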

Tip

While opt is "target-independent", many passes (such as vectorization) have substantial dependencies on target information. Ensure your LLVM IR contains a target triple or that you're passing -mtriple= to your opt invocations. There are fewer dependencies on -mcpu=, but it should also be preserved to reduce debug variability.
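One way to keep the target information explicit is to wrap opt in a small helper that always requires a triple and CPU. This is only a sketch: the function name, the default binary path, and the file names are assumptions, and default<O2> stands in for whichever pipeline you are actually investigating.

```shell
# Assumption: LLVM binaries live under build/llvm-project/bin; adjust as needed.
LLVM_BIN="${LLVM_BIN:-build/llvm-project/bin}"
OPT="${OPT:-$LLVM_BIN/opt}"

# opt_with_target INPUT TRIPLE CPU OUTPUT
# Runs opt with an explicit target triple and CPU so target-dependent
# passes see the same target information on every run.
opt_with_target() {
  "$OPT" -S -mtriple="$2" -mcpu="$3" -passes='default<O2>' "$1" -o "$4"
}

# Example (hypothetical file names and target values):
#   opt_with_target dispatch.ll x86_64-unknown-linux-gnu znver4 dispatch.opt.ll
```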

LLVM binaries

To create LLVM binaries that run on the same commit as your IREE checkout, use

cmake --build [build-directory] --target opt llc

to produce binaries in [build-directory]/llvm-project/bin. You can similarly produce other utility binaries such as llvm-reduce, which aren't built by default.

Reducing optimization levels in IREE

If you suspect an LLVM bug, try disabling (or reducing) one or both optimization levels. LLVM has two places where a -O[n] is applied: the middle-end (opt) and the backend/codegen (llc). The backend optimization level is selected by values of the llvm::CodeGenOptLevel enum, which is passed to a createTargetMachine call in the compiler plugin. This level defaults to -O3. On the other hand, the generic/middle-end opt optimization level is controlled by llvm::OptimizationLevel and defaults to -O2 currently.

If the issue goes away after setting one or both of these values to the -O0 or -O1 equivalent, that suggests an LLVM bug may be in play. It may, however, also indicate a race condition or other correctness issue in the generated LLVM IR that is masked when compiler optimizations are reduced.

Useful flags for opt and llc

  • -print-after-all and -print-before/after=[passname] can help locate places where suspect IR is introduced or where crashes occur, just as their MLIR equivalents can be used in IREE.
  • -print-module-scope ensures IR dumps include attributes and metadata if those are relevant.
  • The exact process for feeding a binary back into IREE after manually compiling it is target-specific, but will generally involve --iree-hal-substitute-executable-object=[executable]=[filename].
  • -global-isel=1 (changing the instruction selection system in llc) can be helpful in localizing a bug to instruction selection. If it solves your problem (or turns it into a different bug), you've substantially narrowed down the code that needs to be searched.
  • opt produces human-readable output when passed the -S flag, and often needs -o - to send its results to standard output. llc takes a --filetype={asm,obj} argument to control whether assembly or object files are produced.
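The flags above can be combined roughly as follows. This is a sketch under assumptions: the helper names, binary paths, and output file names are hypothetical, default<O2> stands in for your real pipeline, and -global-isel=1 may not be supported for every target.

```shell
# Assumption: LLVM binaries live under build/llvm-project/bin; adjust as needed.
LLVM_BIN="${LLVM_BIN:-build/llvm-project/bin}"
OPT="${OPT:-$LLVM_BIN/opt}"
LLC="${LLC:-$LLVM_BIN/llc}"

# print_around_pass INPUT PASSNAME DUMPFILE
# Saves whole-module IR dumps from before and after one pass to DUMPFILE.
# The dumps are written to standard error, hence the 2> redirect.
print_around_pass() {
  "$OPT" -passes='default<O2>' \
    -print-before="$2" -print-after="$2" -print-module-scope \
    -disable-output "$1" 2> "$3"
}

# compare_isel INPUT PREFIX
# Emits assembly with the default instruction selector and with GlobalISel
# so the two results can be compared.
compare_isel() {
  "$LLC" --filetype=asm "$1" -o "$2.default.s"
  "$LLC" --filetype=asm -global-isel=1 "$1" -o "$2.gisel.s"
}

# Example (hypothetical names):
#   print_around_pass dispatch.ll instcombine instcombine-dump.txt
#   compare_isel dispatch.opt.ll dispatch
```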

llvm-diff

When adjusting an opt invocation to isolate misbehaving passes or when comparing LLVM IR from a working and a broken commit, you may be able to use the llvm-diff tool to compare two LLVM IR files without the noise that is induced by LLVM's IR numbering scheme. Note that llvm-diff output is written to standard error and should be redirected with a 2>&1.
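A small wrapper makes the redirect hard to forget. The function name, binary path, and file names here are hypothetical; the sketch only assumes that llvm-diff accepts two IR files as positional arguments.

```shell
# Assumption: llvm-diff was built alongside opt and llc; adjust the path as needed.
LLVM_DIFF="${LLVM_DIFF:-build/llvm-project/bin/llvm-diff}"

# diff_ir GOOD_IR BAD_IR REPORT
# Compares two IR files and captures the report, which llvm-diff writes to
# standard error, in REPORT. Returns llvm-diff's own exit status.
diff_ir() {
  "$LLVM_DIFF" "$1" "$2" > "$3" 2>&1
}

# Example (hypothetical file names):
#   diff_ir good.optimized.ll bad.optimized.ll ir-diff.txt
```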

llvm-reduce

In some cases, particularly compiler crashes, the llvm-reduce program (part of LLVM) may be useful. It takes an LLVM IR file and an "interestingness script" that returns 0 (success) if a proposed reduced input still exhibits the problem and a non-zero status otherwise.

When writing such a script, the not tool (especially its not --crash [program] [args] mode) and FileCheck from the LLVM test suite are often useful.

For example, the interestingness script used to reduce the crash in issue #22001 was

#!/usr/bin/env bash

[llvm-bin]/not --crash [llvm-bin]/opt -passes='amdgpu-lower-buffer-fat-pointers' -disable-output "$@" 2>/dev/null

which was used with llvm-reduce -test=interesting.sh pre-buffer-loads.ll on output created by adding --print-before=amdgpu-lower-buffer-fat-pointers --print-module-scope to the crashing llc invocation (the crashing pass was located through backtraces and --print-after-all).
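For failures that are reported as an error message rather than a crash, the same idea can be sketched with a text match on the tool's output. This is a hypothetical variant, not the script from the issue above: the function name, the PATTERN default, and the binary path are assumptions, and plain grep stands in for FileCheck to keep the sketch short.

```shell
# Assumption: LLVM binaries live under build/llvm-project/bin; adjust as needed.
LLVM_BIN="${LLVM_BIN:-build/llvm-project/bin}"
LLC="${LLC:-$LLVM_BIN/llc}"
# Hypothetical default; set PATTERN to the message you are chasing.
PATTERN="${PATTERN:-LLVM ERROR}"

# is_interesting CANDIDATE_IR
# Succeeds (status 0) only if llc's combined output still contains PATTERN,
# which is the convention llvm-reduce expects from an interestingness test.
is_interesting() {
  "$LLC" "$1" -o /dev/null 2>&1 | grep -qF -- "$PATTERN"
}

# In an interestingness script, the last line would be:
#   is_interesting "$@"
```

Matching on a specific message keeps llvm-reduce from drifting toward an unrelated failure as it shrinks the input.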

This is a helpful LLVM slide deck on how to operate llvm-reduce. These slides include other useful flags and tips.

Creating reproducers

If you're planning to file a bug against LLVM, it's helpful to create a small reproducer.

In many cases, an input that demonstrates the behavior you've identified as a bug can be either created with llvm-reduce or by hand.

However, in some cases (such as incorrect dispatch results that aren't clearly attributable to a particular change) all you can do is create a reproduction harness. The exact process for creating these is target-specific, but such a harness should be a piece of standalone code that links against / loads different versions of the misbehaving input, calls the function at issue, and reports the results (likely checking against a naive implementation).

This wrapper program should be accompanied by a simple build process that doesn't depend on IREE and by instructions on how to run it. The build should produce binaries from LLVM IR (ideally the post-opt IR) by calling llc (or, if needed, opt).

If the bug goes away at different optimization levels, you should build a working and a non-working binary.
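A build step for such a pair of binaries might look like the sketch below. It assumes the behavior changes with the llc optimization level and that -O0 is the working configuration; the function name, binary path, and file names are hypothetical, and a real harness would also pass the target flags from the reproduction instructions.

```shell
# Assumption: LLVM binaries live under build/llvm-project/bin; adjust as needed.
LLVM_BIN="${LLVM_BIN:-build/llvm-project/bin}"
LLC="${LLC:-$LLVM_BIN/llc}"

# build_variants POST_OPT_IR PREFIX
# Produces one object file per backend optimization level so the harness
# can link against the working and the non-working version.
build_variants() {
  "$LLC" -O0 --filetype=obj "$1" -o "$2.O0.o"
  "$LLC" -O3 --filetype=obj "$1" -o "$2.O3.o"
}

# Example (hypothetical file names):
#   build_variants dispatch.optimized.ll dispatch
```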