Fuzzing with libFuzzerlink
libFuzzer is a coverage-guided fuzzing engine provided by LLVM. It generates random inputs and mutates them based on code coverage feedback to find crashes, hangs, and memory errors.
IREE provides build infrastructure for creating libFuzzer-based fuzz targets that integrate with the existing build system.
When to use fuzzinglink
Fuzzing is most effective for:
- Parsers and decoders (UTF-8, binary formats, etc.)
- Serialization/deserialization code
- Input validation logic
- Any code that processes untrusted or external data
Enabling fuzzing buildslink
Bazellink
bazel build --config=fuzzer //runtime/src/iree/base/internal:unicode_fuzz
The --config=fuzzer flag enables coverage instrumentation and ASan.
CMakelink
cmake -B build -DIREE_ENABLE_FUZZING=ON -DCMAKE_BUILD_TYPE=RelWithDebInfo
cmake --build build --target unicode_fuzz
Fuzzing automatically enables ASan. Fuzz targets are excluded from the default
all target and must be built explicitly.
Running fuzz targetslink
Fuzz targets are standalone executables that accept libFuzzer arguments:
# Run indefinitely (Ctrl+C to stop)
./build/runtime/src/iree/base/internal/unicode_fuzz
# Run for 60 seconds
./build/runtime/src/iree/base/internal/unicode_fuzz -max_total_time=60
# Use a corpus directory (recommended)
mkdir -p corpus/unicode
./build/runtime/src/iree/base/internal/unicode_fuzz corpus/unicode/
Common optionslink
| Option | Description |
|---|---|
-max_total_time=N |
Stop after N seconds |
-max_len=N |
Maximum input size in bytes |
-timeout=N |
Per-input timeout in seconds (0 to disable) |
-jobs=N |
Run N parallel fuzzing jobs |
-workers=N |
Number of worker processes for parallel fuzzing |
-dict=file |
Use a dictionary file for structured inputs |
-seed=N |
Use specific random seed for reproducibility |
See libFuzzer documentation for all options.
Writing fuzz targetslink
Fuzz targets implement the LLVMFuzzerTestOneInput function:
// my_fuzz.cc
#include <stddef.h>
#include <stdint.h>
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
// Process the fuzzer-generated input
my_function_under_test(data, size);
return 0; // Always return 0
}
Adding to the build systemlink
In BUILD.bazel:
load("//build_tools/bazel:build_defs.oss.bzl", "iree_runtime_cc_fuzz")
iree_runtime_cc_fuzz(
name = "my_fuzz",
srcs = ["my_fuzz.cc"],
deps = [
":my_library",
],
)
Then run python build_tools/bazel_to_cmake/bazel_to_cmake.py to generate the
CMake equivalent.
Best practiceslink
Maintain a corpuslink
Store interesting inputs in a corpus directory. The fuzzer uses existing corpus entries as seeds for mutation:
mkdir -p corpus/my_fuzz
./my_fuzz corpus/my_fuzz/ -max_total_time=3600
After finding bugs, minimize the corpus to remove redundant entries:
mkdir corpus/my_fuzz_minimized
./my_fuzz -merge=1 corpus/my_fuzz_minimized/ corpus/my_fuzz/
Add unit tests for found bugslink
When fuzzing discovers a crash:
- Minimize the reproducer:
./my_fuzz -minimize_crash=1 crash-xxx - Add the minimized input as a unit test case
- Fix the bug
- Verify the fix with the original crash input
This prevents regressions and documents the bug.
Use dictionaries for structured formatslink
For inputs with specific syntax (protocols, file formats), provide a dictionary:
# my_dict.txt
"keyword1"
"keyword2"
"\x00\x01\x02"
./my_fuzz -dict=my_dict.txt corpus/
Troubleshootinglink
Fuzzer runs slowlylink
- Ensure
CMAKE_BUILD_TYPE=RelWithDebInfoorRelease(Debug is very slow) - Check that the target doesn't do excessive I/O or allocations per iteration
- Use
-jobs=Nfor parallel fuzzing on multi-core machines
Out of memorylink
- Limit input size with
-max_len=N - Add early returns for oversized inputs in your fuzz target
- Use
-rss_limit_mb=Nto set memory limits
No new coveragelink
- Verify the target actually processes the input
- Check that coverage instrumentation is enabled (
-fsanitize=fuzzer-no-link) - Try seeding with representative inputs in the corpus
Timeout errorslink
libFuzzer kills inputs that take too long (default 1200 seconds). If you see
ALARM: working on the last Unit for N seconds followed by a timeout:
- Use
-timeout=Nto adjust the per-input timeout (in seconds) - Use
-timeout=0to disable timeouts entirely (useful for debugging) - Check if certain inputs cause algorithmic complexity issues (e.g., pathological regex patterns, deeply nested structures)