Testing guide

Like the IREE project in general, IREE tests are divided into a few different components and use different tooling depending on the needs of that component.

Test type	Test	Build system	Supported platforms
Compiler tests	iree_lit_test	Bazel/CMake	Host
Runtime tests	iree_cc_test	Bazel/CMake	Host/Device
	iree_native_test	Bazel/CMake	Host/Device
	iree_hal_cts_test_suite	CMake	Host/Device
Core E2E tests	iree_check_test	Bazel/CMake	Host/Device
	iree_static_linker_test	CMake	Host/Device

There are also several *_test_suite targets that group test targets with the same configuration together.

Compiler tests

Tests for the IREE compilation pipeline are written as lit tests in the same style as MLIR.

By convention, IREE includes tests for

printing and parsing of ops in .../IR/test/{OP_CATEGORY}_ops.mlir files
folding and canonicalization in .../IR/test/{OP_CATEGORY}_folding.mlir files
compiler passes and pipelines in other .../test/*.mlir files

Running a test

For the test iree/compiler/Dialect/VM/Conversion/MathToVM/test/arithmetic_ops.mlir:

With CMake, run this from the build directory:

ctest -R iree/compiler/Dialect/VM/Conversion/MathToVM/test/arithmetic_ops.mlir.test

With Bazel, run this from the repo root:

bazel test //compiler/src/iree/compiler/Dialect/VM/Conversion/MathToVM/test:arithmetic_ops.mlir.test
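
The two commands above follow a regular pattern, so they can be derived from the test path. The sketch below is only an illustration of that pattern: the helper name lit_test_commands is hypothetical, and it assumes the compiler sources live under compiler/src/ as in the Bazel label shown above.

```python
from pathlib import PurePosixPath


def lit_test_commands(test_path: str) -> dict:
    """Build CTest and Bazel command lines for one lit test.

    `test_path` is the path of the .mlir file relative to compiler/src/.
    This is a hypothetical helper, not part of the IREE tooling.
    """
    path = PurePosixPath(test_path)
    # CTest names the test after the file path with a ".test" suffix.
    ctest_name = f"{path}.test"
    # Bazel uses the containing package plus "<file>.test" as the target.
    bazel_label = f"//compiler/src/{path.parent}:{path.name}.test"
    return {
        "ctest": ["ctest", "-R", ctest_name],
        "bazel": ["bazel", "test", bazel_label],
    }


commands = lit_test_commands(
    "iree/compiler/Dialect/VM/Conversion/MathToVM/test/arithmetic_ops.mlir"
)
for tool, argv in commands.items():
    print(tool, "->", " ".join(argv))
```

Note that ctest -R takes a regular expression, so the dots in the test name match any character and the pattern may select more than one test.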

Writing a test

For advice on writing MLIR compiler tests, see the MLIR testing guide. Tests should be .mlir files in a test directory adjacent to the functionality they are testing. Instead of mlir-opt, use iree-opt, which registers IREE dialects and passes and doesn't register some unnecessary core ones.

As with most parts of the IREE compiler, these should not have a dependency on the runtime.

Configuring the build system

In the Bazel BUILD file, create an iree_lit_test_suite rule. We usually create a single suite that globs all .mlir files in the directory and is called "lit".

load("//build_tools/bazel:iree_lit_test.bzl", "iree_lit_test_suite")

iree_lit_test_suite(
    name = "lit",
    srcs = glob(["*.mlir"]),
    tools = [
        "@llvm-project//llvm:FileCheck",
        "//tools:iree-opt",
    ],
)

There is a corresponding CMake function, calls to which will be generated by our Bazel to CMake converter.

iree_lit_test_suite(
  NAME
    lit
  SRCS
    "arithmetic_ops.mlir"
  DATA
    FileCheck
    iree-opt
)

You can also create a test for a single file with iree_lit_test.

Runtime tests

Tests for the runtime C++ code use the GoogleTest testing framework. They should generally follow the style and best practices of that framework.

Running a test

For the test /runtime/src/iree/base/bitfield_test.cc:

With CMake, run this from the build directory:

ctest -R iree/base/bitfield_test

With Bazel, run this from the repo root:

bazel test //runtime/src/iree/base:bitfield_test

Setting test environments

Parallel testing for ctest can be enabled via the CTEST_PARALLEL_LEVEL environment variable. For example:

export CTEST_PARALLEL_LEVEL=$(nproc)
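
When tests are launched from a script rather than an interactive shell, the same variable can be set on the child process environment. The following is a minimal sketch; the helper name ctest_env is hypothetical, and os.cpu_count() is used as a rough stand-in for nproc (it reports all logical CPUs and does not account for CPU affinity limits the way nproc does).

```python
import os


def ctest_env(parallel_level=None):
    """Return a copy of the environment with CTEST_PARALLEL_LEVEL set.

    Defaults to the number of logical CPUs, falling back to 1 when the
    count cannot be determined.
    """
    env = dict(os.environ)
    level = parallel_level or os.cpu_count() or 1
    env["CTEST_PARALLEL_LEVEL"] = str(level)
    return env


env = ctest_env()
print("CTEST_PARALLEL_LEVEL =", env["CTEST_PARALLEL_LEVEL"])
```

The returned mapping can be passed as the env argument of subprocess.run when invoking ctest from the build directory.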

To use the Vulkan backend as test driver, you may need to select between a Vulkan implementation from SwiftShader and multiple Vulkan-capable hardware devices. This can be done via environment variables. See the generic Vulkan setup page for details regarding these variables.

For Bazel, you can persist the configuration in user.bazelrc to save typing. For example:

test:vkswiftshader --test_env="LD_LIBRARY_PATH=..."
test:vkswiftshader --test_env="VK_LAYER_PATH=..."
test:vknative --test_env="LD_LIBRARY_PATH=..."
test:vknative --test_env="VK_LAYER_PATH=..."

Then you can use bazel test --config=vkswiftshader to select SwiftShader as the Vulkan implementation. Similarly for other implementations.

Writing a test

For advice on writing tests in the GoogleTest framework, see the GoogleTest primer. Test files for source file foo.cc with build target foo should live in the same directory with source file foo_test.cc and build target foo_test. You should #include iree/testing/gtest.h instead of any of the gtest or gmock headers.

As with all parts of the IREE runtime, these should not have a dependency on the compiler.

Configuring the build system

In the Bazel BUILD file, create a cc_test target with your test file as the source and any necessary dependencies. Usually, you can link in a standard gtest main function. Use iree/testing:gtest_main instead of the gtest_main that comes with gtest.

cc_test(
    name = "arena_test",
    srcs = ["arena_test.cc"],
    deps = [
        ":arena",
        "//iree/testing:gtest_main",
    ],
)

We have created a corresponding CMake function iree_cc_test that mirrors the Bazel rule's behavior. Our Bazel to CMake converter should generally derive the CMakeLists.txt file from the BUILD file:

iree_cc_test(
  NAME
    arena_test
  SRCS
    "arena_test.cc"
  DEPS
    ::arena
    iree::testing::gtest_main
)

There are other, more specific test targets, such as iree_hal_cts_test_suite, which is designed to test specific runtime support with template configuration and is not supported by Bazel rules.

Code Coverage

Use the IREE_ENABLE_RUNTIME_COVERAGE CMake option to enable code coverage instrumentation and add synthetic targets for managing profiling state. Tests run with coverage enabled will automatically write profiles to the build directory. The iree-runtime-coverage-export target can then be built to export LCOV information for tooling/IDEs.

IREE core end-to-end (e2e) tests

Here "end-to-end" means from the input accepted by the IREE core compiler (dialects like TOSA, StableHLO, Linalg) to execution using the IREE runtime components. It does not include tests of the integrations with ML frameworks (e.g. TensorFlow, PyTorch) or bindings to other languages (e.g. Python).

We avoid using the more traditional lit tests used elsewhere in the compiler for runtime execution tests. Lit tests require running the compiler tools on the test platform through shell or python scripts that act on files from a local file system. On platforms like Android, the web, and embedded systems, each of these features is either not available or is severely limited.

Instead, to test these flows we use a custom framework called check. The check framework compiles test programs on the host machine into standalone test binary files that can be pushed to test devices (such as Android phones) where they run with gtest style assertions (e.g. check.expect_almost_eq(lhs, rhs)).

Building e2e tests

The files needed by these tests are not built by default with CMake. You'll need to build the special iree-test-deps target to generate test files prior to running CTest (from the build directory):

cmake --build . --target iree-test-deps

Running a test

For the test tests/e2e/stablehlo_ops/floor.mlir compiled for the VMVX target backend and run with the local-task driver (in principle, there is a many-to-many mapping from backends to drivers):

With CMake, run this from the build directory:

ctest -R tests/e2e/stablehlo_ops/check_vmvx_local-task_floor.mlir

With Bazel, run this from the repo root:

bazel test tests/e2e/stablehlo_ops:check_vmvx_local-task_floor.mlir

Setting test environments

Similarly, you can use environment variables to select Vulkan implementations for running tests as explained in the Runtime tests section.

Writing a test

These tests live in tests/e2e. A single test consists of a .mlir source file specifying an IREE module where each exported function takes no inputs and returns no results and corresponds to a single test case.

As an example, here are some tests for the MHLO floor operation:

func.func @tensor() {
  %input = util.unfoldable_constant dense<[0.0, 1.1, 2.5, 4.9]> : tensor<4xf32>
  %result = "mhlo.floor"(%input) : (tensor<4xf32>) -> tensor<4xf32>
  check.expect_almost_eq_const(%result, dense<[0.0, 1.0, 2.0, 4.0]> : tensor<4xf32>): tensor<4xf32>
  return
}

func.func @scalar() {
  %input = util.unfoldable_constant dense<101.3> : tensor<f32>
  %result = "mhlo.floor"(%input) : (tensor<f32>) -> tensor<f32>
  check.expect_almost_eq_const(%result, dense<101.0> : tensor<f32>): tensor<f32>
  return
}

func.func @negative() {
  %input = util.unfoldable_constant dense<-1.1> : tensor<f32>
  %result = "mhlo.floor"(%input) : (tensor<f32>) -> tensor<f32>
  check.expect_almost_eq_const(%result, dense<-2.0> : tensor<f32>): tensor<f32>
  return
}

Test cases are created in gtest for each public function exported by the module.

Note the use of util.unfoldable_constant to specify test constants. If we were to use a regular constant, the compiler would fold away everything at compile time and our test would not actually test the runtime. unfoldable_constant adds a barrier that prevents folding. To prevent folding or constant propagation on an arbitrary SSA value, you can use util.optimization_barrier.

Next we use this input constant to exercise the runtime feature under test (in this case, just a single floor operation). Finally, we use a check dialect operation to make an assertion about the output. There are a few different assertion operations. Here we use the expect_almost_eq_const op: "almost" because we are comparing floats and want to allow for floating-point imprecision, and "const" because we want to compare the result to a constant value. This last part is just syntactic sugar around:

%expected = arith.constant dense<101.0> : tensor<f32>
check.expect_almost_eq(%result, %expected) : tensor<f32>
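
To make the "almost equal" idea concrete, here is a minimal Python sketch of an elementwise comparison with a tolerance. It is not the check dialect's implementation: the function name mirrors the op only for readability, and the tolerance of 1e-4 is an assumed value chosen for illustration.

```python
import math


def expect_almost_eq(actual, expected, tolerance=1e-4):
    """Raise AssertionError unless both sequences match elementwise.

    `tolerance` is an assumed absolute tolerance, not the value used by
    the IREE check framework.
    """
    if len(actual) != len(expected):
        raise AssertionError(f"length mismatch: {len(actual)} vs {len(expected)}")
    for index, (a, e) in enumerate(zip(actual, expected)):
        if not math.isclose(a, e, rel_tol=0.0, abs_tol=tolerance):
            raise AssertionError(f"element {index}: {a} is not close to {e}")


# Mirrors the @tensor test case above, using Python's floor on the host.
inputs = [0.0, 1.1, 2.5, 4.9]
expect_almost_eq([math.floor(x) for x in inputs], [0.0, 1.0, 2.0, 4.0])
```

An exact comparison would be too strict for most floating-point results, since different backends may round intermediate values differently.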

The output of running this test looks like:

[==========] Running 4 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 4 tests from module
[ RUN      ] module.tensor
[       OK ] module.tensor (76 ms)
[ RUN      ] module.scalar
[       OK ] module.scalar (79 ms)
[ RUN      ] module.double
[       OK ] module.double (55 ms)
[ RUN      ] module.negative
[       OK ] module.negative (54 ms)
[----------] 4 tests from module (264 ms total)

[----------] Global test environment tear-down
[==========] 4 tests from 1 test suite ran. (264 ms total)
[  PASSED  ] 4 tests.

The "module" name for the test suite comes from the default name for an implicit MLIR module. To give the test suite a more descriptive name, use an explicit named top-level module in this file.

Configuring the build system

A single .mlir source file can be turned into a test target with the iree_check_test Bazel macro (and corresponding CMake function).

load("//build_tools/bazel:iree_check_test.bzl", "iree_check_test")

iree_check_test(
    name = "check_vmvx_local-task_floor.mlir",
    src = "floor.mlir",
    driver = "local-task",
    target_backend = "vmvx",
)

The target naming convention is "check_backend_driver_src". The generated test will automatically be tagged with a "driver=local-task" tag, which can help filter tests by driver (especially when many tests are generated, as below).
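
The naming convention can be written out as a small helper. This is a sketch only: check_test_name is a hypothetical function, not something provided by the IREE build macros.

```python
def check_test_name(backend: str, driver: str, src: str) -> str:
    """Compose a target name following "check_backend_driver_src"."""
    return f"check_{backend}_{driver}_{src}"


name = check_test_name("vmvx", "local-task", "floor.mlir")
print(name)
```

For the floor.mlir example, this yields the same target name used in the ctest and bazel test commands in the "Running a test" section.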

Usually we want to create a suite of tests across many backends and drivers. This can be accomplished with additional macros. For a single backend/driver pair:

load("//build_tools/bazel:iree_check_test.bzl", "iree_check_single_backend_test_suite")

iree_check_single_backend_test_suite(
    name = "check_vmvx_local-task",
    srcs = glob(["*.mlir"]),
    driver = "local-task",
    target_backend = "vmvx",
)

This will generate a separate test target for each file in srcs with a name following the convention above as well as a Bazel test_suite called "check_vmvx_local-task" that will run all the generated tests.

You can also generate suites across multiple pairs:

load("//build_tools/bazel:iree_check_test.bzl", "iree_check_test_suite")

iree_check_test_suite(
    name = "check",
    srcs = ["success.mlir"],
    # Leave this argument off to run on all supported backend/driver pairs.
    target_backends_and_drivers = [
        ("vmvx", "local-task"),
        ("vulkan-spirv", "vulkan"),
    ],
)

This will create a test per source file and backend/driver pair, a test suite per backend/driver pair, and a test suite, "check", that will run all the tests.
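
The expansion described above can be sketched in Python to show which targets to expect. The function expand_check_suite is hypothetical, and the per-pair suite names are an assumption based on the "check_vmvx_local-task" suite name used by the single-backend macro earlier in this section.

```python
def expand_check_suite(name, srcs, backends_and_drivers):
    """Enumerate the targets a multi-pair check suite would generate.

    Returns {top_level_suite: {per_pair_suite: [test targets]}}.
    The naming of per-pair suites is an assumption for illustration.
    """
    per_pair = {}
    for backend, driver in backends_and_drivers:
        suite = f"{name}_{backend}_{driver}"
        # One test target per source file within each backend/driver pair.
        per_pair[suite] = [f"{suite}_{src}" for src in srcs]
    return {name: per_pair}


targets = expand_check_suite(
    "check",
    ["success.mlir"],
    [("vmvx", "local-task"), ("vulkan-spirv", "vulkan")],
)
for suite, tests in targets["check"].items():
    print(suite, "->", tests)
```

With N source files and M backend/driver pairs, this gives N x M test targets, M per-pair suites, and one top-level suite.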

The CMake functions follow a similar pattern. The calls to them are generated in our CMakeLists.txt file by bazel_to_cmake.

There are other test targets that generate tests based on template configuration and platform detection, such as iree_static_linker_test. Those targets are not supported by Bazel rules at this point.

External test suites

iree-test-suites

Multiple test suites are under development in the iree-org/iree-test-suites repository.

Many program, input, and output files are too large to store directly in Git, especially in a monorepo, so test suites may use Git LFS, cloud storage, and persistent caches on test machines as needed.
Keeping tests out of tree forces them to use public project APIs and allows the core project to keep its infrastructure simpler.

ONNX operator tests

Tests for individual ONNX operators are included at onnx_ops/ in the iree-org/iree-test-suites repository. These tests are generated from the upstream tests at onnx/backend/test/data/node/ in the onnx/onnx repository.

Testing ONNX programs follows several stages:

graph LR
  Import -. "(offline)" .-> Compile
  Compile --> Run

This particular test suite treats importing as an offline step and contains test cases organized into folders of programs, inputs, and expected outputs:

Sample test case directory

test_case_name/
  model.mlir
  input_0.bin
  output_0.bin
  run_module_io_flags.txt

Sample run_module_io_flags.txt

--input=2x3xf32=@input_0.bin
--expected_output=2x3xf32=@output_0.bin

Each test case can be run using a sequence of commands like:

iree-compile model.mlir {flags} -o model.vmfb
iree-run-module --module=model.vmfb --flagfile=run_module_io_flags.txt

To run slices of the test suite, a pytest runner is included that can be configured using JSON files. The JSON files tested in the IREE repo itself are stored in tests/external/iree-test-suites/onnx_ops/. For example, here is part of a config file for running ONNX operator tests on CPU:

tests/external/iree-test-suites/onnx_ops/onnx_ops_cpu_llvm_sync.json
{
  "config_name": "cpu_llvm_sync",
  "iree_compile_flags": [
    "--iree-hal-target-device=local",
    "--iree-hal-local-target-device-backends=llvm-cpu",
    "--iree-input-demote-f64-to-f32=false"
  ],
  "iree_run_module_flags": [
    "--device=local-sync"
  ],
  "skip_compile_tests": [
    "onnx/node/generated/test_dequantizelinear",
    "onnx/node/generated/test_einsum_inner_prod",
    "onnx/node/generated/test_group_normalization_epsilon_expanded",
    "onnx/node/generated/test_group_normalization_example_expanded"
  ],
  "skip_run_tests": [
    "onnx/node/generated/test_gridsample_zeros_padding"
  ],
  "expected_compile_failures": [

Updating config files

If the ONNX operator tests fail on a GitHub Actions workflow, check the logs for the nature of the failure. Often, a test is newly passing, with logs like this:

=================================== FAILURES ===================================
_ IREE compile and run: test_mod_uint64::model.mlir::model.mlir::cpu_llvm_sync _
[gw1] linux -- Python 3.11.9 /home/runner/work/iree/iree/venv/bin/python
[XPASS(strict)] Expected run to fail (included in 'expected_run_failures')

The workflow job that failed should then upload a new config file as an "Artifact", which can be downloaded from the action run summary page and then committed.
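
The reason a newly passing test shows up as a failure is the strict handling of expected failures. The sketch below illustrates that logic using the config keys that appear in this section; classify_outcome is a hypothetical function, and the exact behavior of the real pytest runner is an assumption here.

```python
def classify_outcome(test_name, config, compiled_ok, ran_ok):
    """Return a pytest-style outcome label for one test case.

    "xpass(strict)" means the test passed although it was listed as an
    expected failure, which strict mode reports as a failure.
    """
    if test_name in config.get("skip_compile_tests", []):
        return "skipped"

    expect_compile_failure = test_name in config.get("expected_compile_failures", [])
    if not compiled_ok:
        return "xfailed" if expect_compile_failure else "failed"
    if expect_compile_failure:
        return "xpass(strict)"

    if test_name in config.get("skip_run_tests", []):
        return "skipped"

    expect_run_failure = test_name in config.get("expected_run_failures", [])
    if not ran_ok:
        return "xfailed" if expect_run_failure else "failed"
    return "xpass(strict)" if expect_run_failure else "passed"


# Illustrative config entry; the real config files list many more tests.
config = {"expected_run_failures": ["onnx/node/generated/test_mod_uint64"]}
outcome = classify_outcome(
    "onnx/node/generated/test_mod_uint64", config, compiled_ok=True, ran_ok=True
)
print(outcome)
```

Under these assumptions, removing the test from expected_run_failures in the config file turns the outcome back into a plain pass, which is what the regenerated config file accomplishes.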

ONNX model tests

Tests for ONNX models are included at onnx_models/ in the iree-org/iree-test-suites repository. These tests use models from the upstream onnx/models repository.

Like the ONNX operator tests, the ONNX model tests use configuration files to control which flags are used and which tests are run. The config files tested in the IREE repo itself are stored in tests/external/iree-test-suites/onnx_models/. For example, here is part of a config file for running ONNX model tests on CPU:

tests/external/iree-test-suites/onnx_models/onnx_models_cpu_llvm_task.json
{
  "config_name": "cpu_llvm_task",
  "iree_compile_flags": [
    "--iree-hal-target-device=local",
    "--iree-hal-local-target-device-backends=llvm-cpu",
    "--iree-llvmcpu-target-cpu=host"
  ],
  "iree_run_module_flags": [
    "--device=local-task"
  ],
  "tests_and_expected_outcomes": {
    "default": "skip",
    "tests/model_zoo/validated/vision/body_analysis_models_test.py::test_models[age_gender/models/age_googlenet.onnx]": "pass",
    "tests/model_zoo/validated/vision/body_analysis_models_test.py::test_models[age_gender/models/gender_googlenet.onnx]": "pass",

Unlike the ONNX operator tests, we do not run the full set of tests on every commit to iree-org/iree. Instead, we run a curated list of small tests that are expected to pass in iree-org/iree and then run the full set of tests nightly in iree-org/iree-test-suites.
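
The "default" entry in tests_and_expected_outcomes is what makes a curated list possible: any test not named explicitly falls back to the default outcome. A minimal sketch of that lookup follows. The function expected_outcome is hypothetical, the fallback to "skip" when no default is present is an assumption, and the test IDs in the example are shortened placeholders rather than real entries.

```python
def expected_outcome(test_id, tests_and_expected_outcomes):
    """Look up the expected outcome for a test, honoring "default"."""
    # Assumption: treat a missing "default" entry as "skip".
    default = tests_and_expected_outcomes.get("default", "skip")
    return tests_and_expected_outcomes.get(test_id, default)


# Placeholder IDs for illustration only.
outcomes = {
    "default": "skip",
    "tests/example_test.py::test_models[small_model.onnx]": "pass",
}
print(expected_outcome("tests/example_test.py::test_models[small_model.onnx]", outcomes))
print(expected_outcome("tests/example_test.py::test_models[large_model.onnx]", outcomes))
```

With a default of "skip", adding a model to the curated list is a one-line change to the config file.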

sharktank tests

Tests for small scale versions of Large Language Models (LLMs) and other Generative AI (GenAI) programs exported using the sharktank package built as part of the shark-ai project are included at sharktank_models/ in the iree-org/iree-test-suites repository.

Types of Sharktank tests:

Small scale versions of models
Quality tests for full models
Benchmarks for full models

The quality and benchmark test config files are stored in tests/external/iree-test-suites/sharktank_models.

SHARK-TestSuite

The nod-ai/SHARK-TestSuite repository also contains tests using IREE, llvm/torch-mlir, and nod-ai/shark-ai.

Some test coverage may overlap between SHARK-TestSuite and iree-test-suites, though some tests are planned to be migrated into iree-org/iree-test-suites once they mature and have demonstrated general utility to the upstream developer community.

Test reports for nightly runs in SHARK-TestSuite are uploaded to nod-ai/e2eshark-reports.