HAL
-iree-hal-assign-legacy-target-devices
Assigns the HAL devices the module will target to the given list of targets.
Assigns target HAL devices to the module based on the given list.
Options
-target-registry : Target registry containing the list of available devices and backends.
-targetBackends : List of target backends to assign as device targets.
-iree-hal-assign-target-devices
Assigns the HAL devices the module will target to the given list of target specifications.
Assigns target HAL devices to the module based on the given list of target specifications.
Targets can be specified in several ways depending on whether there are multiple devices, named devices, or devices imported from external files. Human-friendly device aliases can be used as shorthand for IREE::HAL::TargetDevice implementations providing their own configuration. The aliases are identical to those used by #hal.device.alias<>.
If multiple targets are specified they will be available as multiple distinct devices. A single device may select from one or more targets, in which case the first one enumerated that matches at runtime will be selected. For example, a gpu device may select between CUDA, HIP, or Vulkan at runtime based on what kind of device the user has and what HAL implementations were compiled into the runtime.
Examples using the canonical flag:
// Two devices, one the local host device and the other a Vulkan device:
--iree-hal-target-device=local
--iree-hal-target-device=vulkan
// One device that selects Vulkan if available and otherwise uses the
// local host device:
--iree-hal-target-device=vulkan,local
// Two CUDA devices selected by runtime ordinal; at runtime two --device=
// flags are required to configure both devices:
--iree-hal-target-device=cuda[0]
--iree-hal-target-device=cuda[1]
// A fully-defined target specification:
--iree-hal-target-device=#hal.device.target<"cuda", {...}, [#hal.executable.target<...>]>
// Named device for defining a reference by #hal.device.promise<@some_name>:
--iree-hal-target-device=some_name=vulkan
Options
-targetDevices : List of target device specifications.
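The alias forms in the examples above can be modeled with a small Python sketch. It is only an illustration of the described semantics (one device per flag, comma-separated alternatives tried in order, an optional name= prefix, an optional [ordinal] suffix) and not the compiler's parser; the names DeviceSpec, parse_device_flag, and select_at_runtime are hypothetical, and fully-defined #hal.device.target<...> attributes are deliberately not handled.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

# Hypothetical model of one --iree-hal-target-device= flag value.
@dataclass
class DeviceSpec:
    name: Optional[str]                           # from "some_name=..."
    alternatives: List[Tuple[str, Optional[int]]]  # (alias, ordinal) in order

# Matches "vulkan" or "cuda[1]"; an assumption covering only the alias forms.
_ALTERNATIVE = re.compile(r"^([A-Za-z_][\w-]*)(?:\[(\d+)\])?$")

def parse_device_flag(value: str) -> DeviceSpec:
    name = None
    if "=" in value:
        name, value = value.split("=", 1)
    alternatives = []
    for part in value.split(","):
        match = _ALTERNATIVE.match(part.strip())
        if match is None:
            raise ValueError(f"unsupported device specification: {part!r}")
        ordinal = int(match.group(2)) if match.group(2) is not None else None
        alternatives.append((match.group(1), ordinal))
    return DeviceSpec(name, alternatives)

def select_at_runtime(spec: DeviceSpec, available: Set[str]) -> Optional[str]:
    # The first enumerated alternative that is available wins.
    for alias, _ordinal in spec.alternatives:
        if alias in available:
            return alias
    return None

# Two flags describe two devices; one flag with a comma describes one device
# with a fallback.
fallback = parse_device_flag("vulkan,local")
chosen = select_at_runtime(fallback, {"local"})
```

With only the local runtime driver available, the sketch falls through vulkan and picks local, mirroring the "first enumerated that matches" rule.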
-iree-hal-capture-executable-sources
Captures individual hal.executable.variant source listings and embeds them in the IR.
Captures a source listing of each hal.executable.variant and attaches the source to the variant embedded in the IR. Entry points are assigned locations in the IR relative to the captured source.
Options
-stage : Name used to indicate what stage of compilation is captured.
-iree-hal-configure-executables
Configures hal.executable ops via a nested translation pipeline.
Runs a nested pipeline on each executable to attach target-specific configuration information to variants.
Options
-target-registry : Target registry containing the list of available devices and backends.
-iree-hal-configure-target-executable-variants
Configures hal.executable.variant ops for the specified target backend.
Attaches target-specific configuration information to a variant controlling how code generation operates.
Options
-target-registry : Target registry containing the list of available devices and backends.
-target : Target backend name whose executable variants will be configured by this pass.
-iree-hal-conversion
Converts from stream and other intermediate dialects into the hal dialect.
Converts supported intermediate dialects (stream, util, and various upstream dialects like cf/scf) into the hal dialect. After conversion, host code scheduling work and allocations will act on !hal.device queues and !hal.buffer (and other) resources.
It's expected that executable interface materialization has been performed so that the information required to marshal buffers and operands to the device is available for conversion.
-iree-hal-dump-executable-benchmarks
Dumps standalone hal.executable benchmarks to the provided path.
Dumps one MLIR file per hal.executable containing the executable contents and the host code required to dispatch them with fake buffers and operands. These benchmarks can be run with the iree-benchmark-module tool to microbenchmark individual dispatches outside of the whole program context.
The pass can only be run after executable translation but before host code conversion as the original stream dialect ops are required to synthesize the benchmarks.
There are many caveats with this approach and it will fail to generate benchmarks in many cases such as dynamic shapes, dynamic operands, or stateful data dependencies. Users should always prefer to build dedicated benchmarks in their origin framework that can be guaranteed to match their expectations and use appropriate test data. For example some dispatches may produce NaNs or out-of-bounds accesses with the fake data generated by this pass and either crash or result in unrepresentative performance.
In other words: don't blindly expect this pass to do anything but act as a starting point for microbenchmarking. Verify the outputs, the benchmarking methodology for the particular dispatch, and prepare to do more work. Or just author proper benchmarks in the original framework!
Options
-path : File system path to write each executable benchmark MLIR file.
-iree-hal-dump-executable-sources
Dumps individual hal.executable source listings to the provided path.
Dumps a source listing of each hal.executable and updates the source locations in the IR to point at the produced files. This allows for easy inspection of each executable prior to translation and gives downstream tools that can display source information (Tracy, perf, etc) something more useful than the entire original source program.
Options
-path : File system path to write each executable source MLIR file.
-prefix : String to prefix the written file names with.
-iree-hal-elide-redundant-commands
Elides stateful command buffer ops that set redundant state.
Identifies sequences of stateful command buffer operations such as hal.command_buffer.push_descriptor_set that set redundant state, which arises from trivial conversion from the stateless stream dialect, and removes them to reduce binary size and runtime overhead.
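The idea of eliding redundant state can be illustrated with a toy Python model. This is a sketch under stated assumptions, not the pass's implementation: commands are modeled as hypothetical (kind, slot, value) tuples, and a "set" is treated as redundant only when it writes the value the slot already holds.

```python
# Sentinel distinguishing "never set" from a slot explicitly set to None.
_UNSET = object()

def elide_redundant_state(commands):
    """Drops ("set", slot, value) commands that re-set a slot to its current value."""
    current = {}  # slot -> last value written
    kept = []
    for command in commands:
        kind, slot, value = command
        if kind == "set":
            if current.get(slot, _UNSET) == value:
                continue  # state is unchanged, so the command is redundant
            current[slot] = value
        kept.append(command)
    return kept

# Hypothetical command stream as a naive stateless-to-stateful conversion
# might produce it: the binding is re-set before every dispatch.
commands = [
    ("set", "binding0", "bufA"),
    ("dispatch", None, "ex0"),
    ("set", "binding0", "bufA"),  # redundant: same value as before
    ("dispatch", None, "ex1"),
    ("set", "binding0", "bufB"),  # not redundant: value changes
    ("dispatch", None, "ex2"),
]
pruned = elide_redundant_state(commands)
```

Only the second write of bufA is dropped; the dispatches and the later write of bufB are preserved in order.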
-iree-hal-hoist-executable-objects
Hoists local executable object annotations to the parent hal.executable.variant.
Finds all hal.executable.objects attrs on all ops within an executable inner module and moves them to the parent hal.executable.variant op.
-iree-hal-initialize-devices
Initializes global device handles based on their specification.
Initializes each global !hal.device based on the specification attribute by building initializers that enumerate and select the appropriate device.
Options
-target-registry : Target registry containing the list of available devices and backends.
-iree-hal-inline-memoize-regions
Inlines hal.device.memoize regions into their parent region.
Inlines any hal.device.memoize ops into their parent region and removes the op. This prevents memoization and has the same behavior as having never formed the memoization regions.
-iree-hal-link-all-executables
Links hal.executable ops into one or more hal.executable ops.
Runs a nested pipeline to link multiple hal.executable ops together if the target backend the executables are used with requests it.
Options
-target-registry : Target registry containing the list of available devices and backends.
-iree-hal-link-target-executables
Links executables for the specified target backend.
Links together multiple hal.executable ops for the given target backend if desired. Linking allows for intra-module deduplication and amortization of the startup time, code size, and runtime overheads that come from managing hundreds or thousands of executables.
Options
-target-registry : Target registry containing the list of available devices and backends.
-target : Target backend name whose executables will be linked by this pass.
-iree-hal-materialize-dispatch-instrumentation
Materializes host and device dispatch instrumentation resources on stream IR.
Adds dispatch instrumentation for both host and device prior to materializing interfaces so that the higher-level stream dialect can be used to easily mutate the dispatch sites, executable exports, and resources used for instrumentation storage.
Options
-buffer-size : Power-of-two byte size of the instrumentation buffer.
-iree-hal-materialize-interfaces
Defines hal.executable variants for stream.executable ops.
Defines hal.executable ops and one hal.executable.variant for each required target. The interfaces required to marshal buffers and operands across the host-device boundary are declared on the executables and annotated on the dispatch sites so that subsequent conversion can consume them.
-iree-hal-materialize-resource-caches
Materializes cached globals for device resources.
Scans the program for resource lookups such as hal.executable.lookup and materializes globals initialized on startup. The original lookup ops are replaced with global loads of the cached resources.
-iree-hal-materialize-target-devices
Materializes global device handles based on a hal.device.targets spec.
Materializes global !hal.device ops for the devices specified by the hal.device.targets attribute on the module. An optional default device can be specified to assign to ops that do not have a device affinity specified.
Options
-defaultDevice : Which device is considered the default when no device affinity is specified.
-iree-hal-memoize-device-queries
Finds hal.device.query ops and creates variables initialized on startup.
Finds all hal.device.query-related ops that are hoistable and moves them into globals that are initialized on startup. This prevents repeated queries at runtime and allows for optimization as queries are CSEd across the entire program.
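The effect of hoisting queries into startup-initialized globals can be sketched in Python. This is a conceptual analogy only, assuming a hypothetical FakeDevice with a countable query method; it does not reflect how the pass rewrites IR.

```python
class FakeDevice:
    """Hypothetical stand-in for a device whose queries we want to count."""

    def __init__(self, properties):
        self._properties = dict(properties)
        self.query_count = 0

    def query(self, key):
        self.query_count += 1
        return self._properties.get(key)

def memoize_queries(device, keys):
    """Runs each query once (the "startup" step) and returns a cached lookup."""
    cached = {key: device.query(key) for key in keys}
    return cached.__getitem__

# "example.feature" is a made-up query key used purely for illustration.
device = FakeDevice({"example.feature": True})
lookup = memoize_queries(device, ["example.feature"])

# Every later use reads the cached global instead of querying the device.
results = [lookup("example.feature") for _ in range(3)]
```

The device is queried once during initialization regardless of how many times the value is read afterwards, which is the saving the pass description refers to.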
-iree-hal-outline-memoize-regions
Outlines hal.device.memoize regions and creates global resources.
Outlines any hal.device.memoize ops in the module by creating functions and per-device globals with initializers.
-iree-hal-preprocess-executables-with-pipeline
Preprocesses each executable with an MLIR pass pipeline.
Runs the given MLIR pass pipeline as parsed by the --pass-pipeline= flag on each hal.executable in the program. The passes must be linked into the compiler to be discovered.
Options
-pipeline : MLIR pass pipeline description to run on each executable.
-iree-hal-preprocess-executables-with-tool
Preprocesses each executable with an external command line tool.
Passes each hal.executable in the program to the given command line tool as stdin and parses the resulting MLIR from stdout to replace them. This is equivalent to iree-hal-preprocess-executables-with-pipeline but allows for an external mlir-opt/iree-opt-like tool to be used containing the pipelines instead of requiring the passes to be linked into the compiler.
Options
-command : stdin->stdout command to run on each hal.executable MLIR op.
-iree-hal-prune-executables
Prunes executable variants and exports that are not referenced.
Prunes executable variants and exports that are not referenced in the module. This is intended to be run late in the pipeline where no new dispatches will be inserted that may require the variants or exports that it removes.
-iree-hal-repeat-dispatches
Repeats each hal.command_buffer.dispatch op one or more times.
Finds all hal.command_buffer.dispatch ops and repeats them the specified number of times by cloning them and inserting a barrier. This is extremely unreliable and nearly always creates incorrect programs that have wildly incorrect end-to-end execution timings. It must only be used when trying to profile (via sampling or performance counters) specific dispatches in-situ with the additional caveat that cache behavior and dispatch overhead are invalid. Do not trust any numbers produced by this method of benchmarking without verifying via external tooling.
This should rarely be used. Prefer instead to build real benchmarks in origin frameworks that, for example, use independent data and ensure correct execution results (as if you're benchmarking known-incorrect results, are you really benchmarking something useful?). Any benchmarking of memory-bound operations using this approach will be questionable (such as matmuls, which we use this for today... heh ;).
Options
-count : Number of times to repeat each dispatch (including the original).
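A toy Python model can make the -count semantics concrete. It is a sketch only: ops are hypothetical tuples, and it assumes a barrier is placed after every copy of a dispatch, which the description above does not pin down.

```python
def repeat_dispatches(ops, count):
    """Repeats each ("dispatch", ...) op so it appears `count` times in total."""
    if count < 1:
        raise ValueError("count includes the original dispatch and must be >= 1")
    repeated = []
    for op in ops:
        if op[0] != "dispatch":
            repeated.append(op)  # non-dispatch ops are left untouched
            continue
        for _ in range(count):
            repeated.append(op)
            # Assumed placement: one barrier after each copy so that copies
            # execute back-to-back rather than overlapping.
            repeated.append(("barrier",))
    return repeated

ops = [("dispatch", "ex0"), ("copy", "bufA")]
expanded = repeat_dispatches(ops, count=2)
```

Because the count includes the original, count=2 yields two dispatches of ex0 in total rather than three.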
-iree-hal-resolve-device-aliases
Resolves #hal.device.alias attributes to their expanded configurations.
Resolves device aliases to the concrete targets using defaults, flags, and registered device configurations.
Options
-target-registry : Target registry containing the list of available devices and backends.
-iree-hal-resolve-device-promises
Resolves #hal.device.promise attributes to their devices.
Resolves promised device affinities to the materialized device globals that were promised. Verifies that all promises are resolved.
-iree-hal-resolve-export-ordinals
Resolves symbolic hal.executable.export references to ordinals.
Severs symbolic references to hal.executable.export ops from dispatch sites by replacing them with the ordinal assigned to the exports. This allows for subsequent passes to collapse the executables into opaque blobs.
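Replacing symbolic references with ordinals can be sketched as a lookup-table rewrite. The Python below is an illustration with hypothetical data shapes, and it assumes ordinals follow the order in which exports are listed in each executable; the actual assignment is whatever the compiler recorded on the export ops.

```python
def resolve_export_ordinals(executables, dispatches):
    """Rewrites (executable, export_name) dispatch targets to (executable, ordinal)."""
    ordinals = {}
    for executable, exports in executables.items():
        for ordinal, export_name in enumerate(exports):
            ordinals[(executable, export_name)] = ordinal
    # After this rewrite no dispatch refers to an export by name.
    return [(executable, ordinals[(executable, export_name)])
            for executable, export_name in dispatches]

# Hypothetical executables and their exports, in declaration order.
executables = {"ex0": ["entry_a", "entry_b"], "ex1": ["entry_c"]}
dispatches = [("ex0", "entry_b"), ("ex1", "entry_c"), ("ex0", "entry_a")]
resolved = resolve_export_ordinals(executables, dispatches)
```

Once dispatch sites carry only ordinals, the export names are no longer needed by the host code, which is what lets later passes treat executables as opaque blobs.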
-iree-hal-serialize-all-executables
Converts hal.executable.variants to one or more hal.executable.binary ops.
Runs a nested pipeline on each executable to serialize its variants from their low-level MLIR dialects (such as llvm, spirv, etc) to their target-specific object format (static/shared libraries, SPIR-V, etc).
Options
-target-registry : Target registry containing the list of available devices and backends.
-debug-level : Debug level for serialization (0 (no information) to 3 (all information)).
-dump-intermediates-path : Path to write translated executable intermediates (.bc, .o, etc) into for debugging.
-dump-binaries-path : Path to write translated and serialized executable binaries into for debugging.
-iree-hal-serialize-target-executables
Serializes executables for the specified target backend.
Serializes variants for the target backend from their low-level MLIR dialects (such as llvm, spirv, etc) to their target-specific object format (static/shared libraries, SPIR-V, etc).
Options
-target-registry : Target registry containing the list of available devices and backends.
-target : Target backend name whose executables will be serialized by this pass.
-debug-level : Debug level for serialization (0 (no information) to 3 (all information)).
-dump-intermediates-path : Path to write translated executable intermediates (.bc, .o, etc) into for debugging.
-dump-binaries-path : Path to write translated and serialized executable binaries into for debugging.
-iree-hal-strip-executable-contents
Strips executable module contents for reducing IR size during debugging.
A debugging pass for stripping translated executable contents (LLVM dialect, SPIR-V dialect, etc) to reduce IR size and noise from the device-only code.
-iree-hal-substitute-executables
Substitutes hal.executable ops with files on disk.
Substitutes hal.executable ops with externally referenced MLIR files or target-specific object files. When provided a .mlir/.mlirbc file with a top-level hal.executable the entire executable will be replaced, including all variants contained within. All other files such as .o, .bc, and .spv will be set as external object files on the original executable variants and the original contents will be dropped.
Substitutions can be specified either by providing a file system path containing files that match the executable names in one of the supported formats, or by directly specifying the file each executable name maps to.
Options
-substitutions : Substitution `executable_name=file.xxx` key-value pairs.
-search-path : Path to source executable substitutions from.
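The two substitution behaviors described above (whole-executable replacement for .mlir/.mlirbc, external objects for everything else) can be sketched as a small classifier over executable_name=file.xxx pairs. This Python is an illustrative model, not the pass's parser; the function name, the result labels, and the sample file names are hypothetical.

```python
from pathlib import PurePosixPath

# Extensions that replace the entire executable, per the description above.
WHOLE_EXECUTABLE_SUFFIXES = {".mlir", ".mlirbc"}

def plan_substitutions(pairs):
    """Maps each executable name to (action, path) from `name=path` pairs."""
    plan = {}
    for pair in pairs:
        name, separator, path = pair.partition("=")
        if not separator or not name or not path:
            raise ValueError(f"expected executable_name=file.xxx, got {pair!r}")
        suffix = PurePosixPath(path).suffix
        if suffix in WHOLE_EXECUTABLE_SUFFIXES:
            action = "replace_executable"  # all variants are replaced
        else:
            action = "external_object"     # attached to the existing variants
        plan[name] = (action, path)
    return plan

# Hypothetical executable names and files.
plan = plan_substitutions([
    "ex0=ex0_tuned.mlir",
    "ex1=objects/ex1.spv",
])
```

Here ex0 would be swapped out wholesale while ex1 keeps its variants and gains an external object file.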
-iree-hal-translate-all-executables
Translates hal.executable ops via a nested translation pipeline.
Runs a nested pipeline on each executable to translate its variants from their generic MLIR dialects (such as linalg) to their target-specific dialects (llvm, spirv, etc).
Options
-target-registry : Target registry containing the list of available devices and backends.
-iree-hal-translate-target-executable-variants
Translates hal.executable.variant ops for the specified target backend.
Translates an executable variant for a specific target from its generic MLIR dialects (such as linalg) to the target-specific dialects (llvm, spirv, etc).
Options
-target-registry : Target registry containing the list of available devices and backends.
-target : Target backend name whose executable variants will be translated by this pass.
-iree-hal-verify-devices
Verifies that all devices can be targeted with the available compiler plugins.
Verifies that #hal.device.target and #hal.executable.target attributes reference targets that are registered with the compiler.
Options
-target-registry : Target registry containing the list of available devices and backends.