Skip to content

'iree_codegen' Dialectlink

A dialect for common functionality used by IREE code generation.

This dialect is primarily meant to hold attributes that carry the state of the compilation when lowered to scalar code for an architecture. Typically, a backend starts by analyzing the entry point functions within the hal.executable.variant and deciding which compilation pipeline to chose. During this, even the values for parameters such as tile sizes, etc. are also decided. The rest of the compilation flow does not make any heuristic decisions, rather just looks at the values of the decision specified in attributes that belong to this dialect. This allows an external search to easily override the heuristics that are hard-coded within a backend.

Operationslink

iree_codegen.dispatch_config (Codegen::DispatchConfigOp)link

Holds the workgroup count computation for a dispatch.

Syntax:

operation ::= `iree_codegen.dispatch_config` $function_ref
              (`workgroup_size` `=` $workgroup_size^)?
              (`subgroup_size` `=` $subgroup_size^)?
              $body attr-dict

A module-level op that captures the workgroup count computation and dispatch metadata for a function. The region computes the workgroup count (x, y, z) from workload values passed as block arguments. The terminator must yield exactly 3 index values, matching the contract of hal.executable.export's workgroup count region.

The function_ref is a symbol reference to the corresponding func.func. The op itself is not a Symbol because it lives alongside the func.func of the same name in a builtin.module symbol table, and two ops cannot define the same symbol.

Example:

iree_codegen.dispatch_config @matmul
    workgroup_size = [64, 16, 1] subgroup_size = 64 {
  ^bb0(%w0: index, %w1: index, %w2: index, %w3: index):
    %0 = affine.apply affine_map<()[s0] -> (s0 ceildiv 256)>()[%w2]
    %c1 = arith.constant 1 : index
    iree_codegen.yield %0, %c1, %c1 : index, index, index
}

Traits: IsolatedFromAbove, SingleBlockImplicitTerminator<YieldOp>, SingleBlock

Attributes:link
AttributeMLIR TypeDescription
function_ref::mlir::FlatSymbolRefAttrflat symbol reference attribute
workgroup_size::mlir::DenseI64ArrayAttri64 dense array attribute
subgroup_size::mlir::IntegerAttr64-bit signless integer attribute

iree_codegen.extract_strided_metadata (Codegen::ExtractStridedMetadataOp)link

Extracts a buffer base with offset and strides.

Syntax:

operation ::= `iree_codegen.extract_strided_metadata` $source `:` type($source) `->` type(results) attr-dict

This op is implemented similarly to the upstream MemRef::ExtractStridedMetadataOp with the following differences.

  1. It does not fold away static offset/stride information. Hence unlike the upstream Op the link between the memref and consumers of the metadata is not broken when later passes change this information. A common example in IREE of this is buffer binding optimizations.

  2. Helper functions getConstifiedMixed{Offset|Strides|Sizes} are not implemented as the expectation is you should lower to the upstream op before using those functions if you need them.

Copy of MemRef::ExtractStridedMetadataOp description for reference below. Extracts a base buffer, offset and strides. This op allows additional layers of transformations and foldings to be added as lowering progresses from higher-level dialect to lower-level dialects such as the LLVM dialect.

The op requires a strided memref source operand. If the source operand is not a strided memref, then verification fails.

This operation is also useful for completeness to the existing memref.dim op. While accessing strides, offsets and the base pointer independently is not available, this is useful for composing with its natural complement op: memref.reinterpret_cast.

Intended Use Cases:

The main use case is to expose the logic for manipulate memref metadata at a higher level than the LLVM dialect. This makes lowering more progressive and brings the following benefits: - not all users of MLIR want to lower to LLVM and the information to e.g. lower to library calls---like libxsmm---or to SPIR-V was not available. - foldings and canonicalizations can happen at a higher level in MLIR: before this op existed, lowering to LLVM would create large amounts of LLVMIR. Even when LLVM does a good job at folding the low-level IR from a performance perspective, it is unnecessarily opaque and inefficient to send unkempt IR to LLVM.

Traits: AlwaysSpeculatableImplTrait, InferTypeOpAdaptor, SameVariadicResultSize

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), OpAsmOpInterface, ViewLikeOpInterface

Effects: MemoryEffects::Effect{}

Operands:link
Operand Description
source strided memref of any type values
Results:link
Result Description
base_buffer strided memref of any type values of rank 0
offset index
sizes variadic of index
strides variadic of index

iree_codegen.fusion_barrier (Codegen::FusionBarrierOp)link

Prevents fusion through a tensor value

Syntax:

operation ::= `iree_codegen.fusion_barrier` attr-dict $source `:` type($result)

Traits: AlwaysSpeculatableImplTrait, SameOperandsAndResultType

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:link
Operand Description
source ranked tensor of any type values
Results:link
Result Description
result ranked tensor of any type values

iree_codegen.index_hint (Codegen::IndexHintOp)link

Compiler hint providing semantic information about an index

Syntax:

operation ::= `iree_codegen.index_hint` $input `(` $hint `)` attr-dict `:` type($input)

Pure pass-through operation that annotates an index value with semantic information about how it varies across parallel workers (e.g., GPU lanes).

The hint attribute describes the index behavior. Common hints include: - #iree_gpu.lane_constant<N>: Index is uniform within groups of N lanes - #iree_gpu.lane_increment<N>: Index increments by 1 within groups of N lanes

This operation is always safe to remove (replace with input). It exists purely to guide optimization passes.

Example:

%row = iree_codegen.index_hint %idx(#iree_gpu.lane_constant<16>) : index
%col = iree_codegen.index_hint %idx(#iree_gpu.lane_increment<16>) : index

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:link
AttributeMLIR TypeDescription
hint::mlir::Attributeany attribute
Operands:link
Operand Description
input index
Results:link
Result Description
result index

iree_codegen.inner_tiled (Codegen::InnerTiledOp)link

Generic operation on tiled operands based on an intrinsic.

Syntax:

operation ::= `iree_codegen.inner_tiled` `ins` `(` $inputs `)` `outs` `(` $outputs `)` attr-dict
              `:` type($inputs) `into` type($outputs)

Each operand's shape is viewed as outer_dims x inner_tile: - The outer dims are those described by the operand's indexing map; they are iterated by an outer loop nest specified by iterator_types and indexing_maps, with the same conventions as vector.contract. - The remaining trailing dims form the operand's inner tile. At each outer iteration point, the output inner tiles are updated from the input inner tiles by the intrinsic named by kind.

Operands may be tensors, memrefs or vectors. Inner-tile dims are generally static, the only exception being scalable vectors dimensions which correspond to dynamic tensor dimensions.

Attributeslink

  • kind (InnerTileDescAttrInterface): names the intrinsic. It fixes the number of inputs/outputs and, together with semantics, the canonical inner-tile vector type of each operand.
  • semantics (InnerTiledSemanticsAttrInterface): controls how strictly operand inner tiles are checked against kind. Different semantics attributes from different dialects may adopt different conventions, but there are generally two main modes:
  • opaque = true: only the element type and total element count of each inner tile must match the canonical vector type. Shape and rank are otherwise free. Scalable vectors and dynamic tensor dims are not allowed in this mode.
  • Default / opaque = false: after dropping non-scalable unit dims, the operand inner-tile shape must match the (permuted) canonical shape one-to-one. Scalable vector dimensions are supported and must correspond to dynamic tensor dimensions. Every other tensor dimension must be static.
  • indexing_maps: one projected-permutation AffineMap per operand (inputs first, then outputs), all sharing the same number of dims (equal to iterator_types.size()) and no symbols.
  • iterator_types: one of parallel / reduction for each outer iterator dim.
  • permutations (optional): If present, there is one entry per operand, each a permutation vector p of length equal to that operand's inner rank, giving the relationship between the canonical inner-tile shape C (from kind/semantics) and the operand's actual inner-tile shape S: S[i] == C[p[i]].

Exampleslink

Subgroup-level (undistributed) MMA on tensors using the AMD MFMA intrinsic MFMA_F32_32x32x8_F16, whose canonical per-operand inner tiles are LHS = 32x8 (M, K), RHS = 8x32 (K, N) and ACC = 32x32 (M, N). In practice the RHS is typically stored with its inner dims swapped (the matmul is really an MMT at the tile level); this is expressed by the [1, 0] entry in permutations on the RHS operand, which makes its inner-tile shape canonical[1] x canonical[0] = N x K = 32 x 8.

%0 = iree_codegen.inner_tiled ins(%lhs, %rhs) outs(%acc) {
    indexing_maps = [
      affine_map<(m, n, k) -> (m, k)>,
      affine_map<(m, n, k) -> (n, k)>,
      affine_map<(m, n, k) -> (m, n)>
    ],
    iterator_types = [
      #linalg.iterator_type<parallel>,
      #linalg.iterator_type<parallel>,
      #linalg.iterator_type<reduction>
    ],
    kind         = #iree_gpu.mma_layout<MFMA_F32_32x32x8_F16>,
    semantics    = #iree_gpu.mma_semantics<distributed = false, opaque = false>,
    permutations = [array<i64: 0, 1>, array<i64: 1, 0>, array<i64: 0, 1>]
  } : tensor<?x?x32x8xf16>, tensor<?x?x32x8xf16>
      into tensor<?x?x32x32xf32>

After distribution across the 64 lanes of a wave, the per-operand inner tiles become the canonical per-lane shares of the intrinsic: 256 / 64 = 4 f16 elements per lane for LHS and RHS, 1024 / 64 = 16 f32 elements per lane for ACC.

%0 = iree_codegen.inner_tiled ins(%lhs, %rhs) outs(%acc) {
    indexing_maps = [
      affine_map<(m, n, k) -> (m, k)>,
      affine_map<(m, n, k) -> (n, k)>,
      affine_map<(m, n, k) -> (m, n)>
    ],
    iterator_types = [
      #linalg.iterator_type<parallel>,
      #linalg.iterator_type<parallel>,
      #linalg.iterator_type<reduction>
    ],
    kind      = #iree_gpu.mma_layout<MFMA_F32_32x32x8_F16>,
    semantics = #iree_gpu.mma_semantics<distributed = true, opaque = false>
  } : tensor<?x?x4xf16>, tensor<?x?x4xf16>
      into tensor<?x?x16xf32>

Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments, InferTypeOpAdaptor

Interfaces: ConditionallySpeculatable, DestinationStyleOpInterface, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), TilingInterface, ValueBoundsOpInterface, VectorUnrollOpInterface

Effects: MemoryEffects::Effect{}

Attributes:link
AttributeMLIR TypeDescription
indexing_maps::mlir::ArrayAttrindexing affine maps
iterator_types::mlir::ArrayAttrIterator type should be an enum.
kindIREE::Codegen::InnerTileDescAttrInterfacebuffer-like constant attribute values
semanticsIREE::Codegen::InnerTiledSemanticsAttrInterfaceAttribute implementing InnerTiledSemanticsAttrInterface
permutations::mlir::ArrayAttrpermutations
Operands:link
Operand Description
inputs variadic of ranked tensor or vector of any type values
outputs variadic of ranked tensor or vector of any type values
Results:link
Result Description
results variadic of ranked tensor or vector of any type values

iree_codegen.load_from_buffer (Codegen::LoadFromBufferOp)link

Loads a tensor from a memref.

Syntax:

operation ::= `iree_codegen.load_from_buffer` $buffer attr-dict `:` type($buffer) `->` type($tensor)

Loads a tensor from a memref with a compatible shape and the same element type.

Interfaces: MemoryEffectOpInterface, ReifyRankedShapedTypeOpInterface

Operands:link
Operand Description
buffer strided memref of any type values
Results:link
Result Description
tensor ranked tensor of any type values

iree_codegen.null_pointer (Codegen::NullPointerOp)link

Returns a null_pointer value.

Syntax:

operation ::= `iree_codegen.null_pointer` attr-dict

This is meant to be used only as arguments to microkernels.

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results:link
Result Description
result Pseudo null-pointer type. Lowers to a null pointer.

iree_codegen.query_tile_sizes (Codegen::QueryTileSizesOp)link

Yields tile sizes for the specified tensor type.

Syntax:

operation ::= `iree_codegen.query_tile_sizes` attr-dict $tensor_type `->` type($results)

For targets where tile sizes can't be resolved at compile time, this operation allows querying the sizes at runtime. Today this only applies to VMVX.

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:link
AttributeMLIR TypeDescription
tensor_type::mlir::TypeAttrTensor type attribute
Results:link
Result Description
results variadic of index

iree_codegen.smt.assert (Codegen::AssertOp)link

Named SMT assertion for constraint verification.

Syntax:

operation ::= `iree_codegen.smt.assert` $condition `,` $msg (`,` $printArgs^)?
              `:` type($condition) (`,` type($printArgs)^)? attr-dict

Asserts that a boolean SMT expression holds, with a human-readable format string describing what is being checked. The format string may contain {} placeholders that correspond positionally to the args operands. An evaluator or verifier can substitute concrete values into the placeholders for diagnostics.

Args are restricted to !smt.int because all constraint knobs and dimension values are integers.

Used inside iree_codegen.smt.constraints regions.

Examples:

// Static message (no args):
iree_codegen.smt.assert %cond, "dim_0 == 128" : !smt.bool

// Format string with args:
iree_codegen.smt.assert %cond, "wg_x ({}) < wg_y ({})", %x, %y : !smt.bool, !smt.int, !smt.int

Traits: HasParent<ConstraintsOp>

Attributes:link
AttributeMLIR TypeDescription
msg::mlir::StringAttrstring attribute
Operands:link
Operand Description
condition
printArgs variadic of

iree_codegen.smt.constraints (Codegen::ConstraintsOp)link

SMT constraints for a codegen configuration of root ops.

Syntax:

operation ::= `iree_codegen.smt.constraints` `target` `=` $target `,` `pipeline` `=` custom<PipelineAttr>($pipeline) `,`
              custom<KnobsDictionary>($knobs)
              `dims` `(` $problem_dims `)` attr-dict-with-keyword
              $body

Declares SMT constraints over problem dimensions and configuration knobs for a codegen pipeline, targeting a set of root ops.

target: A #iree_codegen.root_op<set = N> attribute identifying which root op set these constraints apply to. All ops marked with the same #iree_codegen.root_op<set = N> attribute share the same lowering config. This decouples constraints from SSA values, so ops with zero or multiple results are supported.

pipeline: The codegen pipeline to use. This is a fixed choice, not decided by the solver.

knobs: DictionaryAttr mirroring GPULoweringConfigAttr. Leaves that are #iree_codegen.smt.int_knob<"name"> attrs name tunable SMT constants (materialized by iree_codegen.smt.knob ops in the body). Integer/attr leaves are fixed.

problem_dims: index-typed problem dimensions; corresponding block arguments are !smt.int.

Does not have any execution semantics and is meant to be used by the tuner or verification passes, and erased before lowering.

Example:

// The matmul is marked: {root_op = #iree_codegen.root_op<set = 0>}
iree_codegen.smt.constraints target = <set = 0>, pipeline = #iree_gpu.pipeline<VectorDistribute>,
 knobs = {workgroup = [#iree_codegen.smt.int_knob<"wg_m">, #iree_codegen.smt.int_knob<"wg_n">]}
 dims(%M, %N, %K) {
^bb0(%m: !smt.int, %n: !smt.int, %k: !smt.int):
  %wg_m = iree_codegen.smt.knob "wg_m" : !smt.int
  %wg_n = iree_codegen.smt.knob "wg_n" : !smt.int
}

Traits: IsolatedFromAbove, NoTerminator, SingleBlock

Attributes:link
AttributeMLIR TypeDescription
target::mlir::iree_compiler::IREE::Codegen::RootOpAttrMarks an operation as a tuner root op in a given set
pipeline::mlir::Attributeidentifier for pass pipeline use to lower dispatch region or PipelineAttrInterface instance
knobs::mlir::DictionaryAttrdictionary of named attribute values
Operands:link
Operand Description
problem_dims variadic of index

iree_codegen.smt.knob (Codegen::KnobOp)link

Declare an SMT constant for a tunable configuration knob.

Syntax:

operation ::= `iree_codegen.smt.knob` $name attr-dict `:` type($result)

Materializes a named SMT constant (!smt.int) for use in constraint expressions. The name must match an #iree_codegen.smt.int_knob<"name"> or #iree_codegen.smt.one_of_knob<"name", [...]> leaf in the enclosing iree_codegen.smt.constraints op's knobs dictionary.

In SMT terminology this is a constant (0-ary function), not a variable. The tuner assigns concrete integer values to these constants.

Example:

%wg_m = iree_codegen.smt.knob "wg_m" : !smt.int

Traits: AlwaysSpeculatableImplTrait, HasParent<ConstraintsOp>

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:link
AttributeMLIR TypeDescription
name::mlir::StringAttrstring attribute
Results:link
Result Description
result

iree_codegen.smt.lookup (Codegen::LookupOp)link

Integer table lookup for SMT constraints.

Syntax:

operation ::= `iree_codegen.smt.lookup` $index $keys `->` $values attr-dict `:` type($result)

Maps an SMT integer index to an integer value via a sparse key-value table. Used to derive values from enumerated knobs (e.g., MMA shape dimensions from an mma_idx knob).

TODO: During constraint verification, the lookup will be evaluated directly. During SMT-LIB export, it will be lowered to a chain of smt.ite ops. Neither pass is implemented yet.

Example:

%mma_m = iree_codegen.smt.lookup %idx [3, 7, 12] -> [16, 32, 64] : !smt.int

Traits: AlwaysSpeculatableImplTrait, HasParent<ConstraintsOp>

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:link
AttributeMLIR TypeDescription
keys::mlir::DenseI64ArrayAttri64 dense array attribute
values::mlir::DenseI64ArrayAttri64 dense array attribute
Operands:link
Operand Description
index
Results:link
Result Description
result

iree_codegen.store_to_buffer (Codegen::StoreToBufferOp)link

Stores a tensor into a memref.

Syntax:

operation ::= `iree_codegen.store_to_buffer` $tensor `,` $buffer
              attr-dict `:` type($tensor) `into` type($buffer)

Stores a tensor into a memref with a compatible shape and the same element type.

Interfaces: MemoryEffectOpInterface

Operands:link
Operand Description
tensor ranked tensor of any type values
buffer strided memref of any type values

iree_codegen.swizzle_hint (Codegen::SwizzleHintOp)link

Hint to swizzle accesses according to an access pattern.

Syntax:

operation ::= `iree_codegen.swizzle_hint` $operand `[` $swizzle attr-dict `]` `:` type($result)

Optimization hint to swizzle all accesses to the memref or tensor that this takes a view of. This only affects reads/writes that immediately consume this operation and is best effort. If the desired swizzling is not apparently possible, this op will no-op. As a result, it should not be relied on for correctness.

Any subviews on this operation will cause the swizzle application to fail. The expectation is for all view like operations to fold into the accessing ops (loads/stores) before this op takes effect.

Note that this only rewrites direct users. If there are any aliased loads or stores of the data from/to the |src| memref of a hintOp, those accesses will not be swizzled. This allows reusing an allocation with different swizzled access patterns as long as there is no data dependency between memory with different layouts. For example:

%0 = alloc()
%1 = iree_codegen.swizzle_hint %0, #layout_0
%2 = iree_codegen.swizzle_hint %0, #layout_1
{
   vector.store %1
   vector.load %1
     ^
     |
    unrelated
     |
     v
   vector.store %2
   vector.load %2
}

If there is a data dependency between the accesses of %1 and %2, for example a value stored to %1 is loaded from %2, this is undefined behavior. Aliasing is otherwise perfectly legal.

Traits: AlwaysSpeculatableImplTrait, SameOperandsAndResultType

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:link
AttributeMLIR TypeDescription
swizzleIREE::Codegen::SwizzleAttrInterfaceswizzling descriptor attributes
Operands:link
Operand Description
operand ranked tensor or memref of any type
Results:link
Result Description
result ranked tensor or memref of any type

iree_codegen.workgroup_count_hint (Codegen::WorkgroupCountHintOp)link

Hints at the workgroup count to set

Captures a set of values to use as the workgroup count. The backward slice starting from this op's operands is cloned into the workgroup count region of all transitive callers.

The sizes are specified in logical order (innermost to outermost), matching the workgroup count region's (x, y, z) convention. If fewer than 3 sizes are provided, the remaining dimensions default to 1.

If multiple hints inform the same entry point, the elementwise maximum across all hints is used as the count along each dimension. For example:

hal.executable.export @entry_point
module {
  func.func @entry_point() {
    iree_codegen.workgroup_count_hint sizes(%a, %b, %c)
    iree_codegen.workgroup_count_hint sizes(%x, %y, %z)
  }
}

resolves to:

hal.executable.export @entry_point {
  %wx = arith.maxsi %a, %x
  %wy = arith.maxsi %b, %y
  %wz = arith.maxsi %c, %z
  hal.return %wx, %wy, %wz
}

Common Usage: Linearized Workgroup Countslink

The most common use case for this operation involves computing a linearized (1-D) workgroup count and specifying it as a single size.

%num_workgroups = arith.ceildivui %total_iterations, %tile_size : index
iree_codegen.workgroup_count_hint sizes(%num_workgroups)

This results in (%num_workgroups, 1, 1) as the final workgroup count.

Attributes:link
AttributeMLIR TypeDescription
static_sizes::mlir::DenseI64ArrayAttri64 dense array attribute
Operands:link
Operand Description
sizes variadic of index

iree_codegen.yield (Codegen::YieldOp)link

IREECodegen yield op.

Syntax:

operation ::= `iree_codegen.yield` attr-dict ($operands^ `:` type($operands))?

iree_codegen.yield is a special terminator operation for blocks inside regions in iree_codegen ops.

Traits: AlwaysSpeculatableImplTrait, ReturnLike, Terminator

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), RegionBranchTerminatorOpInterface

Effects: MemoryEffects::Effect{}

Operands:link
Operand Description
operands variadic of any type

Attributeslink

DenormalFpMathAttrlink

Denormal mode for fp math

Syntax:

#iree_codegen.denormal_fp_math<
  ::mlir::iree_compiler::IREE::Codegen::DenormalFpMath   # value
>

Enum cases: * none (None) * preserve-sign (PreserveSign) * positive-zero (PositiveZero)

Parameters:link
Parameter C++ type Description
value ::mlir::iree_compiler::IREE::Codegen::DenormalFpMath an enum of type DenormalFpMath

DispatchLoweringPassPipelineAttrlink

Identifier for pass pipeline use to lower dispatch region

Syntax:

#iree_codegen.<
  ::mlir::iree_compiler::IREE::Codegen::DispatchLoweringPassPipeline   # value
>
Parameters:link
Parameter C++ type Description
value ::mlir::iree_compiler::IREE::Codegen::DispatchLoweringPassPipeline an enum of type DispatchLoweringPassPipeline

CompilationInfoAttrlink

Drive lowering of an operation from input dialect.

Syntax:

#iree_codegen.compilation_info<
  LoweringConfigAttrInterface,   # loweringConfig
  TranslationInfoAttr   # translationInfo
>

Specifies the information that allows controlling the compilation of operations like linalg.matmul/linalg.*conv within IREE. This information is used to override the defaults used by the IREE compiler. If set on the input to the compiler, there is no guarantee that the config survives until codegen. Named operations like linalg.matmul/linalg.*conv* are more likely to retain their lowering configurations.

TODO: It is expected that the TranslationInfoAttr and the LoweringConfigAttr are specified. Currently there is no verification that the values of the LoweringConfigAttr fully specifies the behaviour of the compilation path chosen with TranslationInfoAttr. This could be added in the future.

Parameters:link
Parameter C++ type Description
loweringConfig LoweringConfigAttrInterface
translationInfo TranslationInfoAttr

ExportConfigAttrlink

User defined workgroup size specification.

Syntax:

#iree_codegen.export_config<
  ::llvm::ArrayRef<int64_t>   # workgroup_size
>

Allows setting workgroup size for pre-formed dispatches.

Parameters:link
Parameter C++ type Description
workgroup_size ::llvm::ArrayRef<int64_t> Workgroup Size to use

IntKnobAttrlink

Integer-valued tunable knob placeholder.

Syntax:

#iree_codegen.smt.int_knob<
  StringAttr   # name
>

Represents a named placeholder for an integer tunable parameter in a constraints knobs dictionary. During constraint generation, these appear in tiling arrays (workgroup, reduction, thread), workgroup_size, and subgroup_size positions. The name matches the corresponding iree_codegen.smt.knob op name.

Parameters:link
Parameter C++ type Description
name StringAttr

LoweringConfigAttrlink

Drive lowering of an operation within dispatch region.

Syntax:

#iree_codegen.lowering_config<
  LoweringConfigTilingLevelsAttr   # tilingLevels
>

Default implementation of a lowering configuration attribute. It includes only tiling and optionally vectorization information. The interpretation of the tiles sizes are backend dependent.

TODO: Currently there is no verification that the configuration specifies everything needed for a pass-pipeline. The values to set for these parameters is dependent on the pass-pipeline implementation. In future, each pass pipeline could verify that the lowering configuration has all the necessary attributes for the pipeline.

Parameters:link
Parameter C++ type Description
tilingLevels LoweringConfigTilingLevelsAttr The lowering config at different levels

LoweringConfigTilingLevelAttrlink

Parameters:link
Parameter C++ type Description
sizes ::llvm::ArrayRef<int64_t> The tile sizes to use for this level of tiling
interchange ::llvm::ArrayRef<int64_t> The tile interchange to use for this level of tiling
scalableFlags ::llvm::ArrayRef<bool> The scalable tile flags for this level of tiling

LoweringConfigTilingLevelsAttrlink

Syntax:

#iree_codegen.lowering_config_levels<
  ::llvm::ArrayRef<LoweringConfigTilingLevelAttr>   # value
>
Parameters:link
Parameter C++ type Description
value ::llvm::ArrayRef<LoweringConfigTilingLevelAttr>

OneOfKnobAttrlink

Select-from-enumerated-set knob placeholder.

Syntax:

#iree_codegen.smt.one_of_knob<
  StringAttr,   # name
  ArrayAttr   # options
>

Represents a named placeholder for a discrete choice from a fixed set of attributes. The SMT variable is an integer index into the options array. During extraction, the actual attribute is matched against the options to find its index. During materialization, the index selects directly from the array -- no reverse lookup needed.

Parameters:link
Parameter C++ type Description
name StringAttr
options ArrayAttr

PassPipelineAttrlink

An attribute carrying a textual pass pipeline string.

Syntax:

#iree_codegen.pass_pipeline<
  ::llvm::StringRef   # pipeline
>

Specifies a pass pipeline using MLIR's textual pass pipeline syntax. The pipeline string is parsed and populated into an OpPassManager when buildPipeline is called.

Parameters:link
Parameter C++ type Description
pipeline ::llvm::StringRef The textual pass pipeline specification

RootOpAttrlink

Marks an operation as a tuner root op in a given set

Syntax:

#iree_codegen.root_op<
  int64_t   # set
>

Attribute identifying a root operation. The set parameter groups root ops into numbered sets: all root ops in the given set share the same lowering_config. Codegen does not rely on the root_op attribute; it is only used for constraint generation when tuning.

Parameters:link
Parameter C++ type Description
set int64_t

RotateRowsAttrlink

An attribute that describes a swizzling pattern for rotating rows.

Syntax:

#iree_codegen.rotate_rows<
  int64_t,   # row_width
  int64_t   # access_width
>

This attribute rotates accesses of |access_width| within rows of size |row_width|. For any given access into logical memref of shape memref<...xNx|access_width|x!eltype> where N = row_width / access_width at position (i, j, 0) is rotated to (i, (i + j) % N, 0). For example,

row_width = 16, access_width = 4

0000 1111 2222 3333 /// 0 1 2 3
4444 5555 6666 7777 /// 0 1 2 3
8888 9999 AAAA BBBB /// 0 1 2 3
CCCC DDDD EEEE FFFF /// 0 1 2 3

is swizzled to

0000 1111 2222 3333 /// 0 1 2 3
7777 4444 5555 6666 /// 3 0 1 2
AAAA BBBB 8888 9999 /// 2 3 0 1
DDDD EEEE FFFF CCCC /// 1 2 3 0

The pattern repeats for subsequent rows.

Parameters:link
Parameter C++ type Description
row_width int64_t
access_width int64_t

SimpleTargetAttrlink

Default implementation of TargetInfoAttrInterface

Syntax:

#iree_codegen.simple_target<
  ::llvm::ArrayRef<int64_t>   # max_workgroup_count
>

Implements the basic information needed to satisfy the TargetInfoAttrInterface

Parameters:link
Parameter C++ type Description
max_workgroup_count ::llvm::ArrayRef<int64_t> Maximum allowed workgroup count in [x, y, z]

SymbolicUKernelProviderAttrlink

An attribute that provides ukernel implementations by looking up the nearest symbol table from the annotation site.

Syntax: #iree_codegen.symbolic_ukernel_provider

TranslationInfoAttrlink

Drive dispatch entry point lowering.

Syntax:

#iree_codegen.translation_info<
  Attribute,   # passPipeline
  SymbolRefAttr,   # codegenSpec
  ::llvm::ArrayRef<int64_t>,   # workgroupSize
  int64_t,   # subgroupSize
  DictionaryAttr   # configuration
>

Specifies the information that is used to drive the translation of an entry point function using Linalg based structured-op lowering. During executable translation this is attached to the hal.executable.export operation.

If this operation is already set on the root operation (as part of iree_codegen.compilation_info) that drives the compilation of a dispatch region (like linalg.matmul/linalg.*conv*), this attribute gets propagated to the entry point function.

The passPipeline field can be either: - A DispatchLoweringPassPipelineAttr (enum keyword like CPUDefault). - Any attribute implementing PipelineAttrInterface (e.g., #iree_codegen.pass_pipeline<"...">).

Parameters:link
Parameter C++ type Description
passPipeline Attribute Pass pipeline specification.
codegenSpec SymbolRefAttr The symbol pointing to the transform dialect codegen spec to be used
workgroupSize ::llvm::ArrayRef<int64_t> The workgroup size to use
subgroupSize int64_t The subgroup size to use
configuration DictionaryAttr Pipeline specific configuration

UKernelDescriptorAttrlink

An attribute that specifies the ukernel implementation based on the |ukernel_name| and the integration point based on the |kind|. Example integration points could for example be at the tensor, memref or bitcode level.

Syntax:

#iree_codegen.ukernel_descriptor<
  ::llvm::StringRef,   # ukernel_name
  ::mlir::iree_compiler::IREE::Codegen::UKernelArgumentKind   # kind
>
Parameters:link
Parameter C++ type Description
ukernel_name ::llvm::StringRef
kind ::mlir::iree_compiler::IREE::Codegen::UKernelArgumentKind an enum of type UKernelArgumentKind

XORShuffleAttrlink

An attribute that describes an XOR-based swizzling pattern.

Syntax:

#iree_codegen.xor_shuffle<
  int64_t,   # row_width
  int64_t,   # access_width
  int64_t,   # row_stride
  int64_t   # per_phase
>

Shuffles accesses of |access_width| within rows of size |row_width|. For any given access into logical memref of shape memref<...xNx|access_width|x!eltype> where N = row_width / access_width at position (i, j, 0) is shuffled to (i, ((i/per_phase) %N) XOR j , 0). For example,

row_width = 16, access_width = 4, per_phase = 1

0000 1111 2222 3333 /// 0 1 2 3
4444 5555 6666 7777 /// 0 1 2 3
8888 9999 AAAA BBBB /// 0 1 2 3
CCCC DDDD EEEE FFFF /// 0 1 2 3

is swizzled to

0000 1111 2222 3333 /// 0 1 2 3
5555 4444 7777 6666 /// 1 0 3 2
AAAA BBBB 8888 9999 /// 2 3 0 1
FFFF EEEE DDDD CCCC /// 3 2 1 0
|per_phase| allows to keep the same shuffling across multiple rows. For example,

row_width = 16, access_width = 4, per_phase = 2

0000 1111 2222 3333 /// 0 1 2 3
4444 5555 6666 7777 /// 0 1 2 3
8888 9999 AAAA BBBB /// 0 1 2 3
CCCC DDDD EEEE FFFF /// 0 1 2 3

is swizzled to

0000 1111 2222 3333 /// 0 1 2 3
4444 5555 6666 7777 /// 0 1 2 3
9999 8888 BBBB AAAA /// 1 0 3 2
DDDD CCCC FFFF EEEE /// 1 0 3 2

The pattern repeats for subsequent rows.

Parameters:link
Parameter C++ type Description
row_width int64_t
access_width int64_t
row_stride int64_t row stride. Default to row_width
per_phase int64_t Default to 1

LocalMappingAttrlink

Syntax:

#iree_codegen.local_mapping<
  int64_t   # dim
>

Attribute for mapping scf.forall loops to sequential local execution.

Example:

  scf.forall (%i, %j) in (4, 8) {
    // body ... %i, %j
  } {mapping = [#iree_codegen.local_mapping<1>,
                #iree_codegen.local_mapping<0>]}

This forall approximately corresponds to the following loop nest:

  scf.for %i = 0 to 4 {
    scf.for %j = 0 to 8 {
    // body ... %i, %j
    }
  }
Parameters:link
Parameter C++ type Description
dim int64_t

WorkgroupMappingAttrlink

Syntax:

#iree_codegen.workgroup_mapping<
  ::mlir::iree_compiler::IREE::Codegen::WorkgroupId,   # id
  int64_t   # delinearizedDim
>

Attribute that eventually will be used to map distributed loop iterations to hal.workgroup.ids.

The x,y and z values for id map to hal.workgroup.id[0], hal.workgroup.id[1] and hal.workgroup.id[2] respectively.

In addition it is possible to specify if the z dimension is to be delinearized on mapping. For example if the list of mapping attributes is [workgroup_mapping<z:1>, workgroup_mapping<z:0>], then the z dimension is delinearized to map to workgroup_mapping<z:1> and workgroup_mapping<z:0>. In other words if the number of logical parallel workers along the z:0 dimension is W, then

workgroup_mapping<z:0> = hal.workgroup.id[1] mod W,
worgrkoup_mapping<z:1> = hal.workgroup.id[1] div W

Note: It is expected that this attribute is always used in a list of mapping attributes (with a single element being a list of size 1). It is illegal for a list to have workgroup_mapping<z:a> without workgroup_mapping<z:b> if a > b. In the same way it is illegal to for the list to - have workgroup_mapping<y> but not workgroup_mapping<x> - have workgroup_mapping<z:*> but not have workgroup_mapping<x> and workgroup_mapping<y>

Parameters:link
Parameter C++ type Description
id ::mlir::iree_compiler::IREE::Codegen::WorkgroupId an enum of type WorkgroupId
delinearizedDim int64_t

WorkgroupScopeAttrlink

Attribute representing parallel execution across workgroups.

Parallel scope that maps to workgroup execution.

If |linearize| is set to true, workgroup ids are linearized into a single id which is then typically delinearized based on the problem space.

Parameters:link
Parameter C++ type Description
linearize bool

Typeslink

NullPointerTypelink

Pseudo null-pointer type. Lowers to a null pointer.

Syntax: !iree_codegen.null_pointer

This is meant to be used only as arguments to microkernels.

Enumslink

BinaryFnlink

Allowed 32-bit signless integer cases: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

Cases:link

Symbol Value String
add 0 add
sub 1 sub
mul 2 mul
div 3 div
div_unsigned 4 div_unsigned
max_signed 5 max_signed
min_signed 6 min_signed
max_unsigned 7 max_unsigned
min_unsigned 8 min_unsigned
powf 9 powf

DenormalFpMathlink

Denormal mode for fp math

Cases:link

Symbol Value String
None 0 none
PreserveSign 1 preserve-sign
PositiveZero 2 positive-zero

DispatchLoweringPassPipelinelink

Identifier for pass pipeline use to lower dispatch region

Cases:link

Symbol Value String
VMVXDefault 300 VMVXDefault
TransformDialectCodegen 1000 TransformDialectCodegen
Custom 1001 Custom
None 65535 None

ElementwiseArityGrouplink

Allowed 32-bit signless integer cases: 1, 2, 3

Cases:link

Symbol Value String
Unary 1 Unary
Binary 2 Binary
Ternary 3 Ternary

ElementwiseCaseLimitslink

Allowed 32-bit signless integer cases:

Cases:link

Symbol Value String
LastUnary 13 LastUnary
LastBinary 23 LastBinary
LastTernary 24 LastTernary

ElementwiseKindlink

Allowed 32-bit signless integer cases: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23

Cases:link

Symbol Value String
exp 0 exp
log 1 log
abs 2 abs
ceil 3 ceil
floor 4 floor
negf 5 negf
reciprocal 6 reciprocal
round 7 round
sqrt 8 sqrt
rsqrt 9 rsqrt
square 10 square
tanh 11 tanh
erf 12 erf
add 13 add
sub 14 sub
mul 15 mul
div 16 div
div_unsigned 17 div_unsigned
max_signed 18 max_signed
min_signed 19 min_signed
max_unsigned 20 max_unsigned
min_unsigned 21 min_unsigned
powf 22 powf
select 23 select

IteratorTypelink

Iterator type

Cases:link

Symbol Value String
parallel 0 parallel
reduction 1 reduction

TernaryFnlink

Allowed 32-bit signless integer cases: 0

Cases:link

Symbol Value String
select 0 select

TypeFnlink

Allowed 32-bit signless integer cases: 0, 1

Cases:link

Symbol Value String
cast_signed 0 cast_signed
cast_unsigned 1 cast_unsigned

UKernelArgumentKindlink

Attribute describing the ukernel integration point

Cases:link

Symbol Value String
Tensor 0 tensor
Memref 1 memref
Bitcode 2 bitcode

UnaryFnlink

Allowed 32-bit signless integer cases: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12

Cases:link

Symbol Value String
exp 0 exp
log 1 log
abs 2 abs
ceil 3 ceil
floor 4 floor
negf 5 negf
reciprocal 6 reciprocal
round 7 round
sqrt 8 sqrt
rsqrt 9 rsqrt
square 10 square
tanh 11 tanh
erf 12 erf

WinogradConv2DFmrlink

Allowed 32-bit signless integer cases: 0, 1, 2

Cases:link

Symbol Value String
F_2_3 0 F_2_3
F_4_3 1 F_4_3
F_2_5 2 F_2_5

WorkgroupIdlink

Attribute that map to hal.workgrpoup.ids

Cases:link

Symbol Value String
IdX 0 x
IdY 1 y
IdZ 2 z