'iree_cpu' Dialect

A dialect for common functionality used by CPU-focused IREE code generation.

This dialect provides operations and attributes to aid in code generation for CPU targets. The functionality in this dialect can be hardware specific, but is intended to be independent of the lowering target. Late lowerings to LLVM are handled separately.

Attributes

CPUEncodingResolverAttr

The encoding layout attribute for CPU backends.

Syntax:

#iree_cpu.cpu_encoding_resolver<
  DictionaryAttr   # configuration
>

This attribute can implement any layout interface method for encoding serialization and/or materialization, such as Encoding::LayoutMaterializerAttr or Codegen::PackedLayoutMaterializerAttr. The methods are implemented through the external model mechanism. See the implementation in compiler/Codegen/ExternalInterfaces/*.

Parameters:

| Parameter | C++ type | Description |
| --- | --- | --- |
| configuration | DictionaryAttr | Executable target configuration. It is expected to be used within a pass scope and not to appear in the final IR output. |

DataTiledMMAAttr

Syntax:

#iree_cpu.data_tiled_mma_layout<
  ::mlir::iree_compiler::IREE::CPU::MMAIntrinsic,   # intrinsic
  int64_t,   # intrinsics_m
  int64_t,   # intrinsics_n
  int64_t,   # intrinsics_k
  ::mlir::Type,   # lhs_type
  ::mlir::Type,   # rhs_type
  ::mlir::Type   # acc_type
>

CPU analogue of IREEGPU_DataTiledMMAAttr, for use with iree_codegen.inner_tiled. As in the GPU case, this wraps an intrinsic enum and intrinsics_{m,n,k} unrolling factors. Unlike the GPU case, there is no thread distribution, no concept of subgroups, and no interleaving of the intrinsics' layout.

Each non-square hardware MMA appears as two MMAIntrinsic enum values — one per orientation (e.g. MMA_X86_AVX512_1x16x1_F32_F32 and its M↔N-swapped sibling MMA_X86_AVX512_16x1x1_F32_F32). The cost model treats them as distinct candidates and picks whichever fits the matmul shape better.
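
The naming relationship between the two orientations can be illustrated with a short Python sketch. It operates purely on the enum symbol strings listed in this document. The helper name swap_m_n and the shape pattern are assumptions made for illustration; they are not part of IREE, and the sketch does not describe how the compiler or its cost model pairs the intrinsics internally.

```python
import re

# Matches the MxNxK shape embedded in an MMAIntrinsic symbol, e.g. "_1x16x1_"
# or the scalable form "_1x4VLx1_". The pattern is an assumption inferred from
# the symbol names listed in this document.
_SHAPE = re.compile(r"_(\d+(?:VL)?)x(\d+(?:VL)?)x(\d+)_")


def swap_m_n(symbol: str) -> str:
    """Return the symbol with its M and N sizes exchanged (hypothetical helper)."""
    match = _SHAPE.search(symbol)
    if match is None:
        raise ValueError(f"no MxNxK shape found in {symbol!r}")
    m, n, k = match.groups()
    # Rebuild the symbol around the swapped shape; K stays in place.
    return symbol[: match.start()] + f"_{n}x{m}x{k}_" + symbol[match.end():]


print(swap_m_n("MMA_X86_AVX512_1x16x1_F32_F32"))
print(swap_m_n("MMA_ARM_SVE_FMLA_1x4VLx1_F32_F32"))
```

Applying the helper twice returns the original symbol, which matches the idea that each non-square intrinsic has exactly one sibling.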

For most intrinsic values the (LHS, RHS, ACC) element types are baked into the enum and lhs_type / rhs_type / acc_type are unused. The one exception is MMA_GENERIC_SCALAR_1x1x1: it is a type-polymorphic fallback used when no element-type-specific intrinsic matches the target, and it carries its element types in those three optional parameters instead. This deliberately breaks the otherwise-strong invariant that an MMAIntrinsic enum value pins down a specific element type triple, in exchange for not having to add one enum value per supported (LHS, RHS, ACC) combination.

Some GPU-specific methods in IREECodegen_InnerTileDescAttrInterface are left here but are unused.

Parameters:

| Parameter | C++ type | Description |
| --- | --- | --- |
| intrinsic | ::mlir::iree_compiler::IREE::CPU::MMAIntrinsic | An enum of type MMAIntrinsic. |
| intrinsics_m | int64_t | Intrinsic count along the M dimension. |
| intrinsics_n | int64_t | Intrinsic count along the N dimension. |
| intrinsics_k | int64_t | Intrinsic count along the K dimension. |
| lhs_type | ::mlir::Type | LHS element type, used only by type-polymorphic intrinsics such as MMA_GENERIC_SCALAR_1x1x1. |
| rhs_type | ::mlir::Type | RHS element type, used only by type-polymorphic intrinsics such as MMA_GENERIC_SCALAR_1x1x1. |
| acc_type | ::mlir::Type | ACC element type, used only by type-polymorphic intrinsics such as MMA_GENERIC_SCALAR_1x1x1. |

InnerTiledSemanticsAttr

Syntax: #iree_cpu.mma_semantics

Attribute describing aspects of inner-tiled MMA semantics that are orthogonal to the data_tiled_mma_layout kind. On CPU, tiles are always undistributed (no thread distribution) and always expanded (opaque = false), so there is currently no parameter here. This makes it a unit attribute for now, but it could evolve to look more like IREEGPU_InnerTiledSemanticsAttr.

LoweringConfigAttr

Drives lowering of an operation for CPU compilation.

CPU-specific implementation of a lowering config. It carries just a dictionary attribute that stores any relevant fields. This is the simplest form of a lowering config, offering flexibility at the cost of structure.

Some key entries, such as distribution, must be of type IREE::Codegen::LoweringConfigTilingLevelAttr, which is a list of tile sizes with an optional scalable representation like that of vector types. For example:

#iree_cpu.lowering_config<
  distribution = [128, 128, 0],
  cache_parallel = [64, 64, 0],
  cache_reduction = [0, 0, 16],
  vector_common_parallel = [[4], [4], 0],
  vector_reduction = [0, 0, [4]],
  vector_inner_parallel = [0, 0, 0]
>

For more details, see the implementation in IREECPUAttrs.cpp.

Note that the behavior is undefined if more than one vector tiling level sets a value on the same dimension. The vector tiling levels are expected to be disjoint. This is not enforced in the verifier, in order to keep flexibility when something is wrong in a lowering config. For example, some transformations still work even if the levels are not disjoint.
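
Because the verifier does not check this property, it can be useful to check it out of band. The following Python sketch models a lowering config as a plain dictionary and reports the dimensions on which more than one vector level sets a tile size. Several things here are assumptions for illustration: the function names are hypothetical, the set of vector levels is limited to the three vector_* keys that appear in the example above, a size of 0 is treated as "not tiled", and a scalable size is written as a one-element list such as [4].

```python
from typing import Dict, List, Union

# A tile size is either a plain int or a one-element list such as [4],
# mirroring the scalable notation used in the example config.
TileSize = Union[int, List[int]]

# Assumption: only the vector_* keys shown in the example are considered.
VECTOR_LEVELS = (
    "vector_common_parallel",
    "vector_reduction",
    "vector_inner_parallel",
)


def _is_set(size: TileSize) -> bool:
    """A dimension counts as tiled when its size is non-zero (assumed convention)."""
    value = size[0] if isinstance(size, list) else size
    return value != 0


def overlapping_dims(config: Dict[str, List[TileSize]]) -> List[int]:
    """Return the dimensions on which more than one vector level sets a size."""
    levels = [config[name] for name in VECTOR_LEVELS if name in config]
    rank = max((len(level) for level in levels), default=0)
    overlaps = []
    for dim in range(rank):
        setters = sum(
            1 for level in levels if dim < len(level) and _is_set(level[dim])
        )
        if setters > 1:
            overlaps.append(dim)
    return overlaps


example = {
    "distribution": [128, 128, 0],
    "vector_common_parallel": [[4], [4], 0],
    "vector_reduction": [0, 0, [4]],
    "vector_inner_parallel": [0, 0, 0],
}
print(overlapping_dims(example))
```

An empty result means the vector levels are disjoint, as the example config is; a non-empty result lists the dimensions that fall into the undefined case described above.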

Parameters:

| Parameter | C++ type | Description |
| --- | --- | --- |
| config | DictionaryAttr | The configured fields, including tiling levels. |

MMAIntrinsicAttr

Descriptor for different MMA intrinsics

Syntax:

#iree_cpu.mma_intrinsic<
  ::mlir::iree_compiler::IREE::CPU::MMAIntrinsic   # value
>

Parameters:

| Parameter | C++ type | Description |
| --- | --- | --- |
| value | ::mlir::iree_compiler::IREE::CPU::MMAIntrinsic | An enum of type MMAIntrinsic. |

PipelineAttr

CPU lowering pipeline identifier.

Syntax:

#iree_cpu.pipeline<
  ::mlir::iree_compiler::IREE::CPU::LoweringPipeline   # value
>

Identifies a CPU lowering pipeline. Implements PipelineAttrInterface by delegating to a builder callback registered via registerCPUPipelineBuilder(). The builder must handle all LoweringPipeline enum values.
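
The delegation described here follows a common registration pattern, which the Python sketch below imitates. It is only an analogy: the actual API is C++, and its callback signature is not given in this document. The names register_cpu_pipeline_builder, build_pipeline, and example_builder, as well as the use of a list of strings in place of a pass manager, are hypothetical stand-ins. The enum cases are copied from the LoweringPipeline enum listed in this document.

```python
from enum import Enum
from typing import Callable, List, Optional


class LoweringPipeline(Enum):
    """Mirrors the LoweringPipeline cases listed in this document."""
    Default = 0
    DoubleTilingExpert = 1
    ConvTileAndDecomposeExpert = 2
    Mmt4dTilingExpert = 3
    BufferOpsTileAndVectorize = 4
    DataTiling = 5
    LinalgExtTileAndVectorize = 6


# Hypothetical builder signature: the pipeline kind plus a list of strings
# standing in for a pass manager. The real C++ signature may differ.
Builder = Callable[[LoweringPipeline, List[str]], None]
_builder: Optional[Builder] = None


def register_cpu_pipeline_builder(builder: Builder) -> None:
    """Python stand-in for registerCPUPipelineBuilder()."""
    global _builder
    _builder = builder


def build_pipeline(kind: LoweringPipeline, passes: List[str]) -> None:
    """Conceptual role of the attribute: delegate to the registered builder."""
    if _builder is None:
        raise RuntimeError("no CPU pipeline builder registered")
    _builder(kind, passes)


def example_builder(kind: LoweringPipeline, passes: List[str]) -> None:
    # One builder handles every enum value; the pass name is a placeholder.
    passes.append(f"placeholder-passes-for-{kind.name}")


register_cpu_pipeline_builder(example_builder)
```

Keeping the builder behind a registration call means the attribute only needs to know the enum value, while the code that knows how to assemble each pipeline lives elsewhere.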

Parameters:

| Parameter | C++ type | Description |
| --- | --- | --- |
| value | ::mlir::iree_compiler::IREE::CPU::LoweringPipeline | An enum of type LoweringPipeline. |

VMVXEncodingResolverAttr

The encoding layout attribute for VMVX backend.

Syntax:

#iree_cpu.vmvx_encoding_resolver<
  DictionaryAttr   # configuration
>

This attribute can implement any layout interface method for encoding serialization and/or materialization, such as Encoding::LayoutMaterializerAttr or Codegen::PackedLayoutMaterializerAttr. The methods are implemented through the external model mechanism. See the implementation in compiler/Codegen/ExternalInterfaces/*.

Parameters:

| Parameter | C++ type | Description |
| --- | --- | --- |
| configuration | DictionaryAttr | Executable target configuration. It is expected to be used within a pass scope and not to appear in the final IR output. |

Enums

LoweringPipeline

LLVMCPU lowering pipeline identifier

Cases:

| Symbol | Value | String |
| --- | --- | --- |
| Default | 0 | Default |
| DoubleTilingExpert | 1 | DoubleTilingExpert |
| ConvTileAndDecomposeExpert | 2 | ConvTileAndDecomposeExpert |
| Mmt4dTilingExpert | 3 | Mmt4dTilingExpert |
| BufferOpsTileAndVectorize | 4 | BufferOpsTileAndVectorize |
| DataTiling | 5 | DataTiling |
| LinalgExtTileAndVectorize | 6 | LinalgExtTileAndVectorize |

MMAIntrinsic

Descriptor for different MMA intrinsics

Cases:

| Symbol | Value | String |
| --- | --- | --- |
| None | 0 | None |
| MMA_X86_AVX2_FMA_1x8x1_F32_F32 | 4624 | MMA_X86_AVX2_FMA_1x8x1_F32_F32 |
| MMA_X86_AVX2_FMA_8x1x1_F32_F32 | 4625 | MMA_X86_AVX2_FMA_8x1x1_F32_F32 |
| MMA_X86_AVX512_1x8x1_F64_F64 | 4864 | MMA_X86_AVX512_1x8x1_F64_F64 |
| MMA_X86_AVX512_8x1x1_F64_F64 | 4865 | MMA_X86_AVX512_8x1x1_F64_F64 |
| MMA_X86_AVX512_1x16x1_F32_F32 | 4880 | MMA_X86_AVX512_1x16x1_F32_F32 |
| MMA_X86_AVX512_16x1x1_F32_F32 | 4881 | MMA_X86_AVX512_16x1x1_F32_F32 |
| MMA_X86_AVX512_1x16x1_F32_F16_CASTF32 | 4896 | MMA_X86_AVX512_1x16x1_F32_F16_CASTF32 |
| MMA_X86_AVX512_16x1x1_F32_F16_CASTF32 | 4897 | MMA_X86_AVX512_16x1x1_F32_F16_CASTF32 |
| MMA_X86_AVX512FP16_1x32x1_F16_F16 | 4898 | MMA_X86_AVX512FP16_1x32x1_F16_F16 |
| MMA_X86_AVX512FP16_32x1x1_F16_F16 | 4899 | MMA_X86_AVX512FP16_32x1x1_F16_F16 |
| MMA_X86_AVX512BF16_1x16x2_F32_BF16 | 4912 | MMA_X86_AVX512BF16_1x16x2_F32_BF16 |
| MMA_X86_AVX512BF16_16x1x2_F32_BF16 | 4913 | MMA_X86_AVX512BF16_16x1x2_F32_BF16 |
| MMA_X86_AVX512_1x16x2_I32_I16 | 5024 | MMA_X86_AVX512_1x16x2_I32_I16 |
| MMA_X86_AVX512_16x1x2_I32_I16 | 5025 | MMA_X86_AVX512_16x1x2_I32_I16 |
| MMA_X86_AVX512VNNI_1x16x2_I32_I16 | 5026 | MMA_X86_AVX512VNNI_1x16x2_I32_I16 |
| MMA_X86_AVX512VNNI_16x1x2_I32_I16 | 5027 | MMA_X86_AVX512VNNI_16x1x2_I32_I16 |
| MMA_X86_AVX512_1x16x2_I32_I8_CASTI16 | 5056 | MMA_X86_AVX512_1x16x2_I32_I8_CASTI16 |
| MMA_X86_AVX512_16x1x2_I32_I8_CASTI16 | 5057 | MMA_X86_AVX512_16x1x2_I32_I8_CASTI16 |
| MMA_X86_AVX512VNNI_1x16x2_I32_I8_CASTI16 | 5058 | MMA_X86_AVX512VNNI_1x16x2_I32_I8_CASTI16 |
| MMA_X86_AVX512VNNI_16x1x2_I32_I8_CASTI16 | 5059 | MMA_X86_AVX512VNNI_16x1x2_I32_I8_CASTI16 |
| MMA_ARM_SVE_FMLA_1x4VLx1_F32_F32 | 8720 | MMA_ARM_SVE_FMLA_1x4VLx1_F32_F32 |
| MMA_ARM_SVE_FMLA_4VLx1x1_F32_F32 | 8721 | MMA_ARM_SVE_FMLA_4VLx1x1_F32_F32 |
| MMA_GENERIC_SCALAR_1x1x1_REG8 | 61448 | MMA_GENERIC_SCALAR_1x1x1_REG8 |
| MMA_GENERIC_SCALAR_1x1x1_REG16 | 61456 | MMA_GENERIC_SCALAR_1x1x1_REG16 |
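
Reading the values in hexadecimal makes one regularity of the table visible: for every orientation pair listed, the value of the Nx1 symbol is the value of its 1xN counterpart plus one, so the two differ only in the lowest bit. The Python sketch below checks this on a few rows copied from the table. This is an observation about the listed values only; the document does not state that the bit layout is a guaranteed encoding, and the helper name differ_only_in_low_bit is hypothetical.

```python
# A few (symbol, value) sibling pairs copied from the MMAIntrinsic table.
SIBLINGS = [
    (("MMA_X86_AVX2_FMA_1x8x1_F32_F32", 4624),
     ("MMA_X86_AVX2_FMA_8x1x1_F32_F32", 4625)),
    (("MMA_X86_AVX512_1x16x1_F32_F32", 4880),
     ("MMA_X86_AVX512_16x1x1_F32_F32", 4881)),
    (("MMA_X86_AVX512VNNI_1x16x2_I32_I16", 5026),
     ("MMA_X86_AVX512VNNI_16x1x2_I32_I16", 5027)),
    (("MMA_ARM_SVE_FMLA_1x4VLx1_F32_F32", 8720),
     ("MMA_ARM_SVE_FMLA_4VLx1x1_F32_F32", 8721)),
]


def differ_only_in_low_bit(a: int, b: int) -> bool:
    """True when the two values are identical except for bit 0."""
    return (a ^ b) == 1


for (name_a, value_a), (name_b, value_b) in SIBLINGS:
    # Print both values in hex so the shared upper bits are easy to compare.
    print(f"{value_a:#06x}  {name_a}")
    print(f"{value_b:#06x}  {name_b}")
    print("  low-bit siblings:", differ_only_in_low_bit(value_a, value_b))
```

Code that relies on this pattern should treat it as incidental unless the enum definition in the IREE sources confirms it.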