'iree_cpu' Dialect
A dialect for common functionality used by CPU-focused IREE code generation.
This dialect provides operations and attributes to aid in code generation for CPU targets. The functionality in this dialect can be hardware specific, but is intended to be independent of the lowering target. Late lowerings to LLVM are handled separately.
Attributes
CPUEncodingResolverAttr
The encoding layout attribute for CPU backends.
Syntax:
#iree_cpu.cpu_encoding_resolver<
DictionaryAttr # configuration
>
This attribute can implement any layout interface methods for encoding serialization and/or materialization, e.g., Encoding::LayoutMaterializerAttr, Codegen::PackedLayoutMaterializerAttr, etc. They are implemented through the external model mechanism. See the implementation in compiler/Codegen/ExternalInterfaces/*.
Parameters:
| Parameter | C++ type | Description |
|---|---|---|
| configuration | DictionaryAttr | Executable target configuration. It is expected to be used in a pass scope, but not the final IR output. |
DataTiledMMAAttr
Syntax:
#iree_cpu.data_tiled_mma_layout<
::mlir::iree_compiler::IREE::CPU::MMAIntrinsic, # intrinsic
int64_t, # intrinsics_m
int64_t, # intrinsics_n
int64_t, # intrinsics_k
::mlir::Type, # lhs_type
::mlir::Type, # rhs_type
::mlir::Type # acc_type
>
CPU analogue of IREEGPU_DataTiledMMAAttr, for use with iree_codegen.inner_tiled. Like the GPU case, this wraps an intrinsic enum and the intrinsics_{m,n,k} unrolling factors. Unlike the GPU case, there is no thread distribution, no concept of subgroups, and no interleaving of the intrinsics' layouts.
Each non-square hardware MMA appears as two MMAIntrinsic enum values —
one per orientation (e.g. MMA_X86_AVX512_1x16x1_F32_F32 and its
M↔N-swapped sibling MMA_X86_AVX512_16x1x1_F32_F32). The cost model
treats them as distinct candidates and picks whichever fits the matmul
shape better.
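To make the pairing concrete, the sketch below derives the M↔N-swapped sibling of an enum name by swapping the first two fields of its MxNxK shape token. The helper `swapped_sibling` is hypothetical and not part of IREE; it relies only on the naming pattern visible in the MMAIntrinsic enum names and assumes each field is a number optionally followed by `VL`.

```python
import re

# Hypothetical helper, not an IREE API: it only manipulates enum *names*.
# Assumption: the shape token is _<M>x<N>x<K>_, where each field is a
# decimal number optionally followed by "VL" (e.g. "1", "16", "4VL").
_FIELD = r"(\d+(?:VL)?)"
_SHAPE = re.compile(rf"_{_FIELD}x{_FIELD}x{_FIELD}_")


def swapped_sibling(name: str) -> str:
    """Return the enum name with the M and N fields of its shape swapped."""
    match = _SHAPE.search(name)
    if match is None:
        raise ValueError(f"no MxNxK shape token in {name!r}")
    m, n, k = match.groups()
    return name[:match.start()] + f"_{n}x{m}x{k}_" + name[match.end():]


print(swapped_sibling("MMA_X86_AVX512_1x16x1_F32_F32"))
# prints MMA_X86_AVX512_16x1x1_F32_F32
```

Applying the helper twice returns the original name, which reflects that the two enum values describe the same hardware operation in opposite orientations.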
For most intrinsic values the (LHS, RHS, ACC) element types are baked
into the enum and lhs_type / rhs_type / acc_type are unused. The
one exception is MMA_GENERIC_SCALAR_1x1x1: it is a type-polymorphic
fallback used when no element-type-specific intrinsic matches the
target, and it carries its element types in those three optional
parameters instead. This deliberately breaks the otherwise-strong
invariant that an MMAIntrinsic enum value pins down a specific element
type triple, in exchange for not having to add one enum value per
supported (LHS, RHS, ACC) combination.
Some GPU-specific methods in IREECodegen_InnerTileDescAttrInterface are left here but are unused.
Parameters:
| Parameter | C++ type | Description |
|---|---|---|
| intrinsic | ::mlir::iree_compiler::IREE::CPU::MMAIntrinsic | an enum of type MMAIntrinsic |
| intrinsics_m | int64_t | Intrinsic count along the M dimension. |
| intrinsics_n | int64_t | Intrinsic count along the N dimension. |
| intrinsics_k | int64_t | Intrinsic count along the K dimension. |
| lhs_type | ::mlir::Type | LHS element type, used only by type-polymorphic intrinsics such as MMA_GENERIC_SCALAR_1x1x1. |
| rhs_type | ::mlir::Type | RHS element type, used only by type-polymorphic intrinsics such as MMA_GENERIC_SCALAR_1x1x1. |
| acc_type | ::mlir::Type | ACC element type, used only by type-polymorphic intrinsics such as MMA_GENERIC_SCALAR_1x1x1. |
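As a rough illustration of how the intrinsic counts relate to the intrinsic shape, the sketch below multiplies each intrinsic extent by its count. It assumes that the unrolled tile extent along a dimension is the product of the two. That is an interpretation of the parameter descriptions, not something this page states. `unrolled_tile_shape` is a hypothetical helper, and scalable names such as those containing `4VL` are deliberately not handled.

```python
import re


def unrolled_tile_shape(name, intrinsics_m, intrinsics_n, intrinsics_k):
    """Return the assumed (M, N, K) tile extents for a fixed-size intrinsic.

    Hypothetical helper, not an IREE API. Assumption: each extent is the
    intrinsic's own extent (read from the enum name) times the intrinsic
    count along that dimension.
    """
    match = re.search(r"_(\d+)x(\d+)x(\d+)_", name)
    if match is None:
        raise ValueError(f"no fixed-size MxNxK shape token in {name!r}")
    m, n, k = (int(field) for field in match.groups())
    return (m * intrinsics_m, n * intrinsics_n, k * intrinsics_k)


print(unrolled_tile_shape("MMA_X86_AVX512_1x16x1_F32_F32", 8, 2, 1))
# prints (8, 32, 1)
```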
InnerTiledSemanticsAttr
Syntax: #iree_cpu.mma_semantics
Attribute describing aspects of inner-tiled MMA semantics that are orthogonal to the data_tiled_mma_layout kind. On CPU, tiles are always undistributed (no thread distribution) and always expanded (opaque = false), so there is currently no parameter here and this is, for now, a unit attribute. It could evolve in the future to look more like IREEGPU_InnerTiledSemanticsAttr.
LoweringConfigAttr
Drive lowering of an operation for CPU compilation.
CPU specific implementation of a lowering config. This carries just a dictionary attribute to store any relevant fields. This is the simplest form of a lowering config, offering flexibility at the cost of structure.
Some key entries, e.g., distribution, must be of type IREE::Codegen::LoweringConfigTilingLevelAttr, which is a list of tile sizes with an optional scalable representation, as in vector types. For example:
#iree_cpu.lowering_config<
distribution = [128, 128, 0],
cache_parallel = [64, 64, 0],
cache_reduction = [0, 0, 16],
vector_common_parallel = [[4], [4], 0],
vector_reduction = [0, 0, [4]],
vector_inner_parallel = [0, 0, 0]
>
For more details, see the implementation in IREECPUAttrs.cpp.
Note that the behavior is undefined if more than one vector tiling level sets a value on the same dimension; the levels are expected to be disjoint. The verifier does not enforce this, in order to retain flexibility when something is wrong in a lowering config. E.g., some transformations still work even if the levels are not disjoint.
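Since the verifier does not check disjointness, a small external check can be useful when writing configs by hand. The sketch below models the example config as plain Python data and reports dimensions set by more than one vector level. It is not an IREE API. It assumes that the vector tiling levels are the keys starting with `vector_`, that a tile size of 0 means "not set", and that a scalable size is written as a one-element list, mirroring the `[4]` notation in the example.

```python
# Hypothetical stand-in for a lowering config: plain Python data, not IREE API.
# A scalable tile size is a one-element list, mirroring the [4] notation.
config = {
    "distribution": [128, 128, 0],
    "cache_parallel": [64, 64, 0],
    "cache_reduction": [0, 0, 16],
    "vector_common_parallel": [[4], [4], 0],
    "vector_reduction": [0, 0, [4]],
    "vector_inner_parallel": [0, 0, 0],
}


def tile_size(entry):
    """Return the numeric tile size, unwrapping the scalable [n] form."""
    return entry[0] if isinstance(entry, list) else entry


def overlapping_vector_dims(cfg):
    """Return {dim: [levels]} for dims set by more than one vector_* level.

    Assumptions: vector levels are the keys prefixed with "vector_", and a
    tile size of 0 means the level does not tile that dimension.
    """
    owners = {}
    for level, sizes in cfg.items():
        if not level.startswith("vector_"):
            continue
        for dim, entry in enumerate(sizes):
            if tile_size(entry) != 0:
                owners.setdefault(dim, []).append(level)
    return {dim: levels for dim, levels in owners.items() if len(levels) > 1}


print(overlapping_vector_dims(config))  # prints {} for the example config
```

An empty result means the vector levels are disjoint. A non-empty result lists each conflicting dimension together with the levels that claim it.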
Parameters:
| Parameter | C++ type | Description |
|---|---|---|
| config | DictionaryAttr | The configured fields, including tiling levels. |
MMAIntrinsicAttr
Descriptor for different MMA intrinsics
Syntax:
#iree_cpu.mma_intrinsic<
::mlir::iree_compiler::IREE::CPU::MMAIntrinsic # value
>
Parameters:
| Parameter | C++ type | Description |
|---|---|---|
| value | ::mlir::iree_compiler::IREE::CPU::MMAIntrinsic | an enum of type MMAIntrinsic |
PipelineAttr
CPU lowering pipeline identifier.
Syntax:
#iree_cpu.pipeline<
::mlir::iree_compiler::IREE::CPU::LoweringPipeline # value
>
Identifies a CPU lowering pipeline. Implements PipelineAttrInterface by delegating to a builder callback registered via registerCPUPipelineBuilder(). The builder must handle all LoweringPipeline enum values.
Parameters:
| Parameter | C++ type | Description |
|---|---|---|
| value | ::mlir::iree_compiler::IREE::CPU::LoweringPipeline | an enum of type LoweringPipeline |
VMVXEncodingResolverAttr
The encoding layout attribute for VMVX backend.
Syntax:
#iree_cpu.vmvx_encoding_resolver<
DictionaryAttr # configuration
>
This attribute can implement any layout interface methods for encoding serialization and/or materialization, e.g., Encoding::LayoutMaterializerAttr, Codegen::PackedLayoutMaterializerAttr, etc. They are implemented through the external model mechanism. See the implementation in compiler/Codegen/ExternalInterfaces/*.
Parameters:
| Parameter | C++ type | Description |
|---|---|---|
| configuration | DictionaryAttr | Executable target configuration. It is expected to be used in a pass scope, but not the final IR output. |
Enums
LoweringPipeline
LLVMCPU lowering pipeline identifier
Cases:
| Symbol | Value | String |
|---|---|---|
| Default | 0 | Default |
| DoubleTilingExpert | 1 | DoubleTilingExpert |
| ConvTileAndDecomposeExpert | 2 | ConvTileAndDecomposeExpert |
| Mmt4dTilingExpert | 3 | Mmt4dTilingExpert |
| BufferOpsTileAndVectorize | 4 | BufferOpsTileAndVectorize |
| DataTiling | 5 | DataTiling |
| LinalgExtTileAndVectorize | 6 | LinalgExtTileAndVectorize |
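For tooling that needs to translate between the numeric values and the string forms, the table can be mirrored as a small enum. The class below is an illustrative Python mirror of the cases listed above. It is not a generated or official binding, and it would need to be updated by hand if the enum changes.

```python
from enum import IntEnum


class LoweringPipeline(IntEnum):
    """Illustrative mirror of the LoweringPipeline cases; not an IREE binding."""

    Default = 0
    DoubleTilingExpert = 1
    ConvTileAndDecomposeExpert = 2
    Mmt4dTilingExpert = 3
    BufferOpsTileAndVectorize = 4
    DataTiling = 5
    LinalgExtTileAndVectorize = 6


# Value -> string form, and string form -> value.
print(LoweringPipeline(3).name)                 # prints Mmt4dTilingExpert
print(LoweringPipeline["DataTiling"].value)     # prints 5
```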
MMAIntrinsic
Descriptor for different MMA intrinsics
Cases:
| Symbol | Value | String |
|---|---|---|
| None | 0 | None |
| MMA_X86_AVX2_FMA_1x8x1_F32_F32 | 4624 | MMA_X86_AVX2_FMA_1x8x1_F32_F32 |
| MMA_X86_AVX2_FMA_8x1x1_F32_F32 | 4625 | MMA_X86_AVX2_FMA_8x1x1_F32_F32 |
| MMA_X86_AVX512_1x8x1_F64_F64 | 4864 | MMA_X86_AVX512_1x8x1_F64_F64 |
| MMA_X86_AVX512_8x1x1_F64_F64 | 4865 | MMA_X86_AVX512_8x1x1_F64_F64 |
| MMA_X86_AVX512_1x16x1_F32_F32 | 4880 | MMA_X86_AVX512_1x16x1_F32_F32 |
| MMA_X86_AVX512_16x1x1_F32_F32 | 4881 | MMA_X86_AVX512_16x1x1_F32_F32 |
| MMA_X86_AVX512_1x16x1_F32_F16_CASTF32 | 4896 | MMA_X86_AVX512_1x16x1_F32_F16_CASTF32 |
| MMA_X86_AVX512_16x1x1_F32_F16_CASTF32 | 4897 | MMA_X86_AVX512_16x1x1_F32_F16_CASTF32 |
| MMA_X86_AVX512FP16_1x32x1_F16_F16 | 4898 | MMA_X86_AVX512FP16_1x32x1_F16_F16 |
| MMA_X86_AVX512FP16_32x1x1_F16_F16 | 4899 | MMA_X86_AVX512FP16_32x1x1_F16_F16 |
| MMA_X86_AVX512BF16_1x16x2_F32_BF16 | 4912 | MMA_X86_AVX512BF16_1x16x2_F32_BF16 |
| MMA_X86_AVX512BF16_16x1x2_F32_BF16 | 4913 | MMA_X86_AVX512BF16_16x1x2_F32_BF16 |
| MMA_X86_AVX512_1x16x2_I32_I16 | 5024 | MMA_X86_AVX512_1x16x2_I32_I16 |
| MMA_X86_AVX512_16x1x2_I32_I16 | 5025 | MMA_X86_AVX512_16x1x2_I32_I16 |
| MMA_X86_AVX512VNNI_1x16x2_I32_I16 | 5026 | MMA_X86_AVX512VNNI_1x16x2_I32_I16 |
| MMA_X86_AVX512VNNI_16x1x2_I32_I16 | 5027 | MMA_X86_AVX512VNNI_16x1x2_I32_I16 |
| MMA_X86_AVX512_1x16x2_I32_I8_CASTI16 | 5056 | MMA_X86_AVX512_1x16x2_I32_I8_CASTI16 |
| MMA_X86_AVX512_16x1x2_I32_I8_CASTI16 | 5057 | MMA_X86_AVX512_16x1x2_I32_I8_CASTI16 |
| MMA_X86_AVX512VNNI_1x16x2_I32_I8_CASTI16 | 5058 | MMA_X86_AVX512VNNI_1x16x2_I32_I8_CASTI16 |
| MMA_X86_AVX512VNNI_16x1x2_I32_I8_CASTI16 | 5059 | MMA_X86_AVX512VNNI_16x1x2_I32_I8_CASTI16 |
| MMA_ARM_SVE_FMLA_1x4VLx1_F32_F32 | 8720 | MMA_ARM_SVE_FMLA_1x4VLx1_F32_F32 |
| MMA_ARM_SVE_FMLA_4VLx1x1_F32_F32 | 8721 | MMA_ARM_SVE_FMLA_4VLx1x1_F32_F32 |
| MMA_GENERIC_SCALAR_1x1x1_REG8 | 61448 | MMA_GENERIC_SCALAR_1x1x1_REG8 |
| MMA_GENERIC_SCALAR_1x1x1_REG16 | 61456 | MMA_GENERIC_SCALAR_1x1x1_REG16 |
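In the values listed above, the two orientations of each hardware MMA occupy consecutive numbers: the 1xN form has an even value and its Nx1 sibling has that value plus one. This is an observation about the table, not a documented guarantee. The sketch below groups a hand-copied subset of the table into such pairs; `orientation_pairs` is a hypothetical helper.

```python
# Subset of the MMAIntrinsic table, copied by hand as symbol -> value.
MMA_INTRINSICS = {
    "MMA_X86_AVX2_FMA_1x8x1_F32_F32": 4624,
    "MMA_X86_AVX2_FMA_8x1x1_F32_F32": 4625,
    "MMA_X86_AVX512_1x16x1_F32_F32": 4880,
    "MMA_X86_AVX512_16x1x1_F32_F32": 4881,
    "MMA_ARM_SVE_FMLA_1x4VLx1_F32_F32": 8720,
    "MMA_ARM_SVE_FMLA_4VLx1x1_F32_F32": 8721,
    "MMA_GENERIC_SCALAR_1x1x1_REG8": 61448,
}


def orientation_pairs(table):
    """Return [(even_symbol, odd_symbol)] for consecutive even/odd values.

    Relies on the observed (not documented) numbering pattern in the table.
    """
    by_value = {value: symbol for symbol, value in table.items()}
    pairs = []
    for value in sorted(by_value):
        if value % 2 == 0 and value + 1 in by_value:
            pairs.append((by_value[value], by_value[value + 1]))
    return pairs


for first, second in orientation_pairs(MMA_INTRINSICS):
    print(f"{first} <-> {second}")
```

The generic scalar entry has no sibling in the table and is therefore not reported as part of a pair.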