Dialect/GPU

-iree-gpu-convert-forall-to-generic-nest

Converts scf.forall ops with GPU mapping to pcf.generic

Converts scf.forall ops with gpu.thread mapping to nested pcf.generic ops using subgroup (outer) and lane (inner) scopes.
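The subgroup (outer) and lane (inner) nesting can be pictured with a small sketch. This is only an illustration of the index arithmetic, not the pass's implementation: it assumes a linearized thread id and a fixed subgroup size, and the function names and the subgroup size of 64 are hypothetical choices made for the example.

```python
def thread_to_subgroup_and_lane(thread_id: int, subgroup_size: int) -> tuple[int, int]:
    """Split a flat thread id into (subgroup id, lane id).

    Assumption: threads are numbered contiguously and each subgroup
    holds exactly `subgroup_size` consecutive threads.
    """
    if subgroup_size <= 0:
        raise ValueError("subgroup_size must be positive")
    if thread_id < 0:
        raise ValueError("thread_id must be non-negative")
    return divmod(thread_id, subgroup_size)


def nested_ids(num_threads: int, subgroup_size: int) -> list[tuple[int, int]]:
    """Enumerate (subgroup, lane) pairs as an outer loop over subgroups
    and an inner loop over lanes, mirroring the nested-scope structure."""
    if num_threads % subgroup_size != 0:
        raise ValueError("num_threads must be a multiple of subgroup_size")
    num_subgroups = num_threads // subgroup_size
    return [
        (subgroup, lane)
        for subgroup in range(num_subgroups)  # outer scope: subgroup
        for lane in range(subgroup_size)      # inner scope: lane
    ]


if __name__ == "__main__":
    # Hypothetical workgroup of 128 threads with 64 lanes per subgroup.
    print(thread_to_subgroup_and_lane(70, 64))
    print(len(nested_ids(128, 64)))
```

Under these assumptions, walking the nested loops visits the same threads, in the same order, as a single flat loop over thread ids.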

-iree-gpu-distribute-inner-tiled-to-lanes

Distributes iree_codegen.inner_tiled ops to lanes

-iree-gpu-expand-undistributed-inner-tiles

Expands the inner dimensions of iree_codegen.inner_tiled ops to match the thread layout

Options

-expand-inputs  : Expand the inner dimensions for the input operands of the inner_tiled ops.
-expand-outputs : Expand the inner dimensions for the output operands and results of the inner_tiled ops.

-iree-gpu-lower-ops

Post-bufferization lowerings of iree_gpu ops before late lowerings

-iree-gpu-unroll-to-intrinsics

Unrolls iree_gpu.multi_mma ops to their inner vector size.