Dialect/GPU
-iree-gpu-combine-barrier-regions
Combines iree_gpu.barrier_region ops
-iree-gpu-distribute-inner-tiled-to-lanes
Distributes iree_codegen.inner_tiled ops to lanes
-iree-gpu-expand-undistributed-inner-tiles
Expands the inner dimensions of iree_codegen.inner_tiled ops to match the thread layout
Options
-expand-inputs : Expand the inner dimensions for the input operands of the inner_tiled ops.
-expand-outputs : Expand the inner dimensions for the output operands and results of the inner_tiled ops.
-iree-gpu-lower-ops
Post-bufferization lowerings of iree_gpu ops, applied before the late lowerings
-iree-gpu-unroll-to-intrinsics
Unrolls iree_gpu.multi_mma ops to their inner vector size.
-iree-gpu-vectorize-ops
Vectorizes and then lowers a few iree_gpu ops before vectorization.