LLVMGPU
-extract-address-computation-gpu
Extract address computations from memory accesses
This pass is similar to extract-address-computation except it also
supports memory accesses that are specific to GPUs.
-iree-amdgpu-emulate-narrow-type
Emulate narrow integer operations including amdgpu operations
-iree-convert-to-nvvm
Perform final conversion from builtin/GPU/HAL/standard dialect to LLVM and NVVM dialects
-iree-convert-to-rocdl
Perform final conversion from builtin/GPU/HAL/standard dialect to LLVM and ROCDL dialects
-iree-llvmgpu-assign-constant-ordinals
Assigns executable constant ordinals across all LLVMGPU variants.
-iree-llvmgpu-cast-address-space-function
Cast address space to generic in CallOp and FuncOp
-iree-llvmgpu-cast-type-to-fit-mma
Perform type extension/truncation over vector.contract types to target GPU MMA intrinsics
-iree-llvmgpu-configure-tensor-layouts
Pass to set layouts on tensors for later vector distribution
-iree-llvmgpu-link-executables
Links LLVMGPU HAL executables within the top-level program module.
Options
-target : Target backend name whose executables will be linked by this pass.
-iree-llvmgpu-lower-executable-target
Perform lowering of the executable target using one of the IREE::HAL::DispatchLoweringPassPipeline pipelines
Options
-for-rocdl : Enable features only supported on ROCDL such as delaying lowering of subgroup reduce.
-iree-llvmgpu-pack-shared-memory-alloc
Pass to pack shared memory allocations in order to reduce memory usage.
-iree-llvmgpu-prefetch-shared-memory
Rotate scf.for loops to prefetch shared memory with distance 1. This pass is only applicable to ROCDL targets because its effectiveness on non-AMD GPUs lacks testing and evaluation.
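As a rough illustration of what rotating a loop to prefetch with distance 1 means, the following Python sketch models the idea outside of MLIR. It is not IREE's implementation: the function names, the list-based `tiles`, and the `compute` callback are hypothetical stand-ins, and copying a tile stands in for a shared memory load.

```python
def baseline(tiles, compute):
    """Unrotated loop: each iteration loads its tile, then computes on it."""
    results = []
    for tile in tiles:
        staged = list(tile)              # stand-in for the shared memory load
        results.append(compute(staged))  # compute waits on the load above
    return results


def rotated(tiles, compute):
    """Loop rotated with prefetch distance 1 (hypothetical model).

    The load for iteration i + 1 is issued before the compute for
    iteration i, so the two can overlap on real hardware.
    """
    results = []
    if not tiles:
        return results
    staged = list(tiles[0])                 # prologue: load for iteration 0
    for i in range(len(tiles) - 1):
        next_staged = list(tiles[i + 1])    # prefetch tile i + 1
        results.append(compute(staged))     # compute on tile i
        staged = next_staged
    results.append(compute(staged))         # epilogue: last tile, no load
    return results
```

Both functions return the same results; only the order of loads relative to computes changes, which is the property a rotation like this has to preserve.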
-iree-llvmgpu-select-lowering-strategy
Select an IREE::HAL::DispatchLoweringPassPipeline for lowering the target variant
-iree-llvmgpu-tensorcore-vectorization
Pass to convert linalg into Vector and transform it to a form that can be lowered to GPU MMA ops
-iree-llvmgpu-tile-and-distribute
Pass to tile and distribute linalg ops within a workgroup.
-iree-llvmgpu-vector-distribute
Pass to distribute vectorized functions.
-iree-llvmgpu-vector-lowering
Pass to lower Vector ops before conversion to LLVM.
-iree-llvmgpu-vector-to-gpu
Pass to convert vector to gpu.
-iree-test-llvmgpu-legalize-ops
Test pass for several legalization patterns.