
'flow' Dialect

A dialect designed to model execution data flow and partitioning.

The flow dialect is used to model regions of dense computation and the data flow between them. MLIR value-semantic tensors are used as the primary data type to allow SSA use-def chains to provide the bulk of the infrastructure required to perform computation partitioning and outlining.

The dialect is designed to ingest relatively high-level linear algebra via XLA HLO ops (that also operate on the value-semantic tensor types) and optionally MLIR standard ops for control flow and other actions. After conversion of any higher-level ops that have special semantics in the flow dialect, such as global variables, the rest are partitioned into regions containing simple and compatible computations. Finally, outlining moves the computations into executables and leaves only the execution flow encoded via dispatch operations.

The primary unit of interest is a "dispatch region" containing compatible computations that can be scheduled together efficiently (and safely). "Compatible" here means similarly shaped workloads that indicate how many invocations a computation can be parallelized across when running in an SPMD execution model. Though it depends on the particular runtime backends, this more concretely means things like the untiled workload (or tiled workgroups) used in GPU dispatches or similar thread pool executors.

After identification of the dispatchable regions a set of transformations performs folding and simplification to reduce the total number of dispatches. Heuristics are used in certain cases to more efficiently schedule special ops (such as GEMM) and the design is amenable to profile-guided analysis that can be added in the future.

The resulting outlined executable modules containing the dispatchable code can be translated to one or more backends (such as SPIR-V for Vulkan, or LLVM IR for running on the CPU, etc). The IR that is outlined is untouched and in the input format (such as XLA HLO ops) allowing conversion using any MLIR target that supports ingesting such input. A few special ops are used to communicate statically available information such as the expected workload size, shapes of inputs and outputs, etc.

Operations

Collective communication ops

flow.channel.count (Flow::ChannelCountOp)

Returns the total number of participants in the group

Syntax:

operation ::= `flow.channel.count` $channel `:` type($result)
              attr-dict-with-keyword

Returns the total participant count in the collective communicator group.
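For example, following the declared syntax above (SSA names are illustrative):

%count = flow.channel.count %channel : index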

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), OpAsmOpInterface

Effects: MemoryEffects::Effect{}

Operands:
Operand | Description
channel | a collective communication channel
Results:
Result | Description
result | index

flow.channel.default (Flow::ChannelDefaultOp)

Returns a default collective communication channel

Syntax:

operation ::= `flow.channel.default` ($group^)?
              `:` type($result)
              attr-dict-with-keyword

Returns a channel initialized using the runtime environment.
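For example, following the declared syntax (the !flow.channel type spelling and the group name are illustrative; the optional $group attribute selects a named group from the runtime environment):

%default = flow.channel.default : !flow.channel
%grouped = flow.channel.default "group_0" : !flow.channel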

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), OpAsmOpInterface

Effects: MemoryEffects::Effect{}

Attributes:
Attribute | MLIR Type | Description
group | ::mlir::StringAttr | string attribute
Results:
Result | Description
result | a collective communication channel

flow.channel.rank (Flow::ChannelRankOp)

Returns the rank of the local participant in the group

Syntax:

operation ::= `flow.channel.rank` $channel `:` type($result)
              attr-dict-with-keyword

Returns the rank the channel represents as a participant in a collective group in [0, count).
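For example, following the declared syntax (SSA names are illustrative):

%rank = flow.channel.rank %channel : index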

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), OpAsmOpInterface

Effects: MemoryEffects::Effect{}

Operands:
Operand | Description
channel | a collective communication channel
Results:
Result | Description
result | index

flow.channel.split (Flow::ChannelSplitOp)

Splits a collective communication channel

Syntax:

operation ::= `flow.channel.split` $channel `,` $color `,` $key
              `:` type($channel) `->` type($result)
              attr-dict-with-keyword

Partitions the group associated with the given channel into disjoint subgroups for each unique value of color. Each new subgroup contains all participants of the same color and within each subgroup the key argument is used to define the rank order. When multiple participants in a group use the same key the tie will be broken using their rank in the parent group.
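For example, following the declared syntax (SSA names and the !flow.channel type spelling are illustrative; %color and %key are index values):

%subchannel = flow.channel.split %channel, %color, %key : !flow.channel -> !flow.channel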

Interfaces: InferTypeOpInterface, OpAsmOpInterface

Operands:
Operand | Description
channel | a collective communication channel
color | index
key | index
Results:
Result | Description
result | a collective communication channel

flow.collective.all_gather (Flow::CollectiveAllGatherOp)

Performs an all-gather operation

Syntax:

operation ::= `flow.collective.all_gather` $element_type `,` $target `,` $source `,` $channel `:`
              `(` type($target) `,` type($source) `,` type($channel) `)` `->`
              custom<ShapedTiedResult>(type($result), $target_dims, $tied_operands)
              attr-dict-with-keyword

Gathers data from all ranks and concatenates it along the 0th dimension.

Interfaces: InferTypeOpInterface, TiedOpInterface
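A sketch following the declared syntax, under the assumption of a 4-participant channel and the tied-result notation ("%target as ...") used for ShapedTiedResult in IREE; shapes and names are illustrative:

%result = flow.collective.all_gather f32, %target, %source, %channel :
    (tensor<4x8xf32>, tensor<1x8xf32>, !flow.channel) -> %target as tensor<4x8xf32>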

Attributes:
Attribute | MLIR Type | Description
element_type | ::mlir::iree_compiler::IREE::Flow::CollectiveElementTypeAttr | valid CollectiveElementType
tied_operands | ::mlir::ArrayAttr | 64-bit integer array attribute
Operands:
Operand | Description
target | ranked tensor of any type values
target_dims | variadic of index
source | ranked tensor of any type values
channel | a collective communication channel
Results:
Result | Description
result | ranked tensor of any type values

flow.collective.all_reduce (Flow::CollectiveAllReduceOp)

Performs an all-reduce operation

Syntax:

operation ::= `flow.collective.all_reduce` $reduction_op `,` $element_type `,` $target `,` $source `,` $channel `:`
              `(` type($target) `,` type($source) `,` type($channel) `)` `->`
              custom<ShapedTiedResult>(type($result), $target_dims, $tied_operands)
              attr-dict-with-keyword

Reduces data across all the ranks in the channel.

Interfaces: InferTypeOpInterface, TiedOpInterface

Attributes:
Attribute | MLIR Type | Description
reduction_op | mlir::iree_compiler::IREE::Flow::CollectiveReductionOpAttr | valid CollectiveReductionOp
element_type | ::mlir::iree_compiler::IREE::Flow::CollectiveElementTypeAttr | valid CollectiveElementType
tied_operands | ::mlir::ArrayAttr | 64-bit integer array attribute
Operands:
Operand | Description
target | ranked tensor of any type values
target_dims | variadic of index
source | ranked tensor of any type values
channel | a collective communication channel
Results:
Result | Description
result | ranked tensor of any type values

flow.collective.all_to_all (Flow::CollectiveAllToAllOp)

Performs an all-to-all operation

Syntax:

operation ::= `flow.collective.all_to_all` $element_type `,` $target `,` $source `,` $channel `:`
              `(` type($target) `,` type($source) `,` type($channel) `)` `->`
              custom<ShapedTiedResult>(type($result), $target_dims, $tied_operands)
              attr-dict-with-keyword

Mutually exchanges data across all of the ranks in the channel.

Interfaces: InferTypeOpInterface, TiedOpInterface

Attributes:
Attribute | MLIR Type | Description
element_type | ::mlir::iree_compiler::IREE::Flow::CollectiveElementTypeAttr | valid CollectiveElementType
tied_operands | ::mlir::ArrayAttr | 64-bit integer array attribute
Operands:
Operand | Description
target | ranked tensor of any type values
target_dims | variadic of index
source | ranked tensor of any type values
channel | a collective communication channel
Results:
Result | Description
result | ranked tensor of any type values

flow.collective.reduce_scatter (Flow::CollectiveReduceScatterOp)

Performs a reduce-scatter operation

Syntax:

operation ::= `flow.collective.reduce_scatter` $reduction_op `,` $element_type `,` $target `,` $source `,` $channel `:`
              `(` type($target) `,` type($source) `,` type($channel) `)` `->`
              custom<ShapedTiedResult>(type($result), $target_dims, $tied_operands)
              attr-dict-with-keyword

Reduces data across all the ranks in the channel and scatters the result to each rank.

Interfaces: InferTypeOpInterface, TiedOpInterface

Attributes:
Attribute | MLIR Type | Description
reduction_op | mlir::iree_compiler::IREE::Flow::CollectiveReductionOpAttr | valid CollectiveReductionOp
element_type | ::mlir::iree_compiler::IREE::Flow::CollectiveElementTypeAttr | valid CollectiveElementType
tied_operands | ::mlir::ArrayAttr | 64-bit integer array attribute
Operands:
Operand | Description
target | ranked tensor of any type values
target_dims | variadic of index
source | ranked tensor of any type values
channel | a collective communication channel
Results:
Result | Description
result | ranked tensor of any type values

flow.collective.send_recv (Flow::CollectiveSendRecvOp)

Performs a grouped send and receive operation

Syntax:

operation ::= `flow.collective.send_recv` $element_type `,` $target `,` $source `,` $channel `,` $send `,` $recv `:`
              `(` type($target) `,` type($source) `,` type($channel) `,` type($send) `,` type($recv) `)` `->`
              custom<ShapedTiedResult>(type($result), $target_dims, $tied_operands)
              attr-dict-with-keyword

Sends data to the rank specified by send and receives data from the rank specified by recv. If send is -1, this rank will not send any data. If recv is -1, this rank will not receive any data and the output will be all zeros.

Interfaces: InferTypeOpInterface, TiedOpInterface

Attributes:
Attribute | MLIR Type | Description
element_type | ::mlir::iree_compiler::IREE::Flow::CollectiveElementTypeAttr | valid CollectiveElementType
tied_operands | ::mlir::ArrayAttr | 64-bit integer array attribute
Operands:
Operand | Description
target | ranked tensor of any type values
target_dims | variadic of index
source | ranked tensor of any type values
channel | a collective communication channel
send | index
recv | index
Results:
Result | Description
result | ranked tensor of any type values

Dispatch ops

flow.dispatch (Flow::DispatchOp)

A dispatch of workgroups across a grid

Syntax:

operation ::= `flow.dispatch` custom<DispatchEntryPoints>($entry_points)
              (`[` $workload^ `]`)? ``
              `(` $arguments `)` attr-dict `:`
              custom<ShapedFunctionType>(ref($arguments),
              type($arguments), $argument_dims,
              type($results), $result_dims,
              $tied_operands)

Dispatches workgroups across a grid defined by the captured workload parameters carrying the information required to compute the workgroup count at runtime. The function for converting the workload into a 3D workgroup count is attached to the dispatch entry point and may contain arbitrary host logic.
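For example, following the declared syntax (the executable/export names, workload value, and shapes are illustrative):

%0 = flow.dispatch @ex::@entry[%c100](%arg0) : (tensor<100xf32>) -> tensor<100xf32>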

Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), SymbolUserOpInterface, TiedOpInterface, Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Attributes:
Attribute | MLIR Type | Description
entry_points | ::mlir::ArrayAttr | symbol ref array attribute
tied_operands | ::mlir::ArrayAttr | 64-bit integer array attribute
Operands:
Operand | Description
workload | variadic of index
arguments | variadic of any type
argument_dims | variadic of index
result_dims | variadic of index
Results:
Result | Description
results | variadic of any type

Executable ops

Executables for outlined regions.

flow.executable_end (Flow::ExecutableEndOp)

Terminator pseudo-op for the executable op

Syntax:

operation ::= `flow.executable_end` attr-dict

Traits: HasParent<IREE::Flow::ExecutableOp>, Terminator

flow.executable.export (Flow::ExecutableExportOp)

Defines an executable entry point for dispatch operations

Syntax:

operation ::= `flow.executable.export` custom<SymbolVisibility>($sym_visibility)
              custom<SymbolAlias>($sym_name, $function_ref)
              custom<WorkgroupCountRegion>($workgroup_count)
              attr-dict-with-keyword

Specifies an exported function with an externally-visible alias. Multiple exports can reference the same internal function.

Each entry point can have a unique workgroup count calculation region. This region takes the workload parameters passed to each flow.dispatch and produces an XYZ workgroup count for the 3D grid dispatch.

Traits: HasParent<IREE::Flow::ExecutableOp>, IsolatedFromAbove

Interfaces: Symbol

Attributes:
Attribute | MLIR Type | Description
sym_visibility | ::mlir::StringAttr | string attribute
sym_name | ::mlir::StringAttr | string attribute
function_ref | ::mlir::FlatSymbolRefAttr | flat symbol reference attribute

flow.executable (Flow::ExecutableOp)

Generic executable module

Syntax:

operation ::= `flow.executable` custom<SymbolVisibility>($sym_visibility)
              $sym_name
              attr-dict-with-keyword
              regions

An executable module containing one or more public functions. The contents of the functions are safe to dispatch and can be lowered further to target-specific backend IR representations.
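A minimal sketch of an executable with a single export, assuming the export's workgroup count region prints with the workgroups keyword as in upstream IREE (names, shapes, and the function body are illustrative):

flow.executable private @example_ex {
  flow.executable.export public @entry workgroups(%workload: index) -> (index, index, index) {
    %x, %y, %z = flow.dispatch.workgroup_count_from_dag_root %workload
    flow.return %x, %y, %z : index, index, index
  }
  builtin.module {
    func.func @entry(%arg0: !flow.dispatch.tensor<readonly:tensor<16xf32>>,
                     %arg1: !flow.dispatch.tensor<writeonly:tensor<16xf32>>) {
      ...
    }
  }
}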

Traits: IsolatedFromAbove, SingleBlockImplicitTerminator<IREE::Flow::ExecutableEndOp>, SingleBlock, SymbolTable, Util_ObjectLike

Interfaces: Symbol

Attributes:
Attribute | MLIR Type | Description
sym_visibility | ::mlir::StringAttr | string attribute
sym_name | ::mlir::StringAttr | string attribute

Partitioned region ops

flow.dispatch.region (Flow::DispatchRegionOp)

A group of ops

This op is a container/grouping of ops. It represents a fusion group before being lowered to a dispatch region. Ops are collected inside of the region body of the op. Values from parent regions can be captured. Results are yielded with a return terminator and returned from this op.

dispatch.region ops are lowered to dispatch.workgroups ops, which are isolated from above. dispatch.region ops are a more lightweight abstraction for implementing fusion heuristics, i.e., the process of deciding which ops should form a dispatch region.

This op also has a second region: workload_count. The arguments to that region represent the workload for the dispatch, and the region returns the number of workgroups for the dispatch. The region is lowered directly to the workload_count region of dispatch.workgroups.
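Since no assembly format is declared for this op above, the following sketch is only approximate (the shape, dynamic dimension %dim, and region body are illustrative):

%r = flow.dispatch.region -> (tensor<?xf32>{%dim}) {
  %t = ...
  flow.return %t : tensor<?xf32>
}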

Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Operands:
Operand | Description
result_dims | variadic of index
workload | variadic of index
Results:
Result | Description
result | variadic of any type

flow.dispatch.tensor.load (Flow::DispatchTensorLoadOp)

Loads a tensor from a dispatch input placeholder

Syntax:

operation ::= `flow.dispatch.tensor.load` $source
              `,` `offsets` `=` custom<DynamicIndexList>(
              $offsets, $static_offsets)
              `,` `sizes` `=` custom<DynamicIndexList>(
              $sizes, $static_sizes)
              `,` `strides` `=` custom<DynamicIndexList>(
              $strides, $static_strides)
              attr-dict `:` type($source) (`{` $source_dims^ `}`)?  `->` type($result)

Loads an input tensor or subtensor from an input placeholder. As workgroups execute concurrently, all workgroups will receive identical results when loading regions that may overlap.
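For example, loading a static 16x16 tile from a read-only dispatch input (the offsets, sizes, and shapes are illustrative):

%tile = flow.dispatch.tensor.load %arg0, offsets = [0, 0], sizes = [16, 16], strides = [1, 1]
    : !flow.dispatch.tensor<readonly:tensor<64x64xf32>> -> tensor<16x16xf32>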

Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), OffsetSizeAndStrideOpInterface, ReifyRankedShapedTypeOpInterface, TiedOpInterface, Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Attributes:
Attribute | MLIR Type | Description
static_offsets | ::mlir::DenseI64ArrayAttr | i64 dense array attribute
static_sizes | ::mlir::DenseI64ArrayAttr | i64 dense array attribute
static_strides | ::mlir::DenseI64ArrayAttr | i64 dense array attribute
Operands:
Operand | Description
source | dispatch.tensor
source_dims | variadic of index
offsets | variadic of index
sizes | variadic of index
strides | variadic of index
Results:
Result | Description
result | ranked tensor of any type values

flow.dispatch.tensor.store (Flow::DispatchTensorStoreOp)

Stores a tensor into a dispatch output placeholder

Syntax:

operation ::= `flow.dispatch.tensor.store` $value `,` $target
              `,` `offsets` `=` custom<DynamicIndexList>(
              $offsets, $static_offsets)
              `,` `sizes` `=` custom<DynamicIndexList>(
              $sizes, $static_sizes)
              `,` `strides` `=` custom<DynamicIndexList>(
              $strides, $static_strides)
              attr-dict `:` type($value) `->` type($target) (`{` $target_dims^ `}`)?

Stores a tensor or subtensor into an output tensor placeholder. As workgroups execute concurrently, behavior is undefined if more than one workgroup stores into overlapping regions of the full output tensor.
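For example, storing a 16x16 tile into a write-only dispatch output (the offsets, sizes, and shapes are illustrative):

flow.dispatch.tensor.store %tile, %arg2, offsets = [0, 0], sizes = [16, 16], strides = [1, 1]
    : tensor<16x16xf32> -> !flow.dispatch.tensor<writeonly:tensor<64x64xf32>>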

Traits: AttrSizedOperandSegments

Interfaces: OffsetSizeAndStrideOpInterface, Util_ShapeAwareOp

Attributes:
Attribute | MLIR Type | Description
static_offsets | ::mlir::DenseI64ArrayAttr | i64 dense array attribute
static_sizes | ::mlir::DenseI64ArrayAttr | i64 dense array attribute
static_strides | ::mlir::DenseI64ArrayAttr | i64 dense array attribute
Operands:
Operand | Description
value | ranked tensor of any type values
target | dispatch.tensor
target_dims | variadic of index
offsets | variadic of index
sizes | variadic of index
strides | variadic of index

flow.dispatch.tie_shape (Flow::DispatchTieShapeOp)

Ties a runtime shape to a dispatch I/O argument

Syntax:

operation ::= `flow.dispatch.tie_shape` $operand attr-dict
              `:` type($result) (`{` $dynamic_dims^ `}`)?

Metadata op used to tie a runtime-computed shape with dynamic dimensions to a dispatch input/output argument. All uses of the argument should use the pass-through result of this op to allow for SSA-based shape resolution.
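For example, tying a runtime-computed dynamic dimension %dim to a dispatch input (shape and names illustrative):

%0 = flow.dispatch.tie_shape %arg0 : !flow.dispatch.tensor<readonly:tensor<?x4xf32>>{%dim}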

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), ReifyRankedShapedTypeOpInterface, Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Operands:
Operand | Description
operand | dispatch.tensor
dynamic_dims | variadic of index
Results:
Result | Description
result | dispatch.tensor

flow.dispatch.workgroup.count (Flow::DispatchWorkgroupCountOp)

Returns the total workgroup count of the grid

Syntax:

operation ::= `flow.dispatch.workgroup.count` `[` $dimension `]` attr-dict `:` type($result)

The total number of workgroups along each dimension in the dispatch grid.

Represented as a 3D grid classically written as XYZ. Corresponds to the NumWorkgroups SPIR-V built-in and the gridDim CUDA built-in variable.

%x = flow.dispatch.workgroup.count[0] : index
%y = flow.dispatch.workgroup.count[1] : index
%z = flow.dispatch.workgroup.count[2] : index

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), OpAsmOpInterface

Effects: MemoryEffects::Effect{}

Attributes:
Attribute | MLIR Type | Description
dimension | ::mlir::IntegerAttr | index attribute
Results:
Result | Description
result | index

flow.dispatch.workgroup.id (Flow::DispatchWorkgroupIDOp)

Returns the index of the current workgroup in the grid

Syntax:

operation ::= `flow.dispatch.workgroup.id` `[` $dimension `]` attr-dict `:` type($result)

The global workgroup ID of the current workgroup in the range of [0, flow.dispatch.workgroup.count) along each dimension.

Represented as a 3D grid classically written as XYZ. Corresponds to the WorkgroupId SPIR-V built-in and the blockIdx CUDA built-in variable.

%x = flow.dispatch.workgroup.id[0] : index
%y = flow.dispatch.workgroup.id[1] : index
%z = flow.dispatch.workgroup.id[2] : index

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), OpAsmOpInterface

Effects: MemoryEffects::Effect{}

Attributes:
Attribute | MLIR Type | Description
dimension | ::mlir::IntegerAttr | index attribute
Results:
Result | Description
result | index

flow.dispatch.workgroup.size (Flow::DispatchWorkgroupSizeOp)

Returns the size of each workgroup in invocations

Syntax:

operation ::= `flow.dispatch.workgroup.size` `[` $dimension `]` attr-dict `:` type($result)

The number of local invocations within the current workgroup along each dimension. Depending on backend this may map to the SIMT thread count or inner loop nest parameters.

Workgroup sizes are not determined at the flow dialect level as they are dependent on the target backend determined when lowering into the HAL. It's still possible to use the symbolic workgroup size inside of dispatch executables as a placeholder for the resolved value once in the HAL.

Represented as a 3D grid classically written as XYZ. Corresponds to the WorkgroupSize SPIR-V built-in and the blockDim CUDA built-in variable.

%x = flow.dispatch.workgroup.size[0] : index
%y = flow.dispatch.workgroup.size[1] : index
%z = flow.dispatch.workgroup.size[2] : index

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), OpAsmOpInterface

Effects: MemoryEffects::Effect{}

Attributes:
Attribute | MLIR Type | Description
dimension | ::mlir::IntegerAttr | index attribute
Results:
Result | Description
result | index

flow.dispatch.workgroups (Flow::DispatchWorkgroupsOp)

A dispatch of workgroups across a 3-dimensional grid

Syntax:

operation ::= `flow.dispatch.workgroups` (`[` $workload^ `]`)? ``
              `(` $arguments `)` `:`
              custom<ShapedFunctionType>(ref($arguments),
              type($arguments), $argument_dims,
              type($results), $result_dims,
              $tied_operands)
              attr-dict-with-keyword
              `=` `\n` ` ` ` ` ` `
              custom<DispatchWorkgroupBody>(ref(type($arguments)),
              ref(type($results)),
              $workgroup_body)
              `` custom<DispatchWorkgroupsCountRegion>($workgroup_count)

Dispatches some number of workgroups across a 3-dimensional grid. The body region will be invoked for each workgroup with a unique flow.dispatch.workgroup.id in the range of [0, flow.dispatch.workgroup.count) (along each dimension XYZ).

From the outside the dispatch operation has value semantics: some tensors (and optionally other primitive types) are consumed and one or more new result tensors are produced. Inside each workgroup, however, the input and output tensors are available for arbitrary loads and stores. In many cases each workgroup will load some particular tile(s) from the input tensors and store some particular tile(s) to the output tensors unique to that workgroup. Though it's possible for multiple workgroups to load the same regions of the input tensors behavior is undefined if multiple workgroups store to the same regions of the output tensors.

Though the representation is similar to the GPU-style grid dispatch model, here we have not yet allocated buffers, determined the target device for execution, or even fully resolved shapes/types/etc. Because of this it's important that the workgroup body use the flow.dispatch.workgroup.* ops to query the workgroup ID/count/size instead of hardcoding them to a particular set of values. Assume that any workgroup dispatch may end up being specialized for several different target devices and even several different variants for a particular target device (differing workgroup sizes, etc).

Because at this point in the layering devices have not yet been selected the workgroup count cannot be fully evaluated. Instead workload parameters are captured that are then passed to a function that when later evaluated computes the actual workgroup count based on target information. The workload is not limited to the 3D XYZ grid dispatch of the workgroup count and can contain any number of parameters used to compute it.

%r = flow.dispatch.workgroups[%c5, %c5](%0, %1)
    : (tensor<5x5xf32>, tensor<5xf32>) -> tensor<5x5xf32> =
          (%arg0: !flow.dispatch.tensor<readonly:tensor<5x5xf32>>,
           %arg1: !flow.dispatch.tensor<readonly:tensor<5xf32>>,
           %arg2: !flow.dispatch.tensor<writeonly:tensor<5x5xf32>>) {
  ...
}

The number of results of the operation is equal to the number of results in the type signature ((tensor<5x5xf32>, tensor<5xf32>) -> tensor<5x5xf32>). Each tensor argument and result in the type signature has a corresponding block argument of type !flow.dispatch.tensor. Furthermore, each argument has a corresponding arguments operand.

There are no arguments operands for results, but a result can be tied to an argument by writing the argument operand's SSA value instead of its type: e.g., in the above example, -> %0 would tie the first argument to the result. In that case, there would be no separate block argument for the result.
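A sketch of that tied form, reusing the shapes from the example above (the readwrite access mode on the tied argument is an assumption of this sketch):

%r = flow.dispatch.workgroups[%c5, %c5](%0, %1)
    : (tensor<5x5xf32>, tensor<5xf32>) -> %0 =
          (%arg0: !flow.dispatch.tensor<readwrite:tensor<5x5xf32>>,
           %arg1: !flow.dispatch.tensor<readonly:tensor<5xf32>>) {
  ...
}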

Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments, IsolatedFromAbove

Interfaces: ClosureOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), TiedOpInterface, Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Attributes:
Attribute | MLIR Type | Description
tied_operands | ::mlir::ArrayAttr | 64-bit integer array attribute
Operands:
Operand | Description
workload | variadic of index
arguments | variadic of any type
argument_dims | variadic of index
result_dims | variadic of index
Results:
Result | Description
results | variadic of any type

flow.return (Flow::ReturnOp)

Return from a flow.dispatch.region

Syntax:

operation ::= `flow.return` attr-dict ($operands^ `:` type($operands))?

Returns the given values from the region and back to the host code.

Traits: AlwaysSpeculatableImplTrait, ReturnLike, Terminator

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), RegionBranchTerminatorOpInterface

Effects: MemoryEffects::Effect{}

Operands:
Operand | Description
operands | variadic of any type

Streamable call ops

flow.call (Flow::CallOp)

Calls a streamable external host function

Syntax:

operation ::= `flow.call` $callee
              `(` $arguments `)` attr-dict `:`
              custom<ShapedFunctionType>(ref($arguments),
              type($arguments), $argument_dims,
              type($results), $result_dims,
              $tied_operands)

Calls a function taking/returning tensor values with stream semantics. Tensors have their shapes captured and may be tied to denote in-place operations. Asynchronous calls must have no side-effects.

Note that returned tensors must have their shapes declared prior to the call as this is what allows the call to be made on the stream. If external host logic is required to compute the shape (avoid at all costs!) a separate func.call can be used outside of the stream to do so. If shapes are unknowable until the operation is performed it should be made as a normal asynchronous host call with 'coarse-fences' instead.
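For example, calling an external streamable function whose dynamic result dimension %dim was computed before the call (callee name and shapes are illustrative):

%result = flow.call @some_import(%arg0) : (tensor<?xf32>{%dim}) -> tensor<?xf32>{%dim}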

Traits: AttrSizedOperandSegments

Interfaces: CallOpInterface, SymbolUserOpInterface, TiedOpInterface, Util_ShapeAwareOp

Attributes:
Attribute | MLIR Type | Description
callee | ::mlir::FlatSymbolRefAttr | flat symbol reference attribute
tied_operands | ::mlir::ArrayAttr | 64-bit integer array attribute
Operands:
Operand | Description
arguments | variadic of any type
argument_dims | variadic of index
result_dims | variadic of index
Results:
Result | Description
results | variadic of any type

flow.func (Flow::FuncOp)

Streamable function declaration

Syntax:

operation ::= `flow.func` custom<SymbolVisibility>($sym_visibility)
              $sym_name
              ``
              custom<ShapedFunctionSignature>($function_type,
              $tied_operands,
              $arg_attrs,
              $res_attrs)
              attr-dict-with-keyword
              ($body^)?

Declares a function that can be called as an asynchronous streaming operation via flow.call. Today only external functions are allowed.
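An approximate rendering of an external declaration (the name and signature are illustrative; the exact printed form of the shaped signature may differ):

flow.func private @some_import(tensor<?xf32>) -> tensor<?xf32>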

Traits: IsolatedFromAbove

Interfaces: CallableOpInterface, FunctionOpInterface, Symbol

Attributes:
Attribute | MLIR Type | Description
sym_name | ::mlir::StringAttr | string attribute
function_type | ::mlir::TypeAttr | type attribute of function type
tied_operands | ::mlir::ArrayAttr | 64-bit integer array attribute
sym_visibility | ::mlir::StringAttr | string attribute
arg_attrs | ::mlir::ArrayAttr | Array of dictionary attributes
res_attrs | ::mlir::ArrayAttr | Array of dictionary attributes

Tensor ops

flow.dispatch.workgroup_count_from_dag_root (Flow::DispatchWorkgroupCountFromDagRootOp)

Workgroup count computed based on iteration range of the root of the DAG for ops within the dispatch.

Syntax:

operation ::= `flow.dispatch.workgroup_count_from_dag_root` attr-dict $operands

Used when tiling and distributing the root of the DAG (directed acyclic graph) of ops within the dispatch to split the work amongst workgroups. The workload captured is the size of the iteration space of the root of the DAG. This op represents the computation that, given the workload, returns the number of workgroups to use. The backends are responsible for lowering this op into actual computation (typically based on the tile sizes used to tile and distribute the root of the DAG).
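Following the declared syntax, where the operands are the captured iteration-space sizes (names illustrative; result types are inferred as index):

%x, %y, %z = flow.dispatch.workgroup_count_from_dag_root %size0, %size1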

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:
Operand | Description
operands | variadic of index
Results:
Result | Description
x | index
y | index
z | index

flow.dispatch.workgroup_count_from_slice (Flow::DispatchWorkgroupCountFromSliceOp)

Placeholder to signify the default workgroup count calculation.

Syntax:

operation ::= `flow.dispatch.workgroup_count_from_slice` attr-dict $operands

The default computation of the number of workgroups (or workgroup count) assumes that the dispatch + captured values is enough to compute the workgroup count. It does so by using a program slice of the values within the dispatch that represent the number of workgroups when available within the dispatch. Currently the arguments of index type captured by the flow.dispatch.workgroups op are treated as the workload for the operation. It is a requirement that the slice of the program that computes the number of workgroups has these captured values as its leaves.

TODO: This could be generalized in future to allow the slices to encompass arbitrary computation. The computation of the workgroup count can then be done on the device itself, if this is data dependent. In such cases the workload could be more than just values of index types.

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:
Operand | Description
operands | variadic of index
Results:
Result | Description
x | index
y | index
z | index

flow.dispatch.workload.ordinal (Flow::DispatchWorkloadOrdinalOp)

Annotates the values captured as workload within the body of flow.dispatch.workgroups op.

Syntax:

operation ::= `flow.dispatch.workload.ordinal` attr-dict $operand `,` $ordinal `:` type($operand)

The arguments that represent the captured/returned values of the `flow.dispatch.workgroups` op, i.e. the signature of the body of the op, are not preserved during IREE's compilation. Since the workloads are derived from the operands captured by the operation, this op denotes the values captured as workloads. This can be used in the backends to map back to the workload values while materializing the workgroup count computation.
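For example, annotating a workload value inside the body as ordinal 0 (names illustrative):

%0 = flow.dispatch.workload.ordinal %arg0, 0 : index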

TODO: Find a better way to represent this information, either by somehow propagating the signature of the created dispatch workgroup op through the compilation stack until the codegen backends, or as a separate list/attribute that can be plumbed through without using explicit ops.
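For example (a sketch following the grammar above; %workload is a hypothetical index-typed value captured by the enclosing dispatch):

%annotated = flow.dispatch.workload.ordinal %workload, 0 : index

The ordinal attribute (here 0) records the position of the value in the workload list so backends can recover it later.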

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:
Attribute MLIR Type Description
ordinal ::mlir::IntegerAttr index attribute
Operands:
Operand Description
operand index
Results:
Result Description
result index

flow.tensor.alloca (Flow::TensorAllocaOp)

An empty tensor allocation with undefined contents

Syntax:

operation ::= `flow.tensor.alloca` `:` type($result) (`{` $result_dims^ `}`)?
              attr-dict-with-keyword

Returns a new transient tensor allocation with undefined contents. Subsequent writes must populate any ranges of the tensor that are later read. The resulting tensor may be long-lived and allocated as part of a dedicated allocation. Prefer using flow.tensor.empty whenever possible as this op disables nearly all allocation-related optimizations performed by the compiler. The presence of this op is often an indication of an improper lowering.
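For example (a sketch following the grammar above; %dim is a hypothetical SSA value carrying the dynamic dimension):

%transient = flow.tensor.alloca : tensor<?x128xf32>{%dim}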

Interfaces: MemoryEffectOpInterface (MemoryEffectOpInterface), Util_ShapeAwareOp

Effects: MemoryEffects::Effect{MemoryEffects::Allocate on ::mlir::SideEffects::DefaultResource}

Operands:
Operand Description
result_dims variadic of index
Results:
Result Description
result ranked tensor of any type values

flow.tensor.bitcast (Flow::TensorBitCastOp)

Bitcasts a tensor

Syntax:

operation ::= `flow.tensor.bitcast` $source `:`
              type($source) (`{` $source_dims^ `}`)? `->`
              type($result) (`{` $result_dims^ `}`)?
              attr-dict-with-keyword

Bitcasts a tensor to a new type without modifying the contents.
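For example, reinterpreting float storage as integers of the same bit width (a sketch following the grammar above):

%ints = flow.tensor.bitcast %floats : tensor<4xf32> -> tensor<4xi32>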

Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ConditionallySpeculatable, HoistableOpInterface, NoMemoryEffect (MemoryEffectOpInterface), TiedOpInterface, Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Operands:
Operand Description
source ranked tensor of any type values
source_dims variadic of index
result_dims variadic of index
Results:
Result Description
result ranked tensor of any type values

flow.tensor.clone (Flow::TensorCloneOp)

Performs a full tensor clone operation

Syntax:

operation ::= `flow.tensor.clone` $operand `:` type($result) (`{` $argument_dims^ `}`)?
              attr-dict-with-keyword

Clones the input tensor into an identical output tensor.
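For example (a sketch following the grammar above; %d is a hypothetical dynamic dimension value):

%copy = flow.tensor.clone %input : tensor<?x4xf32>{%d}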

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, HoistableOpInterface, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Operands:
Operand Description
operand ranked tensor of any type values
argument_dims variadic of index
Results:
Result Description
result ranked tensor of any type values

flow.tensor.constant (Flow::TensorConstantOp)

Tensor constant that can have dynamic dimensions

Syntax:

operation ::= `flow.tensor.constant` attr-dict $value

Allows specifying a tensor constant of IREE-specific types/attributes.

%cst = flow.tensor.constant #something_tensor_like : tensor<2x2xf32>
%res = math.absf %cst : tensor<2x2xf32>

Traits: AlwaysSpeculatableImplTrait, ConstantLike

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:
Attribute MLIR Type Description
value ::mlir::TypedAttr TypedAttr instance
Results:
Result Description
result tensor of any type values

flow.tensor.dynamic_constant (Flow::TensorDynamicConstantOp)

Tensor constant that can have dynamic dimensions

Syntax:

operation ::= `flow.tensor.dynamic_constant` attr-dict $value `->` type($result)

Allows specifying a tensor constant of IREE-specific types/attributes with a dynamic shape that approximates a value as passed from the user. This disables many optimizations and should only be used when testing or benchmarking and wanting to ensure that dynamic dimension behavior is preserved.

%cst = flow.tensor.dynamic_constant #something_tensor_like : tensor<2x2xf32> -> tensor<?x2xf32>
%res = math.absf %cst : tensor<?x2xf32>
Attributes:
Attribute MLIR Type Description
value ::mlir::TypedAttr TypedAttr instance
Results:
Result Description
result tensor of any type values

flow.tensor.empty (Flow::TensorEmptyOp)

An empty tensor carrying metadata but no contents

Syntax:

operation ::= `flow.tensor.empty` `:` type($result) (`{` $result_dims^ `}`)?
              attr-dict-with-keyword

Returns a tensor with undefined contents. Subsequent writes must populate any ranges of the tensor that are later read.
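For example (a sketch following the grammar above; %d is a hypothetical dynamic dimension value):

%uninit = flow.tensor.empty : tensor<?x8xf32>{%d}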

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, HoistableOpInterface, NoMemoryEffect (MemoryEffectOpInterface), Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Operands:
Operand Description
result_dims variadic of index
Results:
Result Description
result ranked tensor of any type values

flow.tensor.load (Flow::TensorLoadOp)

Loads a value from a tensor element

Syntax:

operation ::= `flow.tensor.load` $source (`[` $indices^ `]`)? `:`
              type($source) (`{` $source_dims^ `}`)?
              attr-dict-with-keyword

Returns the element at the given location from within the tensor.
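For example (a sketch following the grammar above; %i, %j, and %d are hypothetical index values):

%elem = flow.tensor.load %source[%i, %j] : tensor<?x4xf32>{%d}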

Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Operands:
Operand Description
source ranked tensor of any type values
source_dims variadic of index
indices variadic of index
Results:
Result Description
result index or signless integer or floating-point or complex-type or vector of any type values

flow.tensor.reshape (Flow::TensorReshapeOp)

Reshapes a tensor

Syntax:

operation ::= `flow.tensor.reshape` $source `:`
              type($source) (`{` $source_dims^ `}`)? `->`
              type($result) (`{` $result_dims^ `}`)?
              attr-dict-with-keyword

Reshapes a tensor to a new shape without modifying the contents.
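For example, flattening to rank 1 (a sketch following the grammar above; %d and %flat are hypothetical index values where %flat must equal the total element count implied by %d):

%flattened = flow.tensor.reshape %source : tensor<?x4xf32>{%d} -> tensor<?xf32>{%flat}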

Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ConditionallySpeculatable, HoistableOpInterface, NoMemoryEffect (MemoryEffectOpInterface), TiedOpInterface, Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Operands:
Operand Description
source ranked tensor of any type values
source_dims variadic of index
result_dims variadic of index
Results:
Result Description
result ranked tensor of any type values

flow.tensor.slice (Flow::TensorSliceOp)

Slices out a subregion of a tensor

Syntax:

operation ::= `flow.tensor.slice` $source `[` $start_indices `for` $lengths `]` `:`
              type($source) (`{` $source_dims^ `}`)? `->`
              type($result) (`{` $result_dims^ `}`)?
              attr-dict-with-keyword

Clones a subregion of a tensor.
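For example, a rank-1 slice (a sketch following the grammar above; %off and %len are hypothetical index values):

%sub = flow.tensor.slice %source[%off for %len] : tensor<8xf32> -> tensor<?xf32>{%len}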

Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ConditionallySpeculatable, HoistableOpInterface, NoMemoryEffect (MemoryEffectOpInterface), Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Operands:
Operand Description
source ranked tensor of any type values
source_dims variadic of index
start_indices variadic of index
lengths variadic of index
result_dims variadic of index
Results:
Result Description
result ranked tensor of any type values

flow.tensor.splat (Flow::TensorSplatOp)

Splats a value into a shaped tensor

Syntax:

operation ::= `flow.tensor.splat` $value `:` type($result) (`{` $result_dims^ `}`)?
              attr-dict-with-keyword

Returns a tensor initialized to the given primitive value.
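For example (a sketch following the grammar above; %fill is a hypothetical f32 value and %d a hypothetical dynamic dimension):

%filled = flow.tensor.splat %fill : tensor<?x4xf32>{%d}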

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, HoistableOpInterface, NoMemoryEffect (MemoryEffectOpInterface), Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Operands:
Operand Description
value index or signless integer or floating-point or complex-type
result_dims variadic of index
Results:
Result Description
result ranked tensor of any type values

flow.tensor.store (Flow::TensorStoreOp)

Stores a value into a tensor element

Syntax:

operation ::= `flow.tensor.store` $value `,` $target (`[` $indices^ `]`)? `:`
              type($target) (`{` $target_dims^ `}`)?
              attr-dict-with-keyword

Returns a tensor with the element at the given index set to the given value.
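For example (a sketch following the grammar above; %val, %i, and %j are hypothetical values; note the op returns the updated tensor rather than mutating in place):

%updated = flow.tensor.store %val, %target[%i, %j] : tensor<4x4xf32>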

Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Operands:
Operand Description
value index or signless integer or floating-point or complex-type or vector of any type values
target ranked tensor of any type values
target_dims variadic of index
indices variadic of index
Results:
Result Description
result ranked tensor of any type values

flow.tensor.tie_shape (Flow::TensorTieShapeOp)

Ties a runtime shape to a tensor value

Syntax:

operation ::= `flow.tensor.tie_shape` $operand attr-dict
              `:` type($result) (`{` $dynamic_dims^ `}`)?

Metadata op used to tie tensors with their runtime-computed dynamic dimensions. This only exists transiently in the IR as a witness to shape calculations and is removed during lowering.
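For example (a sketch following the grammar above; %d is a hypothetical runtime-computed dimension value being tied to the tensor):

%tied = flow.tensor.tie_shape %tensor : tensor<?x4xf32>{%d}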

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), ReifyRankedShapedTypeOpInterface, Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Operands:
Operand Description
operand ranked tensor of any type values
dynamic_dims variadic of index
Results:
Result Description
result ranked tensor of any type values

flow.tensor.trace (Flow::TensorTraceOp)

Traces one or more tensor values at runtime

Syntax:

operation ::= `flow.tensor.trace` $key `=` `[`
              custom<ShapedOperandList>($values, type($values), $value_dims)
              `]` attr-dict-with-keyword

Traces the given tensors out to a runtime trace sink (console, log file, etc). The key is arbitrary and can be used to identify the set of values being traced.
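For example (a sketch of the custom assembly format above; the key string and %t0/%d are hypothetical):

flow.tensor.trace "DEBUG" = [%t0 : tensor<4xf32>, %t1 : tensor<?xf32>{%d}]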

Traits: AttrSizedOperandSegments

Interfaces: ShapeAwareOpInterface

Attributes:
Attribute MLIR Type Description
key ::mlir::StringAttr string attribute
Operands:
Operand Description
values variadic of ranked tensor of any type values
value_dims variadic of index

flow.tensor.update (Flow::TensorUpdateOp)

Updates a tensor with the contents of another tensor

Syntax:

operation ::= `flow.tensor.update` $update `,` $target `[` $start_indices `]` `:`
              type($update) (`{` $update_dims^ `}`)? `->`
              custom<ShapedTiedResult>(type($result), $target_dims)
              attr-dict-with-keyword

Updates the target tensor with the contents of the update tensor at the given offset indices.
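For example (a sketch of the custom assembly format above, assuming the tied result prints as `%target as ...`; %i and %j are hypothetical start indices):

%result = flow.tensor.update %update, %target[%i, %j] : tensor<2x2xf32> -> %target as tensor<4x4xf32>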

Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ConditionallySpeculatable, HoistableOpInterface, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), TiedOpInterface, Util_ShapeAwareOp

Effects: MemoryEffects::Effect{}

Operands:
Operand Description
target ranked tensor of any type values
target_dims variadic of index
start_indices variadic of index
update ranked tensor of any type values
update_dims variadic of index
Results:
Result Description
result ranked tensor of any type values

Attributes

DummyAttr

Syntax: #flow.dummy

NamedParameterAttr

named parameter referenced by an optional scope and key

Syntax:

#flow.parameter.named<
  ::mlir::Type,   # type
  StringAttr,   # scope
  StringAttr,   # key
  DictionaryAttr   # config
>

Specifies an externally-defined parameter that can be referenced by an optional scope defining a set of parameters and a key uniquely identifying the parameter within its scope.

Parameters:
Parameter C++ type Description
type ::mlir::Type
scope StringAttr
key StringAttr
config DictionaryAttr

Type constraints

dispatch.tensor

A placeholder for a dispatch region input/output operand. This can be used to query the metadata about the tensor (such as its shape) as well as both load and store from the backing tensor representation.

dispatch.tensor

A placeholder for a dispatch region input operand. This can be used to query the metadata about the tensor (such as its shape) as well as load from the backing tensor representation.

dispatch.tensor

A placeholder for a dispatch region output operand. This can be used to query the metadata about the tensor (such as its shape) as well as store to the backing tensor representation.

Types

ChannelType

a collective communication channel

Syntax: !flow.channel

Represents a single participant in a collective clique. Multiple channels may exist within the same program to allow for partial operations or hierarchical operations.

In programs that have already been partitioned prior to being compiled there will often exist only one channel, and flow.channel.default can be used to reference it. In programs that model SPMD behavior internally, channels can be created or provided by hosting applications.

DummyType

Syntax: !flow.dummy