Distributed Execution Model

Logical Function Instances

All function types and instances run within the FunctionGroupOperator. Each parallel operator maintains a single instance of each function type, and the state of all functions is stored in Flink’s state backend, typically backed by RocksDB or some other map implementation. When a message arrives for a function, the State Dispatcher looks up the function instance by function type, and the state by function type and logical ID. The operator then maps the state into the function instance and invokes the function with the message. Any updates the function makes to the state are mapped back into the Flink state backend.

[Figure: function multiplexing in Flink (flink_function_multiplexing.png)]
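The dispatch steps described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the Stateful Functions implementation: the names `ToyDispatcher`, `register`, `dispatch`, and `counter` are hypothetical, a plain dictionary stands in for Flink’s state backend, and a function is modeled as a callable that receives its state and a message.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical function shape: receives (state, message) and returns the updated state.
StatefulFn = Callable[[Dict[str, Any], Any], Dict[str, Any]]


class ToyDispatcher:
    """Toy model of one parallel operator: one instance per function type,
    with all state kept in a shared map keyed by (function type, logical ID)."""

    def __init__(self) -> None:
        # One function instance per function type.
        self.functions: Dict[str, StatefulFn] = {}
        # Stand-in for the state backend (a plain dict, not RocksDB).
        self.backend: Dict[Tuple[str, str], Dict[str, Any]] = {}

    def register(self, function_type: str, fn: StatefulFn) -> None:
        self.functions[function_type] = fn

    def dispatch(self, function_type: str, logical_id: str, message: Any) -> None:
        # 1. Look up the function instance by function type.
        fn = self.functions[function_type]
        # 2. Look up the state by function type and logical ID.
        key = (function_type, logical_id)
        state = dict(self.backend.get(key, {}))
        # 3. Map the state into the function and invoke it with the message.
        updated = fn(state, message)
        # 4. Map any state updates back into the backend.
        self.backend[key] = updated


def counter(state: Dict[str, Any], message: Any) -> Dict[str, Any]:
    """Hypothetical function: counts messages and remembers the last one."""
    state["seen"] = state.get("seen", 0) + 1
    state["last"] = message
    return state


dispatcher = ToyDispatcher()
dispatcher.register("counter", counter)
dispatcher.dispatch("counter", "alice", "hi")
dispatcher.dispatch("counter", "alice", "again")
dispatcher.dispatch("counter", "bob", "hello")
```

Although only one `counter` instance exists, the logical instances `alice` and `bob` each see their own state, because the state is selected by function type and logical ID before every invocation.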

Parallel Execution & Fault Tolerance

Apache Flink executes the dataflow in parallel by distributing the operator instances across different parallel processes (i.e., TaskManagers) and streaming events between them. State storage and failure recovery are backed by Flink’s state and fault tolerance mechanisms. However, Flink does not support fault-tolerant loops out of the box. The Stateful Functions implementation therefore extends Flink’s native snapshot-based fault tolerance mechanism to support cyclic dataflow graphs, following an approach similar to the one outlined in this paper.
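To make the idea of distributing work across parallel operator instances more concrete, here is a minimal sketch that assumes each message is routed to one parallel instance by hashing its target address (function type plus logical ID). The function `route` and the address format are hypothetical. Flink’s actual key-group assignment is more involved, so this shows only the general principle that the same address always lands on the same instance.

```python
import hashlib


def route(function_type: str, logical_id: str, parallelism: int) -> int:
    """Pick a parallel operator index for an address (illustrative only)."""
    # A stable digest is used because Python's built-in hash() of strings
    # is randomized per process.
    key = f"{function_type}/{logical_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % parallelism


# The same address is always routed to the same parallel instance.
first = route("counter", "alice", 4)
second = route("counter", "alice", 4)
```

Because routing is deterministic, all messages for a given logical instance reach the operator that holds its state, which is what allows the state to be kept local to that operator.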