I published this post at Hazelcast blog and put a copy here.
Stream processing is an emerging computational paradigm for on-the-fly processing of live data feeds, targeting low latency and high throughput. Streaming applications are usually deployed on multiple servers to achieve these requirements. Since even a single failure may lead to incorrect results or long interruptions in result delivery, fault tolerance is of paramount importance in such long-running distributed applications.
With the new 0.5 release, Hazelcast Jet capitalizes on the Hazelcast IMDG platform to simplify fault-tolerant execution of streaming computations. In this blog post, we briefly describe the new high availability and fault tolerance features of Hazelcast Jet.
Stream processing clusters contain components with heterogeneous roles, wherein a master node coordinates the execution of submitted jobs and possibly multiple worker nodes perform the actual execution. In this setting, a number of availability concerns arise for different components. For instance, a set of master nodes are deployed to eliminate single point of failure. One of these nodes is elected as leader and other nodes wait in standby until the failure of the leader. Relatedly, multiple worker nodes are run for parallel and distributed execution of the jobs. In case of a worker failure, its workload is re-assigned to a standby worker or distributed to other active workers.
In this model, stream processing engines generally depend on external coordination services, such as Apache ZooKeeper, for leader election, and replication of cluster and job information. Although coordination services are a good fit for such use cases, they bring additional complexity into the system. Hazelcast Jet frees the developers from the burden of configuring and maintaining coordination services for different environments and deployment settings. Hazelcast Jet does not depend on any external service and maintains job information in internal Hazelcast IMDG data structures, namely IMap instances. When a job is submitted, the job information is distributed between a set of IMap instances. Since the IMap data structure replicates data entries to multiple servers, Hazelcast Jet enables fault tolerance for job information. The presence of the master node is not visible to the end users. Users do not designate any Hazelcast Jet node for job coordination. Hazelcast Jet clusters coordinate execution of the submitted jobs transparently and automatically. Internally, a single Hazelcast Jet node, called coordinator, takes the responsibility of coordinating job executions. If the coordinator node fails for any reason, another node is automatically promoted. Relatedly, jobs are run on multiple nodes. If a node fails during execution, the job is automatically restarted on the remaining nodes.
So with these mechanisms, we transparently handle job execution and worker failover. But what about data processing guarantees?
Hazelcast Jet realizes fault tolerant stateful streaming computations with a new snapshotting mechanism. The new mechanism takes a snapshot of the state of a running streaming job at regular intervals. A snapshot contains a consistent view of the state of all processors at some point in time. In case of a node failure, the coordinator stops the current execution and restarts the job with the latest successful snapshot. When a job is restarted from a snapshot, the nodes recover processor state efficiently from local memory. The processors are initialized with the state recovered from the snapshot and the input streams are rolled back. The snapshotting feature is an efficient implementation of Lamport and Chandy’s “Distributed Snapshots: Determining Global States of a Distributed System” paper.
New methods were added to the Processor API for snapshotting. Upon a snapshot persistence call, a processor passes its internal state to the engine as key-value objects. Then, the processor state is put into internal IMap data structures as part of the snapshot. The new fault tolerance mechanism provides a choice between, none, exactly-once and at-least-once semantics for event processing. Source and sink processors must cooperate with the snapshotting mechanism in order to achieve the exactly-once semantics.
The 0.5 release of Hazelcast Jet opens new doors for highly available and fault tolerant streaming systems. Hazelcast Jet does not depend on any other external storage system for high availability and snapshotting. Additionally, in-memory snapshots pave the way for minimally disruptive snapshotting and fast recovery after failures. For the next release, we will be working on improvements and optimizations in the snapshotting mechanism.