People often ask us how Flink deals with backpressure effects. The answer is simple: Flink does not use any sophisticated mechanism, because it does not need one. It gracefully responds to backpressure by virtue of being a pure data streaming engine. In this blog post, we introduce the problem of backpressure. We then dig deeper on how Flink’s runtime ships data buffers between tasks and show how streaming data shipping naturally doubles down as a backpressure mechanism. We finally show this in action with a small experiment.
The evolution of fault-tolerant streaming architectures and their performance
The popularity of stream data platforms is skyrocketing. Several companies are transitioning parts of their data infrastructure to a streaming paradigm as a solution to increasing demands for real-time access to information. Infrastructures based on streaming data not only enable new types of latency-critical applications and give more actual operational insights through more up-to-date views of the processes; they also have the potential to make classical data warehousing setups radically more simple and flexible the same time.
A crucial piece of a streaming infrastructure is a stream processor that can deliver high throughput across a wide spectrum of latencies and strong consistency guarantees even in the presence of stateful computations. In recent articles, we introduced Apache Flink™ as a scalable stream processing engine that provides exactly this combination of properties.
In this article, we dig in deeper into how Flink’s novel checkpointing mechanism works, and how it supersedes older architectures for streaming fault tolerance and recovery. We measure the performance of Flink for various types of streaming applications and put it into perspective by running the same series of experiments on Apache Storm, a widely used low-latency stream processor.