Month: September 2015

Batch is a special case of streaming

Apache Flink™ and the Kappa architecture

Interested in stream processing? Sign up for Flink Forward 2015, the first conference on Apache Flink™.

In recent blog posts, we introduced the requirements we believe a system must meet to qualify as a stream processor, and followed up with a detailed comparison of current approaches to data streaming, including extensive experiments comparing Apache Flink™ and Apache Storm.

We are not the only ones arguing that streaming systems are reaching a level of maturity that makes older batch systems and architectures look less compelling. But is batch dead already? Not quite: streaming is a proper superset of batch, and batch workloads can be served equally well, if not better, by a new breed of modern streaming engines.


Read more

Kafka + Flink: A practical, how-to guide

A very common use case for Apache Flink™ is stream data movement and analytics. More often than not, the data streams are ingested from Apache Kafka, a system that provides durability and pub/sub functionality for data streams. Typical installations of Flink and Kafka start with event streams being pushed to Kafka, which are then consumed by Flink jobs. These jobs range from simple transformations for data import/export to more complex applications that aggregate data in windows or implement CEP functionality. The results of these jobs may be fed back to Kafka for consumption by other services, written out to HDFS or other systems such as Elasticsearch, or pushed to user-facing web frontends.
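The Kafka-to-Flink-to-Kafka pattern described above can be sketched roughly as follows. This is a minimal illustration, not a complete job: the broker address, consumer group, and the topic names "events" and "results" are hypothetical, and the exact connector class names (here `FlinkKafkaConsumer`/`FlinkKafkaProducer`) vary across Flink and Kafka connector versions.

```java
// Sketch of a typical Kafka -> Flink -> Kafka pipeline.
// Assumes the flink-streaming and flink-connector-kafka dependencies
// are on the classpath; topic names and addresses are placeholders.
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;

public class KafkaFlinkPipeline {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "flink-demo");

        // Consume the raw event stream pushed to Kafka.
        DataStream<String> events = env.addSource(
                new FlinkKafkaConsumer<>("events", new SimpleStringSchema(), props));

        // A simple transformation; real jobs might aggregate data in
        // windows or implement CEP logic here instead.
        DataStream<String> results = events.map(String::toUpperCase);

        // Feed the results back to Kafka for consumption by other services.
        results.addSink(new FlinkKafkaProducer<>(
                "localhost:9092", "results", new SimpleStringSchema()));

        env.execute("Kafka to Kafka pipeline");
    }
}
```

The same `results` stream could just as well be written to HDFS or Elasticsearch by swapping the sink, which is what makes this pattern a common backbone for both data movement and analytics jobs.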

Read more