A Few of Our Favorite Insights from Flink Forward 2016

The data Artisans team was thoroughly impressed by this year’s Flink Forward speaker sessions, which delivered a wealth of detail on Apache Flink® use cases and benchmarks. Here, we’ll share just a small selection of our favorite insights from the presentations.

And remember, all speaker session recordings and slides are available on the Flink Forward website.

  1. Bouygues Telecom, one of the largest telecom networks in France, is running 30 production applications powered by Flink and is processing 10 billion raw events per day. As of Flink Forward 2015, they were live with 5 Flink applications, so we’re looking forward to hearing about their 180 Flink applications in 2017. (All Slides, Talk)

Read more

September 2016 in Review: a fantastic Flink Forward, dA Platform debut, and Strata + Hadoop World NYC

Berlin’s surprise 32° C September weather (90° F for those of you Stateside) has come and gone, and there was lots happening in the last few weeks of summer. Here are a few of the highlights.

Apache Flink® to the enterprise

In order to make Flink more accessible to organizations seeking enterprise support, data Artisans announced the dA Platform, a data Artisans-certified distribution of Flink bundled with 24x7x365 support. Get in touch with us if you’d like to learn more.

And we were thrilled to see that Lightbend included Flink in its Fast Data Platform. September was a month of great progress in growing the Flink community and broadening the user base.

Read more

data Artisans at Strata + Hadoop World NYC 2016

From September 26-29, 2016, the big data community will meet at Strata + Hadoop World in NYC. This year, data Artisans will take part in the conference in several ways. For the first time, we will have a booth (#P2), where we will be demonstrating Apache Flink® and our brand new dA Platform. Stop by to connect with Apache Flink experts and learn more about implementing enterprise-grade streaming data applications in production.

Read more

Announcing the dA Platform, our distribution of Apache® Flink®

A team of original Apache Flink® contributors founded data Artisans in 2014 because we believed that existing data processing frameworks weren’t adequately addressing the needs of organizations and their engineering teams. From the global saturation of smartphones, to the rapid adoption of the Internet of Things and connected devices, the very nature of data and how it is generated had evolved far more quickly than the tools available to manage that data.

Read more

August 2016 in Review: Apache Flink® 1.1, Flink Forward announcements, and more

While most of the Continent was away on holiday, it was a productive August for data Artisans and for the Apache Flink® community. Here are highlights from the past month, and we can’t wait to see what the rest of 2016 has in store.

Apache Flink 1.1

There were many long-awaited features included in the Flink community’s 1.1 release, which was supported by 95 contributors. If you haven’t already, we recommend that you browse the release notes. Here are a few of the highlights:

Read more

data Artisans Globetrotters: September 2016 Edition

Say hello to members of our team in India, Germany, and the USA

Here’s a quick rundown of where you can find members of the data Artisans team in September 2016. If you want to see where we’ll be traveling throughout the rest of the year, check out our Events page.

We hope to get to meet many members of the Apache Flink® and stream processing communities in person this month.

  • VLDB (New Delhi, India), Sept 5-9: data Artisans software engineer Kostas Kloudas will be presenting an Apache Flink® training. Attendees will have a chance to get hands on with Flink during the session.
  • Strata + Hadoop World (New York City, USA), Sept 26-29: Kostas Tzoumas, Director of Applications Engineering Jamie Grier, and Product Manager Mike Winters will attend this year’s Strata + Hadoop World NYC. More to come about where you can find us inside the conference.

Apache Flink and Apache Kafka Streams

A comparison and guideline for users

This blog post is written jointly by Stephan Ewen, CTO of data Artisans, and Neha Narkhede, CTO of Confluent. You can also find this post at the Confluent blog.

The open source stream processing space is currently exploding, with more systems becoming available and presenting users with many alternatives. In the Apache Software Foundation alone, there are now more than 10 stream processing projects, some in incubation and others graduated to top-level project status.

While the availability of alternatives benefits the industry and the users of these systems by enabling competition and thus encouraging innovation, it can also be quite confusing: with all these options, which one is right for me both now and in the future? Stream processors can be evaluated on several dimensions, including performance (throughput and latency), integration with other systems, ease of use, fault tolerance guarantees, etc., but making such a comparison is not the topic of this post (and we are certainly biased).

For some time now, the Apache Kafka project has served as a common denominator in most open source stream processors as the de-facto storage layer for storing and moving potentially large volumes of data in streaming fashion with low latency. Recently, the Kafka community introduced Kafka Streams, a stream processing library that ships as part of Apache Kafka. With the addition of Kafka Streams and Kafka Connect, Kafka has now added significant stream processing capabilities.

In this post, we discuss how Flink and Kafka Streams compare with each other on stream processing, and we attempt to provide clarity on that question. Flink and Kafka Streams were created with different use cases in mind. While they have some overlap in their applicability, they are designed to solve orthogonal problems and have very different sweet spots and placement in the data infrastructure stack.

Read more

Flink Forward 2016: Announcing keynotes and panel discussion

We are very excited to announce Ted Dunning as a keynote speaker for Flink Forward 2016! Ted is the VP of Incubator at the Apache Software Foundation, the Chief Application Architect at MapR Technologies, and a mentor on many recent projects. His keynote, “How Can We Take Flink Forward?”, will be presented on the second day of the conference.

Following Ted’s keynote, we’ll present a panel discussion on “Large Scale Streaming in Production”. As stream processing systems become more mainstream, companies are looking to empower their users to take advantage of this technology. We welcome leading stream processing experts Xiaowei Jiang (Alibaba), Monal Daxini (Netflix, Inc.), Maxim Fateev (Uber) and Ted Dunning (MapR Technologies) on stage to talk about the challenges they have faced and the solutions they have discovered while implementing stream processing systems at very large scales. The panel will be moderated by Jamie Grier (data Artisans).

The welcome keynote on Monday, September 12, will be given by data Artisans’ co-founders Kostas Tzoumas and Stephan Ewen. They will talk about “The maturing data streaming ecosystem and Apache Flink’s accelerated growth”. In this talk, Kostas and Stephan will discuss several large-scale stream processing use cases that the data Artisans team has seen over the past year.

Moreover, we are looking forward to Maxim Fateev’s talk “Beyond the Watermark: On-Demand Backfilling in Flink”. Flink’s time-progress model is built around a single watermark, which is incompatible with Uber’s business need for generating aggregates retroactively. Maxim’s talk covers Uber’s solution for on-demand backfilling.
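For readers less familiar with watermarks, here is a rough sketch of why a single forward-moving watermark and retroactive aggregates pull in different directions. It is a toy model in plain Python, not Flink's API and not Uber's solution: the class `ToyWindowOperator`, its methods, and the integer timestamps are all hypothetical names chosen for illustration. Once the watermark passes the end of a window, that window's count is treated as final, so an event that arrives afterwards for the same window can no longer be added to it.

```python
from collections import defaultdict


class ToyWindowOperator:
    """Toy tumbling-window counter driven by one watermark (illustrative only)."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.watermark = float("-inf")
        self.open_windows = defaultdict(int)  # window start -> running count
        self.emitted = {}                     # window start -> final count
        self.dropped = []                     # timestamps that arrived too late

    def on_event(self, timestamp):
        # Assign the event to the tumbling window that contains its timestamp.
        start = timestamp - timestamp % self.window_size
        if start + self.window_size <= self.watermark:
            # The window was already finalized, so this event cannot be counted.
            self.dropped.append(timestamp)
            return
        self.open_windows[start] += 1

    def on_watermark(self, watermark):
        # The watermark only ever moves forward.
        self.watermark = max(self.watermark, watermark)
        # Finalize every window whose end is at or before the watermark.
        for start in sorted(self.open_windows):
            if start + self.window_size <= self.watermark:
                self.emitted[start] = self.open_windows.pop(start)


op = ToyWindowOperator(window_size=10)
for ts in (1, 4, 12):
    op.on_event(ts)
op.on_watermark(10)  # finalizes the window covering timestamps 0 to 9
op.on_event(7)       # belongs to the finalized window, so it is dropped
```

In this toy model, recomputing the first window with the late event would require moving time backwards, which is exactly what a single watermark does not allow.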

Don’t miss the latest developments, best practices and use cases on Apache Flink. Register here: flink-forward.org/registration

Robust Stream Processing with Apache Flink®: A Simple Walkthrough

Jamie Grier, Director of Applications Engineering at data Artisans, gave an in-depth Apache Flink® demonstration at OSCON 2016 in Austin, TX. A recording is available on YouTube if you’d like to see the complete demo.

For our readers out there who are new to Apache Flink®, it’s worth repeating a simple yet powerful point: Flink enables stateful stream processing with production-grade reliability and accuracy guarantees. Streaming applications should no longer be synonymous with estimates, that is, with imprecise systems that must be coupled with a batch processor to ensure reliability. Instead, they should deliver robust and correct computations in real time.

Read more