Apache Flink® User Survey 2016 Results, Part 2
Posted on Jan 24th, 2017 by Michael Winters
Last week, we published the first of two blog posts recapping the results of the 2016 Apache Flink® user survey. In part 1, we shared a selection of graphs summarizing responses to the survey’s multiple choice questions. In part 2, we’ll look at responses to the survey’s open-ended questions:
- What new features or functionality would you like to see in Flink?
- Please briefly describe the application(s) your team is building or plans to build with Flink.
- Are there any other challenges (when working with Flink) not listed (in the previous question) that you’d like to mention?
- What other sources / sinks not included in the list (that was provided in the survey), if any, are important for your Flink application?
- We welcome any final comments about any aspect of Flink.
Some background: one of the survey questions asked respondents whether it was okay to share anonymous, unattributed responses or quotes in the post-survey recap, and 67 of 119 respondents gave permission to share their responses.
The answers in this post therefore come from more than half, but not all, of the survey-takers. We won't be including any company or personal information in these responses; we removed anything from the open-ended answers that we thought might reveal who provided it.
In this post, we'll include a subset of open-ended responses that are indicative of patterns or common themes. If you want to see all responses from respondents who gave their permission to share, we've created a GitHub repo that contains all survey-related materials.
In some cases, we edited responses for readability but otherwise left them untouched, so what you're seeing is raw feedback from Flink users.
What new features or functionality would you like to see in Flink?
- "Run distributed jobs without a master, so you can deploy to an auto-scaling fleet (like Kubernetes) based on current load."
- "The ability to manage large (>1 TB) state is most important to me. I think incremental checkpointing and recovery would make this feasible."
- "Dynamic scaling on YARN, exactly-once Kafka output"
- "Full support for Mesos (in general, we would like to see a more flexible scheduling framework), a generic mechanism to serialize data from the job manager to task managers, support for Scala case classes in keyBy stream partitioning, a built-in unit testing framework, dynamic auto-scaling based on workload criteria, and upgrades to jobs with no downtime (rolling or red-blue)"
- "Dynamic scaling, Mesos support, a Cassandra state backend, and also more user-friendly support for the Scala API."
- "Side inputs, improved iteration support for streaming, deterministic sampling, static vs. dynamic paths in iterations (FLINK-2396)"
- "Support for more data formats as keys and for aggregations; improvement of the documentation, i.e. indexing and self-defined aggregation functions"
- "Side inputs; full SQL / Calcite support on streams, full DataStream + data table mix, keyed local state available across task boundaries"
- "Better ML support"
- "Incremental checkpoints, automatic savepoints"
- "Not only incremental checkpointing but also incremental restore; an API to query state"
- "A training/certification ecosystem is missing"
- "Programmatic creation and management of Flink jobs."
Please briefly describe the application(s) your team is building or plans to build with Flink.
- "We've built a platform that allows our data scientists to create and run Flink jobs without having to worry about infrastructure, operations, or management."
- "We use Flink in our monitoring infrastructure."
- "An alerting system that allows for dynamic rules. This solution must allow users to create new rules or change existing ones to be matched against incoming events. If there is a match, a new event will be generated as an alert to be inspected by the analytics team. Changes to the rules must take effect without losing availability or events."
- "A content analysis platform for ingestion and feature extraction of various types of content, using ML models, content extraction, NLP, and other techniques."
- "A near-real-time stream-based crawler"
- "An anomaly detection engine on live events"
- "Simple streaming applications to transport data from Kafka to different data sinks (HDFS, Elasticsearch); complex batch applications (50 or more operators)"
- "Real-time predictions mixing streaming and historical data"
- "Time series analysis for anomaly detection and advanced data pipelines"
- "A time series data cache: use Flink to cache time series data until we have a complete 'block' that can be sent to live in a permanent key-value store"
Are there any other challenges (with Flink) not listed above that you’d like to mention?
- "Centralized error management and job scheduling/monitoring/collecting data from an external program"
- "Performance. It seems like operations like group by and sort could be significantly faster. For example, with Supersonic in C++ you can group by 10 columns in 1/10 of the time it takes in Flink."
- "Deployment of clusters for development and daily jobs"
- "Getting all the logs from all the TaskManagers in one place (job by job, not all at once as in YARN)."
- "The Scala API for streaming feels very raw in comparison with e.g. the Spark Scala API. The same basic operations in Flink have awkward caveats, or sometimes are only implemented in the Java API. Kryo 2 in the serialization layer is old, and unfortunately hard to work around if you're depending on a different version."
- "Dynamically applying rules to incoming events represents a complex challenge. There is already a solution proposed by the video game company King. The problem with King's solution is that we lose some flexibility and need to use an external scripting language to dynamically execute the rule conditions instead of using Flink's constructs."
- "More commercial-strength examples for large-scale deployment, operationalizing, monitoring, etc."
- "I feel Flink deployment (non-Docker based) can be made better, especially in cloud-based environments."
- "A good unit testing framework"
- "Kerberos security with long-running applications on YARN"
- "Learning it. But I think the tech notes (on the data Artisans blog) and the manual are OK."
What other sources / sinks not included in the list above, if any, are important for your Flink application?
- "HBase"
- "Hive"
- "Elasticsearch"
- "MapRFS"
- "MySQL"
- "Neo4j"
- "Apache Druid"
- "Oracle database"
- "RabbitMQ"
We welcome any final comments about any aspect of Flink.
- "You're doing a great job, but now it's time to take another step ahead (in terms of the number of Flink developers and support/responsiveness, from consultancy to pull requests/ticketing-housekeeping) because the community is becoming larger and larger."
- "I like the overall concepts a lot, especially around state management and checkpointing. I like the actual API in Scala a little less, and the implementation behind it even less (for my particular use case), but I have high hopes the latter two parts will get better over time."
- "Don't forget that Flink's still a very good batch processor ;)"
- "I am pretty excited about the power and simplicity of Apache Flink. It allows us to get use cases up and running with only a few operational tasks."
- "By far the best community. I hope I don't skew the results with my wishlist ;)"
- "What I like most about Flink is that it makes it very easy to solve different kinds of problems in such a simple and intuitive way."
- "More demo and training materials, please!"
- "Love the vibrance and excitement of the Flink community."
- "I am looking forward to continuing to work with Apache Flink. I am hoping that the major Hadoop distributions (Hortonworks / Cloudera) will add official support in their product offerings. I am looking forward to additional Flink momentum in the USA."
- "I'm thrilled to be using Flink and excited about its future."
That concludes our review of the 2016 Apache Flink user survey. We hope that we were able to provide the Flink community with a broader view of Flink usage and to help identify potential areas for improvement in the coming months. The community welcomes new contributors, and if you’re interested, you can learn more here.