What’s new in the latest Apache Flink 1.7.0 release
This post originally appeared on the Apache Flink blog. It was reproduced here under the Apache License, Version 2.0.
The Apache Flink community is pleased to announce Apache Flink 1.7.0. The latest release includes more than 420 resolved issues and some exciting additions to Flink that we describe in the following sections of this post. Please check the complete changelog for more details.
Flink 1.7.0 is API-compatible with previous 1.x.y releases for APIs annotated with the
@Public annotation. The release is available now and we encourage everyone to download it and check out the updated documentation. Feedback through the Flink mailing lists or JIRA is, as always, very much appreciated! You can find the binaries on the updated Downloads page on the Flink project site.
Flink 1.7.0 – Extending the reach of Stream Processing
In Flink 1.7.0 we come closer to our goals of enabling fast data processing and building data-intensive applications for the Flink community in a seamless way.
Our latest release includes some exciting new features and improvements such as support for Scala 2.12, an exactly-once S3 file sink, the integration of complex event processing with streaming SQL and more features that we explain below.
Scala 2.12 Support in Apache Flink (FLINK-7811)
Apache Flink 1.7.0 is the first release which comes with full support for Scala 2.12. This allows users to write Flink applications with a newer Scala version and to leverage the Scala 2.12 ecosystem.
State Evolution (FLINK-9376)
In many cases, a long-running Flink application needs to evolve during its lifetime because of changing requirements.
Changing the user state without losing the current application progress in the form of its state is a crucial requirement for application evolution.
With Flink 1.7.0, the community added state evolution which allows you to flexibly adapt a long-running application’s user states schema, while maintaining compatibility with previous savepoints.
With state evolution it is possible to add or remove columns to your state schema in order to change which business features will be captured by your application after it has been deployed.
State schema evolution now works out-of-the-box when using Avro’s generated classes as user state, meaning that the schema of the state can be evolved according to Avro’s specifications.
While Avro types are the only built-in type that supports schema evolution as of Flink 1.7, the community continues working to further extend support to other types in future Flink releases.
Exactly-once S3 StreamingFileSink (FLINK-9752)
The StreamingFileSink which was introduced in Flink 1.6.0 is now extended to also support writing to S3 filesystems with exactly-once processing guarantees. Using this feature allows all S3 users to build exactly-once end-to-end pipelines writing to S3.
MATCH_RECOGNIZE Support in Streaming SQL (FLINK-6935)
This is a major addition to Apache Flink 1.7.0 that provides initial support of the MATCH_RECOGNIZE standard to Flink SQL. This feature combines both complex event processing (CEP) and SQL for easy pattern matching on data streams and, thus, enabling a whole set of new use cases.
This feature is currently in beta phase so we welcome any feedback and suggestions from the community for future iterations and improvements.
Temporal Tables and Temporal Joins in Streaming SQL (FLINK-9712)
Temporal Tables is a new concept in Apache Flink that gives a (parameterized) view on a table’s changing history and returns the content of a table at a specific point in time.
As an example, we can use a table with historical currency exchange rates. Such a table is constantly growing/evolving as time progresses and newly updated exchange rates are added. Temporal Table is a view that can return the actual state of those exchange rates to any given point of time. With such a table it is possible to convert a stream of orders in different currencies to a common currency using the correct exchange rate.
Temporal Joins allow for memory and computational-efficient joins of Streaming data with an ever-changing/updating table, using either processing time or event time, while being ANSI SQL compliant.
Miscellaneous Features for Streaming SQL
Besides the major features mentioned above, Flink’s Table & SQL API has been extended to serve more use cases.
The following built-in functions were added to the APIs:
The SQL Client now supports the definition of views both in an environment file and within a CLI session. Furthermore, basic SQL statement auto-completion has been added to the CLI.
The community added an Elasticsearch 6 table sink which allows storing updated results of a dynamic table.
Versioned REST API (FLINK-7551)
Beginning with Flink 1.7.0, the REST API is versioned. This guarantees the stability of Flink’s REST API so that third-party applications can be developed against a stable API in Flink. Thus, future Flink upgrades will not require changes to existing third-party integrations.
Kafka 2.0 Connector (FLINK-10598)
Apache Flink 1.7.0 continues to add more connectors, making it even easier to interact with more external systems. In this release, the community added the Kafka 2.0 connector which allows to read from and write to Kafka 2.0 with exactly-once guarantees.
Local Recovery (FLINK-9635)
Apache Flink 1.7.0 completes the local recovery feature by extending Flink’s scheduling to take previous deployment locations into account in case of recovery.
If local recovery is enabled Flink will keep a local copy of the latest checkpoint on the machine where the task is running. By scheduling tasks to their previous locations, Flink will, thus, minimize the network traffic for restoring state by reading checkpoint state from local disk. This feature considerably improves recovery speed.
Removal of Flink’s Legacy Mode (FLINK-10392)
Apache Flink 1.7.0 marks the release where the Flip-6 effort has been fully completed and reached feature parity with the legacy mode. Consequently, this release removes support for the legacy mode.
We would like to acknowledge all community members for contributing patches to this release. Special credits go to the following members for contributing to the 1.7.0 release (according to git):
Aitozi, Alex Arkhipov, Alexander Koltsov, Alexey Trenikhin, Alice, Alice Yan, Aljoscha Krettek, Andrei Poluliakh, Andrey Zagrebin, Ashwin Sinha, Barisa Obradovic, Ben La Monica, Benoit Meriaux, Bowen Li, Chesnay Schepler, Christophe Jolif, Congxian Qiu, Craig Foster, David Anderson, Dawid Wysakowicz, Dian Fu, Diego Carvallo, Dimitris Palyvos, Eugen Yushin, Fabian Hueske, Florian Schmidt, Gary Yao, Guibo Pan, Hequn Cheng, Hiroaki Yoshida, Igal Shilman, JIN SUN, Jamie Grier, Jayant Ameta, Jeff Zhang, Jeffrey Chung, Jicaar, Jin Sun, Joe Malt, Johannes Dillmann, Jun Zhang, Kostas Kloudas, Krzysztof Białek, Lakshmi Gururaja Rao, Liu Biao, Mahesh Senniappan, Manuel Hoffmann, Mark Cho, Max Feng, Mike Pedersen, Mododo, Nico Kruber, Oleksandr Nitavskyi, Osman Şamil AKÇELİK, Patrick Lucas, Paul Lam, Piotr Nowojski, Rick Hofstede, Rong R, Rong Rong, Sayat Satybaldiyev, Sebastian Klemke, Seth Wiesman, Shimin Yang, Shuyi Chen, Stefan Richter, Stephan Ewen, Stephen Jason, Thomas Weise, Till Rohrmann, Timo Walther, Tzu-Li “tison” Chen, Tzu-Li (Gordon) Tai, Tzu-Li Chen, Wosin, Xingcan Cui, Xpray, Xue Yu, Yangze Guo, Ying Xu, Yun Tang, Zhijiang, blues Zheng, hequn8128, ifndef-SleePy, jerryjzhang, jrthe42, jyc.jia, kkolman, lihongli, linjun, linzhaoming, liurenjie1024, liuxianjiao, lrl, lsy, lzqdename, maqingxiang, maqingxiang-it, minwenjun, shuai-xu, sihuazhou, snuyanzin, wind, xuewei.linxuewei, xueyu, xuqianjin, yanghua, yangshimin, zhijiang, 谢磊, 陈梓立