How Kcell and GetInData built a stream analytics platform using Apache Flink® as a core component.
Telecommunications companies such as Kcell have vast amounts of data at their disposal. Whether the data arrives from billing, data usage, roaming, or location events, the ability to access and react instantly to these streams is paramount for modern enterprises.
GetInData worked with Kcell on developing an innovative real-time platform for complex event processing that enables the telecommunications provider to build different applications on top of it.
Example use cases of this new real-time platform:
User experience is one of the most important service elements of Kcell’s offering. Imagine a situation where a subscriber tops up his or her balance many times in a short period. This might be due, for example, to intensive internet consumption by a user who is not subscribed to any special tariff. The platform can recognize this type of subscriber behavior and send a text message with a tailored offer. With real-time analytics, Kcell can do this instantly, while the subscriber is still going through the balance top-up process, which leads to a higher chance of a successful message response and increased subscriber satisfaction.
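To make the top-up scenario concrete, here is a minimal plain-Python sketch of the detection rule. It is not Flink code and not Kcell's implementation. The function name, the `(subscriber_id, timestamp)` event shape, the threshold of three top-ups, and the ten-minute window are all assumptions chosen for illustration.

```python
from collections import defaultdict, deque


def detect_frequent_topups(events, threshold=3, window_seconds=600):
    """Yield (subscriber_id, timestamp) when a subscriber reaches `threshold`
    top-ups within `window_seconds`. All names and values are hypothetical."""
    recent = defaultdict(deque)  # subscriber_id -> timestamps of recent top-ups
    for subscriber_id, ts in events:  # events are assumed to be ordered by time
        window = recent[subscriber_id]
        window.append(ts)
        # Drop top-ups that have fallen out of the sliding window.
        while window and ts - window[0] > window_seconds:
            window.popleft()
        if len(window) >= threshold:
            yield subscriber_id, ts
            window.clear()  # avoid sending the same offer repeatedly


# Hypothetical stream of top-up events: (subscriber_id, timestamp in seconds).
events = [("a", 0), ("b", 10), ("a", 120), ("a", 300), ("a", 4000), ("b", 5000)]
offers = list(detect_frequent_topups(events))
print(offers)
```

In a streaming engine, the per-subscriber deque would typically live in keyed state so that it survives failures. Here it is an in-memory dictionary to keep the sketch short.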
Fraud detection while roaming
Detecting fraudulent actions and responding to them in real time can prevent significant losses for a telecommunications service provider and provide a better user experience. Let’s imagine a situation where one of Kcell’s subscribers registers to a roaming service while his or her balance is zero, a scenario that is impossible in standard use cases. Using a real-time streaming platform allows Kcell to act on those scenarios by, for example, instantly sending a notification to the anti-fraud unit for follow-up.
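The roaming rule described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the production logic. The event dictionaries, the `balance_update` and `roaming_registration` type names, and the alert format are all hypothetical.

```python
def detect_roaming_fraud(events):
    """Return alerts for roaming registrations made with a zero balance.

    Event field names and type names are assumptions for this sketch.
    """
    balances = {}  # subscriber_id -> last known balance
    alerts = []
    for event in events:
        subscriber = event["subscriber_id"]
        if event["type"] == "balance_update":
            balances[subscriber] = event["balance"]
        elif event["type"] == "roaming_registration":
            # Only a known balance of exactly zero is flagged; subscribers
            # with no recorded balance are ignored in this sketch.
            if balances.get(subscriber) == 0:
                alerts.append({"subscriber_id": subscriber, "reason": "roaming_with_zero_balance"})
    return alerts


# Hypothetical events combining two sources: billing and roaming.
sample_events = [
    {"type": "balance_update", "subscriber_id": "s1", "balance": 0},
    {"type": "balance_update", "subscriber_id": "s2", "balance": 500},
    {"type": "roaming_registration", "subscriber_id": "s1"},
    {"type": "roaming_registration", "subscriber_id": "s2"},
    {"type": "roaming_registration", "subscriber_id": "s3"},
]
fraud_alerts = detect_roaming_fraud(sample_events)
print(fraud_alerts)
```

The rule needs the last known balance at the moment the roaming event arrives, which is why state per subscriber matters for this kind of use case.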
The solution
GetInData and Kcell worked collaboratively on deploying a new solution using complex event processing with Apache Flink® as its core event processing engine.
Apache Flink was the best fit due to the following features of the framework:
- Flexible event time support
- Fault-tolerant handling of state, thanks to the algorithm of consistent distributed snapshots
- Large state sizes because the state can be stored out of the JVM, for example in a RocksDB state backend
- High-throughput & low-latency capabilities that match the scale of Kcell’s problem
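As an illustration of the first feature, event time support, the following plain-Python sketch counts events in tumbling event-time windows while tolerating out-of-order arrival. It is a simplified model of the idea, not Flink's API or its exact watermark semantics. The function name, the 60-second window, and the 10-second out-of-orderness bound are assumptions.

```python
from collections import defaultdict


def tumbling_event_time_counts(event_times, window_size=60, max_out_of_orderness=10):
    """Count events per event-time window. A window is emitted once the
    watermark (largest event time seen minus the allowed lateness) passes
    the window end. All parameter values are hypothetical."""
    open_windows = defaultdict(int)  # window start -> event count
    results = []
    watermark = float("-inf")
    for event_time in event_times:  # arrival order may differ from event-time order
        window_start = event_time - event_time % window_size
        if window_start + window_size <= watermark:
            continue  # too late: the window was already emitted, so drop the event
        open_windows[window_start] += 1
        watermark = max(watermark, event_time - max_out_of_orderness)
        # Emit every window whose end is at or before the watermark.
        for start in sorted(w for w in open_windows if w + window_size <= watermark):
            results.append((start, open_windows.pop(start)))
    # End of input: flush the windows that are still open.
    for start in sorted(open_windows):
        results.append((start, open_windows[start]))
    return results


# Event times in seconds, listed in arrival order (50 and 30 arrive out of order).
window_counts = tumbling_event_time_counts([5, 20, 65, 50, 75, 30, 130])
print(window_counts)
```

In this run, the event at time 50 arrives late but is still counted in the first window, while the event at time 30 arrives after that window has been emitted and is dropped.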
Additional components of the solution include monitoring and logging for performance measurement of the stream analytics platform and debugging of any potential issues.
Solution implementation and technical requirements
Before starting the deployment, extensive planning and prioritization took place to ensure the system would be successful and provide clear ROI to Kcell.
A few requirements that were taken into account in the approach were the following:
- Reliable and 24/7 stream processing
Another crucial element in the Kcell case was the variety of data the company holds.
Ingesting data from different sources, and in different formats, such as billing, data usage, roaming or location events makes it necessary for the streaming platform to be reliable and flexible, allowing an appropriate response to different use cases and types of applications.
- Accuracy, flexibility, and usability
In addition, the system should be flexible, allowing for separate layers of incoming and outgoing data pipelines with elasticity in mind.
Finally, the system needs to be user-friendly and easy to configure for Kcell’s engineers, who have various levels of programming proficiency.
- Constant end-to-end testing and monitoring
On top of that, the team built a monitoring system including metrics and logging. Metrics are collected into InfluxDB, a dedicated database for time series data. Grafana is used as a metrics visualization and alerting tool. In a similar manner, logs are collected from all components using the widely known Elasticsearch + Logstash + Kibana stack for debugging purposes.
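For readers unfamiliar with how a metric reaches a time series database, the sketch below formats one data point in InfluxDB's text-based line protocol (measurement, tags, fields, timestamp). It is an assumption-laden illustration, not the team's reporter. The measurement, tag, and field names are hypothetical, all field values are written as floats, and escaping of spaces and commas is left out.

```python
def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one metric point as an InfluxDB line protocol string.

    Simplified sketch: assumes names and tag values contain no spaces,
    commas, or equals signs, and writes every field value as a float.
    """
    tag_part = "".join(f",{key}={value}" for key, value in sorted(tags.items()))
    field_part = ",".join(f"{key}={float(value)}" for key, value in sorted(fields.items()))
    return f"{measurement}{tag_part} {field_part} {timestamp_ns}"


# Hypothetical throughput metric for a streaming job.
line = to_line_protocol(
    "stream_job",
    {"job": "topup_offers"},
    {"records_per_second": 1250},
    1500000000000000000,
)
print(line)
```

A dashboard tool such as Grafana can then query points like this one by measurement and tag to plot throughput over time.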
To learn more about the implementation of continuous testing and metrics and logging systems, click here.
To find out more about the challenges the team was able to overcome throughout the implementation process, read more here.
Founded in 2014 by former Spotify data engineers, GetInData is a Warsaw-based company that helps other organizations implement their Big Data projects by providing outsourcing, consulting and training services. GetInData consists of a team of passionate veterans in Big Data and Fast Data Analytics. The company has multi-year experience in implementing Big Data projects with various customers ranging from fast-growing startups to global corporations from the telecommunications, banking, retail, media, and healthcare sectors.
GetInData’s people strive for continuous improvement and challenge the status quo in data analytics technologies to help customers achieve true ROI from their data processing.
Kcell is part of the largest Scandinavian telecommunications holding – TeliaCompany (www.teliacomany.com). The company was founded in 1998 and is now a market leader in Kazakhstan, providing cellular services under the brand names Kcell, which mainly focuses on the B2B segment, and active, which focuses on the B2C segment. As of December 31st, 2016, Kcell’s subscriber base amounted to almost 10 million people.
This article was initially posted on GetInData’s blog here. Special credits go to the project team at Kcell and GetInData for allowing us to use the story on the data Artisans blog and to Alexey Brodovshuk, Senior Software Engineer and Technical Lead at Kcell.
Apache Flink, Flink®, Apache®, the squirrel logo, and the Apache feather logo are either registered trademarks or trademarks of The Apache Software Foundation.