Use Case Track

Upshot: distributed tracing using Flink

Distributed tracing is used to analyze performance and error cases in service-oriented architectures. The Observability team at Airbnb recently created Upshot, a data pipeline that uses Flink to analyze over 40 million trace events per minute. Summaries of the resulting data are sent to Druid, Datadog, and other downstream datastores. This talk will focus on how we use Flink and on how we analyzed and addressed the scaling issues we encountered while building Upshot.

Authors

Brian Wolfe
Airbnb

Brian Wolfe is a software engineer on the Observability team at Airbnb. While working on Observability, he helped create several streaming pipelines to perform custom processing and deep introspection of performance and errors in production infrastructure. This work has saved hundreds of hours of engineering time spent diagnosing and remediating issues in production. Before working at Airbnb, he developed software for home robotics and algorithms for designing biomolecules.