Mastering real-time anomaly detection with open source tools
Talk (60 min)
This talk shows how to build a real-time anomaly detection pipeline from open-source parts: Apache Kafka for streaming data, Apache Flink for processing it in real time, and ARIMA models for detecting unusual patterns. The same design applies whether you're spotting fraud, monitoring systems, or tracking IoT devices.
We'll start by exploring how Kafka helps collect and manage fast-moving data streams. Then, we'll demonstrate how Flink processes this data in real time and integrates ARIMA-based anomaly detection to uncover events as they occur. We'll dive into the details of how ARIMA works: how differencing handles trends, how its seasonal extension (SARIMA) handles seasonality, and how the residuals left over after forecasting are used to identify outliers in time-series data.
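As a rough illustration of the residual-based idea, here is a minimal Python sketch. It is not the implementation used in the talk. It assumes the simplest useful case, ARIMA(1,1,0): difference the series once to remove the trend, fit an AR(1) model to the differences by least squares, and flag points whose one-step forecast residual lies far from the median. The function names, the threshold k, and the synthetic series are hypothetical, chosen only for this example; a production pipeline would more likely call a library ARIMA implementation from inside a Flink operator.

```python
import math
from statistics import median


def fit_ar1(x):
    """Least-squares fit of x[t] = c + phi * x[t-1] + e[t]."""
    prev, curr = x[:-1], x[1:]
    mean_prev = sum(prev) / len(prev)
    mean_curr = sum(curr) / len(curr)
    var = sum((p - mean_prev) ** 2 for p in prev)
    cov = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    phi = cov / var if var else 0.0
    c = mean_curr - phi * mean_prev
    return c, phi


def arima110_anomalies(series, k=6.0):
    """Return indices of `series` whose ARIMA(1,1,0) residual is an outlier.

    k is a hypothetical threshold, in robust standard deviations.
    """
    # The "I" step: first differences remove a linear trend.
    diffs = [b - a for a, b in zip(series, series[1:])]
    # The "AR" step: model each difference from the previous one.
    c, phi = fit_ar1(diffs)
    residuals = [diffs[j] - (c + phi * diffs[j - 1]) for j in range(1, len(diffs))]
    # Robust scale (median absolute deviation), so that the anomalies
    # themselves do not inflate the threshold.
    med = median(residuals)
    mad = median(abs(r - med) for r in residuals)
    scale = 1.4826 * mad
    if scale == 0:
        return []
    # residuals[i] belongs to diffs[i + 1], i.e. to series[i + 2].
    return [i + 2 for i, r in enumerate(residuals) if abs(r - med) > k * scale]


# Synthetic data: a linear trend plus a small deterministic wiggle
# standing in for noise, with one injected spike at index 60.
series = [0.2 * t + 0.5 * math.sin(0.9 * t) for t in range(120)]
series[60] += 25.0

flagged = arima110_anomalies(series)
print(flagged)
```

Because differencing turns a single spike into a jump up followed by a jump down, the one or two points right after the spike are usually flagged as well. In a streaming setting, the same residual check would be applied per event against a model fitted on a recent window.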
We'll also show how Apache Iceberg can be used to store historical data efficiently, enabling retrospective analysis, ongoing model evaluation, and performance improvements over time.
By combining real-time detection with long-term storage, you can build a robust system that evolves as your data grows. This talk includes clear examples and practical steps to help you build your own pipeline. It's ideal for anyone looking to use open-source tools to monitor and react to issues in real-time data streams.