It’s just two more weeks until Current ’24! Organized by the fine folks over at Confluent, it’s the event of the year in the data streaming and processing space. This year, Current is back in Austin, TX, on September 17-18.
As always, the agenda looks great, with two days fully packed with talks around real-time and event-driven architectures, Apache Kafka and Flink, change data capture, streaming analytics, and much more. It’s going to be really hard to choose which sessions to attend, but below are a few that piqued my interest and that I’ll definitely try to attend.
Are you going to be at Current too? If so, let’s meet and chat about real-time data streaming, stream processing, and everything in between! If you don’t find me at the Decodable booth, you can catch me at my talk (details below), in Breakout Room 6 where I’m going to be a room host on Wednesday morning, or roaming the conference halls in between.
Atomic Dual-write Recipes with Kafka Two Phase Commit (KIP-939)
Wed Sep 18, 4:00 PM - 4:45 PM
It’s a common requirement for applications to update state within their own (typically relational) database while also sending a message to Kafka, for instance to notify other services about that data change. But without transactions spanning the database and Kafka, doing so is prone to inconsistencies and should be avoided (remember, friends don’t let friends do dual writes). KIP-939 promises to change this, enabling Kafka to participate in distributed two-phase transactions. This is gonna be a game changer for Kafka, and I can’t wait to learn more about the details in this session by Confluent’s Artem Livshits.
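To make the dual-write problem concrete, here is a minimal sketch in Python. It only illustrates the failure mode, not KIP-939 itself: `FlakyProducer`, `place_order`, and the `orders` table are hypothetical names made up for this example, and the producer is a stand-in that fails on demand rather than a real Kafka client.

```python
import sqlite3


class FlakyProducer:
    """Hypothetical stand-in for a Kafka producer; fails on demand."""

    def __init__(self, fail=False):
        self.fail = fail
        self.sent = []

    def send(self, topic, value):
        if self.fail:
            raise ConnectionError("broker unavailable")
        self.sent.append((topic, value))


def place_order(conn, producer, order_id):
    """Naive dual write: commit to the database, then publish an event."""
    conn.execute("INSERT INTO orders (id) VALUES (?)", (order_id,))
    conn.commit()                      # write 1: the database row is now durable
    producer.send("orders", order_id)  # write 2: can fail independently of write 1


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
    producer = FlakyProducer(fail=True)
    try:
        place_order(conn, producer, 42)
    except ConnectionError:
        pass  # too late: the database commit cannot be rolled back any more
    rows = conn.execute("SELECT id FROM orders").fetchall()
    return rows, producer.sent


if __name__ == "__main__":
    rows, sent = demo()
    print("rows in database:", rows)
    print("messages sent:", sent)
```

The order ends up in the database, but no event is ever published, so downstream services never learn about it. Swapping the order of the two writes only flips the inconsistency around, which is why a transaction spanning both systems is needed.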
Streaming Queries Without Compromise
Wed Sep 18, 4:00 PM - 4:45 PM
One of the most exciting data research papers last year was “DBSP: Automatic Incremental View Maintenance for Rich Query Languages”, providing a fresh look at the problem of efficient incremental view maintenance (IVM) and winning the best research paper award at the 2023 International Conference on Very Large Data Bases (VLDB). This session by Mihai Budiu and Leonid Ryzhyk of Feldera discusses the core ideas behind DBSP and should be an absolute must-watch for everyone interested in IVM.
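If IVM is new to you, the basic idea can be sketched in a few lines of Python. This is a toy illustration and not the DBSP algorithm: it maintains a single `SUM ... GROUP BY` view from a stream of weighted changes (+1 for an insert, -1 for a delete), loosely in the spirit of the weighted collections DBSP builds on. `SumByKeyView` and `recompute` are names made up for this sketch.

```python
from collections import defaultdict


class SumByKeyView:
    """Incrementally maintained equivalent of
    SELECT key, SUM(amount) FROM t GROUP BY key."""

    def __init__(self):
        self.totals = defaultdict(int)  # key -> running sum
        self.counts = defaultdict(int)  # key -> number of live rows

    def apply(self, changes):
        """Apply (key, amount, weight) changes; weight is +1 (insert) or -1 (delete)."""
        for key, amount, weight in changes:
            self.totals[key] += weight * amount
            self.counts[key] += weight
            if self.counts[key] == 0:
                # No rows left for this key, so it drops out of the view.
                del self.totals[key]
                del self.counts[key]

    def snapshot(self):
        return dict(self.totals)


def recompute(rows):
    """Full recomputation over all (key, amount) rows, for comparison."""
    totals = defaultdict(int)
    for key, amount in rows:
        totals[key] += amount
    return dict(totals)


if __name__ == "__main__":
    view = SumByKeyView()
    view.apply([("a", 10, +1), ("b", 5, +1), ("a", 7, +1)])
    view.apply([("a", 10, -1), ("b", 5, -1)])
    print(view.snapshot())
    print(recompute([("a", 7)]))
```

Each batch of changes costs work proportional to the size of the batch rather than the size of the table, and the result matches what a full recomputation over the remaining rows would produce. Doing this automatically for arbitrary queries is the hard part.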
Optimizing Apache Kafka with GraalVM: Faster, Leaner, and Performant
Wed Sep 18, 3:00 PM - 3:45 PM
Who wouldn’t love for their Kafka brokers to start faster and consume less memory? Thanks to ahead-of-time compilation with GraalVM, this is no longer a dream, and it’s great to see that the Kafka project has picked up this technology, providing a container image with Kafka compiled into a native binary since version 3.8.0. A while ago, I took this for a spin, and Kafka would start up in 150 ms on my laptop. That’s great for integration testing, for instance! Join Krishna Agarwal and Vedarth Sharma of Confluent to learn all about this cool feature.
Flinking Enrichment: Shouldn't This Be Easier?
Wed Sep 18, 2:00 PM - 2:45 PM
Enriching events on a Kafka topic with contextual data is a very common use case for stream processing with Apache Kafka. There are multiple ways of doing so, including Flink’s DataStream API with AsyncIO, Flink SQL joins, and others. What are the pros and cons of the different approaches, when should you use which one, and which pitfalls and traps should you avoid? David Anderson of Confluent is going to answer all these questions, and surely more, in this talk which I think will be a great watch for all you Flink practitioners out there.
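For a rough idea of what asynchronous enrichment means, here is a small sketch using Python’s asyncio. It is not Flink’s AsyncIO API; it only mirrors the concept of issuing lookups concurrently with a bounded number of in-flight requests while keeping results in input order. `lookup_customer`, `enrich`, and the event fields are hypothetical, and the lookup simulates I/O with a short sleep.

```python
import asyncio


async def lookup_customer(customer_id):
    """Hypothetical lookup; a real one would query a database or a service."""
    await asyncio.sleep(0.01)  # simulate I/O latency
    tier = "gold" if customer_id % 2 == 0 else "basic"
    return {"customer_id": customer_id, "tier": tier}


async def enrich(events, max_in_flight=10):
    """Enrich events concurrently, with at most max_in_flight lookups at a time."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def enrich_one(event):
        async with semaphore:
            customer = await lookup_customer(event["customer_id"])
        return {**event, "tier": customer["tier"]}

    # gather() returns results in the order of its inputs,
    # regardless of which lookup completes first.
    return await asyncio.gather(*(enrich_one(e) for e in events))


def run_demo():
    events = [{"order_id": i, "customer_id": i} for i in range(4)]
    return asyncio.run(enrich(events))


if __name__ == "__main__":
    for enriched in run_demo():
        print(enriched)
```

Because the lookups overlap, total latency is closer to that of a single lookup than to the sum of all of them. Caching, timeouts, and retries are left out here, and those are exactly the kinds of trade-offs where the approaches differ.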
Sentiment Analysis in Action with Apache Flink: Building Your Real-time Pipeline
Tue Sep 17, 3:00 PM - 3:45 PM
Are we at peak AI yet? I don’t really know, but we’re certainly seeing more and more AI-related use cases around data streaming, be it feeding data from operational data stores to vector databases, building retrieval-augmented generation (RAG) architectures, or real-time sentiment analysis of streaming data, as in this talk by Aiven’s Olena Kutsenko. If you’re curious about how to build your own real-time AI data pipeline using Apache Kafka and Flink, this is a session you should check out.
Change Data Capture & Kafka: How Slack Transitioned to CDC with Debezium & Kafka Connect
Wed Sep 18, 4:00 PM - 4:45 PM
“The CDC pipeline slashed cost by millions and slashed latency from 24 hours to less than 10 minutes.” Do you need any more reasons for adopting Debezium and CDC? Having worked on Debezium for several years, I always love to learn about success stories from people adopting change data capture, in particular when they come from large-scale users like Slack, who also contributed to Debezium’s connector for Vitess. Really looking forward to this session by Slack engineers Joseph Thaidigsman and Tom Thornton.
Addressing Streaming ETL Pipelines Challenges: Delving into Flink CDC
Wed Sep 18, 5:00 PM - 5:45 PM
Flink CDC took a big leap forward with version 3.0, evolving into a complete end-to-end solution for data pipelines, with support for (re-)snapshotting specific tables, handling schema changes, and more. I haven’t seen an awful lot of information about it, besides the initial announcement and a handful of blog posts, so I am really looking forward to this session by Xu Bangjiang of Alibaba Cloud. I might have some questions about that usage of YAML, though ;)
Enabling Flink's Cloud-Native Future: Introducing Disaggregated State in Flink 2.0
Tue Sep 17, 5:00 PM - 5:45 PM
Relying on node-local state for checkpoints and savepoints isn’t ideal when running Apache Flink in cloud environments, for instance when considering containerized Flink workloads which may be moved around compute nodes. The Flink community is currently in the process of addressing these challenges with FLIP-423, targeting Flink 2.0. Yuan Mei of Alibaba is going to provide an overview of this work in this session, which should be highly relevant to everyone running Flink in the cloud, on Kubernetes, etc.
Community Events
Besides the large number of regular break-out sessions, there will also be a variety of community events. I am looking forward to the Apache Flink® Ask Me Anything session at the Meetup Hub, with folks from Confluent, LinkedIn, and Apple ready to answer all your questions on the popular stream processing platform. There’ll also be the announcement of the Data Streaming Awards, an unofficial 5K run/walk un-organized by my fellow Decoder Robin Moffatt, and much more.
As you can see, quite a few cool things are in store for Current ’24. I can’t wait for the community to come together and am very much looking forward to seeing you in Austin!