
September 6, 2024


Seven Must-See Talks at Current 2024


It’s just two more weeks until Current ’24! Organized by the fine folks over at Confluent, it’s the event of the year in the data streaming and processing space. This year, Current is back in Austin, TX, on September 17-18.

As always, the agenda looks great, with two days fully packed with talks on real-time and event-driven architectures, Apache Kafka and Flink, change data capture, streaming analytics, and much more. It’s going to be really hard to choose which sessions to attend, but below are a few that piqued my interest and that I’ll definitely try to attend.

Are you going to be at Current too? If so, let’s meet and chat about real-time data streaming, stream processing, and everything in between! If you don’t find me at the Decodable booth, you can catch me at my talk (details below), in Breakout Room 6, where I’m going to be a room host on Wednesday morning, or roaming the conference halls in between.

Atomic Dual-write Recipes with Kafka Two Phase Commit (KIP-939)

Wed Sep 18, 4:00 PM - 4:45 PM

It’s a common requirement for applications to update state within their own (typically relational) database while also sending a message to Kafka, for instance to notify other services about that data change. But without transactions spanning the database and Kafka, doing so is prone to failure and should be avoided (remember, friends don’t let friends do dual writes). KIP-939 promises to change this, enabling Kafka to participate in distributed two-phase transactions. This is gonna be a game changer for Kafka, and I can’t wait to learn more about the details in this session by Confluent’s Artem Livshits.
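To make the dual-write problem concrete, here is a toy sketch in Python. It is purely illustrative: the `db` and `topic` lists, the `place_order` function, and the `broker_up` flag are hypothetical stand-ins for a real database, a Kafka producer, and a broker outage, and the sketch does not show KIP-939 itself.

```python
# Toy simulation of a dual write. Plain lists stand in for the database
# table and the Kafka topic (hypothetical stand-ins, not real clients).

class BrokerUnavailable(Exception):
    """Raised when the simulated send to Kafka fails."""


def place_order(db, topic, order, broker_up=True):
    # Write 1: the local database commit succeeds.
    db.append(order)
    # Write 2: the send to Kafka is a separate operation that can fail
    # on its own, after the database change is already committed.
    if not broker_up:
        raise BrokerUnavailable("send failed after the database commit")
    topic.append(order)


db, topic = [], []
place_order(db, topic, {"id": 1})
try:
    place_order(db, topic, {"id": 2}, broker_up=False)
except BrokerUnavailable:
    pass

# Orders that exist in the database but were never announced on the topic.
missing = [order["id"] for order in db if order not in topic]
print(missing)  # prints [2]
```

Without a transaction covering both writes, nothing rolls back the database change when the second write fails, so the two systems silently drift apart.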

Streaming Queries Without Compromise

Wed Sep 18, 4:00 PM - 4:45 PM

One of the most exciting data research papers last year was DBSP: Automatic Incremental View Maintenance for Rich Query Languages, which takes a fresh look at the problem of efficient incremental view maintenance (IVM) and won the best research paper award at the 2023 International Conference on Very Large Data Bases (VLDB). This session by Mihai Budiu and Leonid Ryzhyk of Feldera discusses the core ideas behind DBSP and should be an absolute must-watch for everyone interested in IVM.
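For readers new to IVM, the general idea can be sketched in a few lines of Python. This is not the DBSP algorithm from the paper, just a minimal illustration under simplifying assumptions: the "view" is a count of orders per customer, and each change carries a weight of +1 for an inserted row and -1 for a deleted one. The names `recompute` and `apply_delta` are made up for this example.

```python
def recompute(orders):
    """Full recomputation: count orders per customer by scanning every row."""
    view = {}
    for customer in orders:
        view[customer] = view.get(customer, 0) + 1
    return view


def apply_delta(view, delta):
    """Incremental maintenance: fold only the changed rows into the view.

    Each change is (customer, weight), with weight +1 for an inserted row
    and -1 for a deleted row.
    """
    for customer, weight in delta:
        count = view.get(customer, 0) + weight
        if count:
            view[customer] = count
        else:
            # No rows left for this customer, so drop it from the view.
            view.pop(customer, None)
    return view


orders = ["alice", "bob", "alice"]
view = recompute(orders)

# One order by bob is deleted and one order by carol is inserted.
view = apply_delta(view, [("bob", -1), ("carol", +1)])

# The incrementally maintained view matches a full recomputation.
assert view == recompute(["alice", "alice", "carol"])
print(view)
```

The work done by `apply_delta` depends on the size of the change, not on the size of the underlying table, which is what makes incremental maintenance attractive for streaming queries.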

Optimizing Apache Kafka with GraalVM: Faster, Leaner, and Performant

Wed Sep 18, 3:00 PM - 3:45 PM

Who wouldn’t love for their Kafka brokers to start faster and consume less memory? Thanks to ahead-of-time compilation with GraalVM, this is no longer a dream, and it’s great to see that the Kafka project has picked up this technology, providing a container image with Kafka compiled into a native binary since version 3.8.0. A while ago, I took this for a spin, and Kafka started up in 150 ms on my laptop. That’s great for integration testing, for instance! Join Krishna Agarwal and Vedarth Sharma of Confluent to learn all about this cool feature.

Flinking Enrichment: Shouldn't This Be Easier?

Wed Sep 18, 2:00 PM - 2:45 PM

Enriching events on a Kafka topic with contextual data is a very common use case for stream processing with Apache Kafka. There are multiple ways of doing so, including Flink’s DataStream API with AsyncIO, Flink SQL joins, and others. What are the pros and cons of the different approaches, when should you use which one, and which pitfalls should you avoid? David Anderson of Confluent is going to answer all these questions, and surely more, in this talk, which I think will be a great watch for all you Flink practitioners out there.

Sentiment Analysis in Action with Apache Flink: Building Your Real-time Pipeline

Tue Sep 17, 3:00 PM - 3:45 PM

Are we at peak AI yet? I don’t really know, but we are certainly seeing more and more AI-related use cases around data streaming, be it feeding data from operational data stores to vector databases, building retrieval-augmented generation (RAG) architectures, or real-time sentiment analysis of streaming data, as in this talk by Aiven’s Olena Kutsenko. If you’re curious about how to build your own real-time API data pipeline using Apache Kafka and Flink, this is a session you should check out.

Decodable at Current

Current will also be a great opportunity for you to meet with the Decodable crew. We’re a Gold sponsor for the event and you can look forward to the following talks by the team (in chronological order):

Data Contracts In Practice With Debezium and Apache Flink by yours truly (Tue Sep 17, 3:00 PM - 3:45 PM)
Timing is Everything: Understanding Event-Time Processing in Flink SQL by Sharon Xie (Tue Sep 17, 4:00 PM - 4:45 PM)
So You Want to Write a User-Defined Function (UDF) for Flink? by Hans-Peter Grahsl (Wed Sep 18, 1:30 PM - 1:40 PM)
The Joy of JARs (and Other Flink SQL Troubleshooting Tales) by Robin Moffatt (Wed Sep 18, 3:00 PM - 3:45 PM)

Check out our booth; we’d love to learn about your data streaming and processing use cases, answer your questions, or give you a live demo. Plus, kick off the conference with a coffee on us in the expo hall from 8:30 AM - 12:30 PM on Tuesday! And I’ve heard there’s gonna be some really cool swag too.

Change Data Capture & Kafka: How Slack Transitioned to CDC with Debezium & Kafka Connect

Wed Sep 18, 4:00 PM - 4:45 PM

“The CDC pipeline slashed cost by millions and slashed latency from 24 hours to less than 10 minutes.” Do you need any more reasons for adopting Debezium and CDC? Having worked on Debezium for several years, I always love to learn about success stories from people adopting change data capture, in particular when they come from large-scale users such as Slack, who also contributed to Debezium’s connector for Vitess. I’m really looking forward to this session by Slack engineers Joseph Thaidigsman and Tom Thornton.

Addressing Streaming ETL Pipelines Challenges: Delving into Flink CDC

Wed Sep 18, 5:00 PM - 5:45 PM

Flink CDC has taken a big leap forward with its version 3.0, evolving into a complete end-to-end solution for data pipelines, with support for (re-)snapshotting specific tables, handling schema changes, and more. I haven’t seen an awful lot of information about it, besides the initial announcement and a handful of blog posts, so I am really looking forward to this session by Xu Bangjiang of Alibaba Cloud. I might have some questions about that usage of YAML, though ;)

Enabling Flink's Cloud-Native Future: Introducing Disaggregated State in Flink 2.0

Tue Sep 17, 5:00 PM - 5:45 PM

Relying on node-local state for checkpoints and savepoints isn’t ideal when running Apache Flink in cloud environments, for instance when considering containerized Flink workloads which may be moved around compute nodes. The Flink community is currently in the process of addressing these challenges with FLIP-423, targeting Flink 2.0. Yuan Mei of Alibaba is going to provide an overview of this work in this session, which should be highly relevant to everyone running Flink in the cloud, on Kubernetes, etc.

Community Events

Besides the large number of regular break-out sessions, there will also be a variety of community events. I am looking forward to the Apache Flink® Ask Me Anything session at the Meetup Hub, with folks from Confluent, LinkedIn, and Apple being ready to answer all your questions on the popular stream processing platform. There’ll be the announcement of the Data Streaming Awards, an unofficial 5K run/walk un-organized by my fellow Decoder Robin Moffatt, and much more.

As you can see, quite a few cool things are in store for Current ’24. I can’t wait for the community to come together, and I’m very much looking forward to seeing you in Austin!


Gunnar Morling

Gunnar is an open-source enthusiast at heart, currently working on Apache Flink-based stream processing. In his prior role as a software engineer at Red Hat, he led the Debezium project, a distributed platform for change data capture. He is a Java Champion and has founded multiple open source projects such as JfrUnit, kcctl, and MapStruct.
