February 18, 2025
8 min read

Managed Flink: What Matters Most

In recent years, the landscape of data infrastructure technology has undergone a transformative shift as companies deploy new workloads powered by real-time applications and services. This evolution has resulted in a demand for instantaneous data processing and analytics at scale across various business sectors.

Within this rapidly evolving ecosystem, navigating the myriad service providers and solution offerings can be daunting. This guide aims to help with that decision-making process and provide the information you need to select the right solution for your requirements.

Why teams choose Managed Flink

Flink has emerged as the de facto standard for stream processing, gaining widespread adoption among some of the industry’s most prominent innovators, including Alibaba, Uber, and Netflix. Continuously growing in adoption due to its performance, fault tolerance, and scalability, Flink has become the backbone of many real-time data processing pipelines.

The open-source nature of Flink offers unparalleled flexibility and innovation potential. Still, it also presents a significant challenge for organizations: the need for a highly skilled and specialized workforce to effectively use its capabilities. Building and maintaining the infrastructure required to support real-time applications powered by Flink demands a team of expert engineers proficient in distributed systems, data engineering, and stream processing. It is also necessary to have specialists integrate other projects and their capabilities, such as Debezium, to support change data capture (CDC).
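To make concrete what that CDC integration work involves, here is a simplified sketch of consuming a Debezium-style change event envelope. The `op`, `before`, and `after` fields are part of Debezium's actual event format ('c' create, 'u' update, 'd' delete, 'r' snapshot read); the in-memory table and routing logic are a hypothetical illustration, not production code:

```python
import json

def apply_change_event(table: dict, raw_event: str) -> None:
    """Apply one Debezium-style change event to an in-memory table keyed by id.

    Debezium envelopes carry an operation code plus 'before' and 'after'
    row images; this sketch upserts on create/update/read and deletes on 'd'.
    """
    event = json.loads(raw_event)
    op = event["op"]
    if op in ("c", "u", "r"):           # upsert the new row image
        row = event["after"]
        table[row["id"]] = row
    elif op == "d":                     # remove the old row image
        table.pop(event["before"]["id"], None)

# Example: a customer row is created, then updated.
customers = {}
apply_change_event(customers, '{"op": "c", "before": null, "after": {"id": 1, "name": "a"}}')
apply_change_event(customers, '{"op": "u", "before": {"id": 1, "name": "a"}, "after": {"id": 1, "name": "b"}}')
assert customers == {1: {"id": 1, "name": "b"}}
```

Even this toy version hints at the real complexity: ordering, schema changes, and failure recovery all have to be handled correctly at scale.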

This poses a dilemma for many organizations, as the reality is that only a select few possess the resources and expertise necessary to tackle the complexities of Flink at scale. For the vast majority of companies, this represents a formidable barrier to entry for supporting real-time data processing. 

In response to this challenge, many companies turn to vendors and service providers who offer tailored solutions to simplify the adoption and management of Flink-based applications. By using the expertise and resources of these vendors, organizations can overcome the barriers posed by Flink’s complexity and focus on using its capabilities to drive innovation and growth.

Managed Flink Options

Running a successful data infrastructure technology stack at scale in production requires several components beyond the core open-source software provided by Flink and Debezium. Different service providers have created offerings that target a wide range of these needs. These range from the lowest infrastructure layer that supports a DIY approach up to a fully managed platform offering, with multiple levels of cloud-hosted options in the middle.

Let’s explore these options along with what is offered at the different levels, what remains for you and your team to create and manage after adopting each one, and the pros and cons.

Do It Yourself on Cloud Infrastructure

For companies who pursue the DIY approach and build and run their data platform from scratch, this most commonly starts with a cloud service provider to provision the infrastructure layer (IaaS). 

At this level, you’re getting the raw compute, storage, and networking services, as well as provisioning capabilities such as Kubernetes. The rest of the architecture, including deploying Flink and building and managing all the other aspects of your data platform, is entirely up to you. 

To support your platform for running business-critical workloads in production, you’ll need to have the right staffing in place for site reliability engineering (SRE) and platform engineering. In addition to keeping the infrastructure running and optimized, you’ll need data platform specialists with expertise in Flink and Debezium to deploy, manage, and optimize these powerful but complex components.

Flink-as-a-Service

Some cloud service providers offer a basic hosted Flink service. While this is incredibly useful, it’s not generally sufficient for the majority of customers and leaves quite a number of challenges for teams to solve on their own. 

In this scenario, you handle managing the connectors, versioning and security patches, and many of the other details of Flink. State management, optimization, observability, security, and compliance—these are all tasks that you’ll need to provision and resource a data platform engineering team to handle. 

In addition, no tooling or developer experience support is provided. You have to create this internally as needed.

Managed Provider

Managed Flink providers go even higher up the data stack. In addition to Flink, they may offer Debezium and other connectors, and they can provide services such as job control, resource management, and security features.

Again, while these are good offerings, they still don’t provide the complete picture that customers want and need to be successful. 

For instance, a managed provider might give you a schema registry. However, you’re still responsible for metadata beyond schemas, such as: 

  • Tracking the semantics of Kafka topics
  • Understanding the difference between change data capture and append-only streams
  • Creating and maintaining your own data catalog
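The second point matters because the same sequence of records means different things depending on the stream's semantics. A minimal sketch (the record shapes here are illustrative, not any specific wire format):

```python
# Append-only semantics: every record is a new fact; aggregate over all of them.
def total_appended(events):
    return sum(e["amount"] for e in events)

# Change-stream semantics: each record replaces prior state for its key;
# only the latest value per key counts.
def total_materialized(events):
    latest = {}
    for e in events:
        latest[e["key"]] = e["amount"]
    return sum(latest.values())

events = [
    {"key": "order-1", "amount": 10},
    {"key": "order-1", "amount": 25},   # an update to order-1, not a new order
]
print(total_appended(events))      # 35 under append-only semantics
print(total_materialized(events))  # 25 under change-stream semantics
```

Getting this wrong silently produces incorrect aggregates, which is why topic semantics need to be tracked as metadata rather than tribal knowledge.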

A managed provider might also lack features that many organizations deem critical. For example, you still need the ability to configure, maintain, and support the platform, implement integrations with external systems, and retain the flexibility to create custom Flink jobs in the language of your choice. You may also need to optimize processing jobs and scale resource availability up and down for your tasks and workloads to manage operational costs more efficiently.

Fully Managed Real-Time Data Platform

The most complete solutions for real-time data are provided by those who can offer a fully managed platform, one that integrates all the necessary components so that your stream processing jobs just work. In addition to Flink as the base layer, they provide the observability that production environments demand, the security features that protect your data, and the controls you need to run your workloads efficiently, all without burdening you with internal configuration settings that need endless fine-tuning to keep the platform running smoothly.

At this level, the focus is on implementing your business logic, whether that’s in SQL, Java, or Python. No pre-existing Flink knowledge is assumed or required, although working with Flink directly should remain your choice. (If you have Flink JAR files you need to run, those need to be fully supported.) 

If you’d prefer to use SQL, you don’t have to write any Java code. The developer experience also includes a web UI, a robust CLI tool, a dbt adapter, and a unified set of APIs for the entire platform so that it works seamlessly within your established ecosystem.

Evaluating the options

Let’s evaluate the different levels of service on offer:

  • Do It Yourself on Cloud Infrastructure: At this level, you’re getting the raw compute, storage, and networking services, as well as provisioning capabilities such as Kubernetes. The rest of the architecture, including deploying Flink and building and managing all the other aspects of your data platform, is entirely up to you.
  • Flink-as-a-Service: While useful, this isn’t generally sufficient for the majority of customers and leaves several challenges for teams to solve on their own. You’re being asked to handle the connections to external data systems, to make sure you get the versions right while keeping up with security patches, and to configure many of the details of Flink.
  • Managed Provider: In addition to Flink, managed providers may offer Debezium and other connectors, and they can provide services such as job control, resource management, and security features. Again, while these are good offerings, they still do not provide the complete picture that customers want and need to be successful.
  • Fully Managed Real-Time Data Platform: The most complete solutions for real-time data are provided by those who can offer a fully managed platform, one that integrates all the necessary components so that your stream processing jobs just work. In addition to Flink as the base layer, they provide the observability that production environments demand, the security features that protect your data, and the controls you need to run your workloads efficiently. This frees you up to focus on your business logic.

What Decodable Offers

Decodable is a fully managed, real-time data platform powered by Apache Flink and Debezium. We’ve purposely built Decodable to meet the needs of different audiences. Our job is to make people productive without getting into the nuts and bolts of what it means to get the most out of Flink. 

Here’s what Decodable brings to the table:

Efficient, Secure, and Proven

Decodable is powered by Flink and Debezium, trusted open-source projects that are proven in production and support mission-critical applications, which lets us deliver our services efficiently and securely. Our experience with these projects enables us to configure our platform to take full advantage of the underlying hardware, such as the AWS Graviton architecture.

Robust Connector Library

Like many other tech companies in the data space, we have a connector library. The primary difference with Decodable is the depth to which we go in building each and every connector in our library. Notably, the focus is on depth rather than breadth. Our goal is to support these external data systems as thoroughly as possible. 

We strive to provide robust support for different systems, including capabilities such as change data capture (CDC), without forcing you to worry about the boundary between Flink and Debezium. Our connectors seamlessly handle that level of detail for you.

Automatic Configuration, Tuning, and Scaling

The Decodable platform also provides a rich set of automatic configuration, tuning, and scaling capabilities. And while the underlying technologies provide some of that, a great deal comes from a team of seasoned experts. They all have experience with being woken up in the middle of the night to address issues with problematic settings in Flink or misconfigured jobs. 

For teams tasked with running Flink themselves, there is a whole class of less obvious challenges, where you need people who deeply understand how it works and how to debug it.

Our Cloud or Yours

There are two options for customers taking advantage of the Decodable platform. The simplest way to get up and running quickly is our fully managed offering, where everything runs inside Decodable’s cloud environment. 

For more complex or compliance-constrained environments, we also support “bring your own cloud.” In this case, the data plane of the platform—the part that runs the connections, processes the event streams, handles encryption keys, and isolates all of the data—runs inside of your VPCs.

Designed for All Developers

Working with the Decodable platform is very similar to working with AWS, Azure, or other cloud services, with several options for interaction. There’s a web-based UI that you can use for data exploration and building processing jobs, first-class APIs that expose all of the platform’s features and capabilities, a command line interface for working directly in a terminal, support for declarative YAML configuration files for use in CI/CD and version control workflows, and even a dbt adapter to integrate with that ecosystem. 
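As an illustration of the declarative style, a pipeline definition of this kind might look something like the following. This is a hypothetical sketch of the general shape such a resource file can take, not Decodable's exact schema:

```yaml
# Hypothetical declarative resource: a SQL pipeline from one stream to another.
kind: pipeline
metadata:
  name: filtered-orders
spec:
  sql: |
    INSERT INTO high_value_orders
    SELECT order_id, customer_id, amount
    FROM orders
    WHERE amount > 1000
```

Because the definition is a plain text file, it can be reviewed in pull requests and applied from CI/CD like any other versioned artifact.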

Decodable lets you focus on implementing your business logic, whether that’s in SQL, Java, or Python.

The Support You Need

Decodable offers multiple tiers of customer support, up to and including dedicated technical support for enterprise customers. For every customer, our team of experts is your first line of defense to ensure the platform is operating reliably and optimally. 

If there are node failures or checkpoints that are not keeping up with your jobs, we can take mitigating steps without getting you involved, assuming the necessary changes do not change the cost profile. We can also proactively reach out when we see something is wrong and suggest possible causes, which you can either look into on your own or resolve in collaboration with us.
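The "checkpoints not keeping up" check can be sketched as a simple heuristic over checkpoint metrics. The thresholds and function shape here are illustrative, not Flink's actual API; the underlying idea is real: if checkpoint duration approaches the checkpoint interval, the job can fall permanently behind.

```python
def checkpoint_health(durations_ms, interval_ms, history=5, budget=0.8):
    """Flag a job whose recent checkpoints take too long relative to their interval.

    'budget' is the fraction of the checkpoint interval we allow the average
    duration to consume before raising an alert. Values are illustrative.
    """
    recent = durations_ms[-history:]
    avg = sum(recent) / len(recent)
    return {
        "avg_duration_ms": avg,
        "healthy": avg <= budget * interval_ms,
    }

# A job checkpointing every 60s whose checkpoints now take ~55s is in trouble.
status = checkpoint_health([52_000, 55_000, 58_000], interval_ms=60_000)
print(status["healthy"])  # False
```

A platform team watching this signal can react (for example, by rescaling or tuning state backends) before the job starts missing its processing-latency targets.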

Conclusion

Organizations have a wealth of options for services to support their Flink-based real-time data systems. While DIY, Flink-as-a-Service, and managed providers have some benefits, a fully managed platform offers customizable solutions that can be used directly with SQL, Python, or Java. 

Decodable offers a simplified, unified approach to real-time data with a fully managed, serverless platform that eliminates the complexity and overhead of managing infrastructure. It allows data engineering teams to focus on building pipelines using SQL, Java, or Python. 

With a wide range of connectors for seamless data movement between systems and powerful stream processing capabilities, Decodable enables developing real-time pipelines with ease. The platform ensures operational reliability through built-in security, compliance, and a dedicated support team that acts as the first line of defense for your data pipelines.

David Fabritius
