Apache Hudi

Data Lake
Sink

Apache Hudi Overview

Apache Hudi is a next-generation streaming data lake platform that brings core warehouse and database functionality directly to a data lake. Hudi provides tables, transactions, efficient upserts and deletes, advanced indexes, streaming ingestion services, data clustering and compaction optimizations, and concurrency control, all while keeping your data in open source file formats.

Apache Hudi can easily be used on any cloud storage platform. Hudi's advanced performance optimizations make analytical workloads faster with any of the popular query engines, including Apache Spark, Flink, Presto, Trino, and Hive.

Decodable + Apache Hudi

As with many of the newer specialized databases, Apache Hudi operates best when the data presented for ingestion is low-latency and already optimized for Hudi. Decodable is the ideal transport and transformation pipeline for getting data from where it already exists (databases, messaging systems, APIs), in various formats, into Hudi so that it can do its job as effectively as possible.