ClickHouse

Database
Data Warehouse
Analytics
Sink

ClickHouse Overview

ClickHouse® is a column-oriented database management system (DBMS) for online analytical processing (OLAP) of queries. Different orders for storing data are better suited to different scenarios. The data access scenario covers which queries are made, how often, and in what proportion; how much data (rows, columns, and bytes) is read for each type of query; the relationship between reading and updating data; the working size of the data and how locally it is used; whether transactions are used, and how isolated they are; requirements for data replication and logical integrity; requirements for latency and throughput for each type of query; and so on.

The higher the load on the system, the more important it is to customize the system setup to match the requirements of the usage scenario, and the more fine-grained this customization becomes. No system is equally well suited to significantly different scenarios. A system that is adaptable to a wide set of scenarios will, under high load, either handle all of them equally poorly or work well for only one or a few of them.
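To make the point about storage order concrete, here is a minimal Python sketch comparing a row-oriented and a column-oriented layout for the same small table. It is a toy illustration only and does not reflect how ClickHouse stores data internally; the table contents and the function names (`to_columns`, `total_row_oriented`, `total_column_oriented`) are hypothetical and chosen for the example.

```python
# Toy illustration only: this is NOT how ClickHouse stores data internally.
# The sample table and all function names are hypothetical.

# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"user_id": 1, "country": "DE", "duration_ms": 120},
    {"user_id": 2, "country": "US", "duration_ms": 340},
    {"user_id": 3, "country": "DE", "duration_ms": 95},
]


def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


# Column-oriented layout: each field is stored as its own contiguous list.
columns = to_columns(rows)


def total_row_oriented(rows, field):
    """Sum one field; a row store has to read every field of every row."""
    values_touched = sum(len(row) for row in rows)
    return sum(row[field] for row in rows), values_touched


def total_column_oriented(columns, field):
    """Sum one field; a column store reads only the requested column."""
    column = columns[field]
    return sum(column), len(column)


row_total, row_touched = total_row_oriented(rows, "duration_ms")
col_total, col_touched = total_column_oriented(columns, "duration_ms")
print(f"row-oriented:    total={row_total}, values touched={row_touched}")
print(f"column-oriented: total={col_total}, values touched={col_touched}")
```

Both layouts return the same total, but the aggregate over a single field touches far fewer values in the column-oriented layout. That is the kind of access pattern (many rows, few columns) that analytical queries tend to have.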

Decodable + ClickHouse

Benchmarked at 100x faster than Hive or MySQL, ClickHouse is adopted by many engineering teams to serve queries at very low latency across large datasets. To achieve this performance at scale, it makes some architectural tradeoffs compared to standard data warehouses like Snowflake, and users should keep these in mind. In particular, users should consider conforming their data to ClickHouse best practices before ingestion. Decodable makes it simple to prepare the data so that ClickHouse performs at its best. For more insight into the data preparation Decodable can perform, read our blog. The following video walks through a full scenario that synchronizes MySQL data to ClickHouse using a CDC connector.