Apache Druid Overview
Apache Druid® is a real-time analytics database designed for fast slice-and-dice analytics ("OLAP" queries) on large data sets. Most often, Druid powers use cases where real-time ingestion, fast query performance, and high uptime are important. Druid is commonly used as the database backend for the GUIs of analytical applications, or for highly concurrent APIs that need fast aggregations. Druid works best with event-oriented data. Common application areas for Druid include:
- Clickstream analytics including web and mobile analytics
- Network telemetry analytics including network performance monitoring
- Server metrics storage
- Supply chain analytics including manufacturing metrics
- Application performance metrics
- Digital marketing/advertising analytics
- Business intelligence/OLAP
Decodable + Apache Druid
Decodable provides low-latency transport and transformation for ingesting data, a good match for Apache Druid's real-time analytics. After all, there's no point running real-time queries on stale data! Druid performs much more efficiently if the data it ingests is pre-processed, and Decodable is the ideal tool to perform this transformation, as described in this blog post. For a full example of using Decodable for ingestion, check out the blog on ingesting Covid19 data into Apache Pinot and the accompanying demo of Decodable cleansing security logs before sending them to Apache Pinot.
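To make "pre-processed" concrete, here is a minimal Python sketch of the kind of cleansing and pre-aggregation that can happen before data reaches Druid. It is not Decodable's API or pipeline syntax. The function name `preprocess` and the field names `ts`, `page`, `timestamp`, and `views` are assumptions chosen for illustration only.

```python
from collections import Counter
from datetime import datetime, timezone


def preprocess(raw_events):
    """Cleanse raw events and pre-aggregate them per minute and page.

    Hypothetical input: dicts with "ts" (epoch seconds) and "page".
    Hypothetical output: one row per (minute, page) with a view count.
    """
    counts = Counter()
    for event in raw_events:
        try:
            # Parse epoch seconds into a timezone-aware UTC timestamp.
            ts = datetime.fromtimestamp(float(event["ts"]), tz=timezone.utc)
            # Normalize the dimension value so variants collapse together.
            page = event["page"].strip().lower()
        except (KeyError, TypeError, ValueError, AttributeError,
                OverflowError, OSError):
            continue  # drop malformed records instead of ingesting them
        if not page:
            continue
        # Truncate to the minute so events in the same minute roll up.
        minute = ts.replace(second=0, microsecond=0).isoformat()
        counts[(minute, page)] += 1

    # Emit compact rows, sorted by minute and then page.
    return [
        {"timestamp": minute, "page": page, "views": n}
        for (minute, page), n in sorted(counts.items())
    ]


if __name__ == "__main__":
    sample = [
        {"ts": 0, "page": " Home "},
        {"ts": 30, "page": "home"},
        {"ts": 61, "page": "home"},
        {"ts": "bad", "page": "x"},  # malformed timestamp, dropped
    ]
    for row in preprocess(sample):
        print(row)
```

In this sketch, four raw events become two rows: malformed input is discarded and events in the same minute are combined, so the database receives fewer, cleaner records to index.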