March 19, 2025

Streaming Data, Instant Insights: Real-Time Analytics with Dremio & Confluent Tableflow


Casey Karst · Director of Product Management

Breaking Down Barriers Between Streaming Data and Analytics

Dremio is proud to be a launch partner for Tableflow, Confluent's newest innovation that seamlessly transforms Apache Kafka streams into Apache Iceberg tables. This integration eliminates the complexity of making real-time operational data instantly available for analytics, without the need for brittle ETL pipelines.

For too long, organizations have struggled to combine real-time streaming data from Kafka with historical batch data stored in data lakes and warehouses. Businesses need both:

• Streaming data provides fresh, real-time insights, but it lacks the structure and consistency needed for analytical workloads.

• Lakehouses and warehouses offer scalable, cost-effective repositories for historical data, but ingesting fresh operational data into them is slow and expensive.

Bridging this gap traditionally required custom data pipelines, schema mappings, and costly transformations, leading to duplicated data, increased latency, and operational overhead.

With Confluent Tableflow and Dremio, businesses can query real-time and historical data together in an open lakehouse architecture, delivering insights at the speed of operational data.

Seamless Data Ingestion: What is Tableflow?

Tableflow automates the process of converting Kafka topics into Iceberg tables, ensuring that streaming data is available in a structured, analytics-ready format without requiring manual intervention.

How Tableflow Works

• Kafka Topics Become Tables – Converts raw Kafka events into Iceberg tables in a single step, eliminating the need for complex transformation jobs.

• Schema Evolution Built In – Automatically applies schema changes using Confluent’s Schema Registry, ensuring compatibility with existing data models.

• Automated File Compaction – Continuously optimizes small Parquet files for better query performance and reduced storage costs.

• Native Iceberg Integration – Data lands directly in an Iceberg table that is immediately accessible via Dremio.

Tableflow removes the need for multiple ingestion jobs, batch transformations, and manual clean-up, making Kafka streams fully operational in the lakehouse without added complexity.
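To make the topic-to-table idea concrete, here is a minimal, self-contained Python sketch of the step Tableflow automates: parsing raw Kafka-style JSON events and coercing them into schema-conforming rows of the kind that land in an Iceberg table. The topic name, schema, and helper function are hypothetical illustrations, not Tableflow's actual implementation (which derives schemas from Confluent's Schema Registry and writes Parquet/Iceberg directly).

```python
import json

# Hypothetical schema for an "orders" topic. In practice, Tableflow derives
# the schema from Confluent's Schema Registry rather than a hand-written dict.
ORDER_SCHEMA = {"order_id": int, "customer": str, "amount": float}

def events_to_rows(raw_events, schema):
    """Parse raw Kafka-style JSON events into schema-conforming rows —
    the structured, analytics-ready records an Iceberg table expects."""
    rows = []
    for payload in raw_events:
        event = json.loads(payload)
        # Coerce each field to its declared type; missing fields become None.
        row = {col: (typ(event[col]) if col in event else None)
               for col, typ in schema.items()}
        rows.append(row)
    return rows

raw = [
    '{"order_id": 1, "customer": "acme", "amount": "19.99"}',
    '{"order_id": 2, "customer": "globex"}',  # missing "amount" field
]
rows = events_to_rows(raw, ORDER_SCHEMA)
print(rows)
```

Each raw event comes out as a typed row (`"19.99"` becomes the float `19.99`, the missing `amount` becomes `None`), which is the structural guarantee that makes the data queryable downstream without a separate transformation job.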

Accelerate Decision-Making with Dremio & Tableflow

Dremio’s native Apache Iceberg support makes it the ideal analytics engine for organizations leveraging Confluent Tableflow. This integration unlocks significant business value:

• Fast Query Performance – Dremio's high-performance query engine delivers sub-second analytics on Iceberg tables generated by Tableflow. When paired with Dremio Reflections, queries are further optimized by pre-aggregating and accelerating workloads, reducing compute costs while maintaining interactive performance at scale.

• No More Data Silos – Query real-time and historical data together in a single environment, eliminating the need for duplicate data movement.

• Lower Costs, Higher Efficiency – Reduce compute and storage costs by directly querying Iceberg tables without unnecessary data transformation.

• Simplified Data Architecture – By bridging Kafka streams and the lakehouse, Tableflow and Dremio streamline complex data ingestion workflows.
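As an illustration of querying real-time and historical data side by side, once Tableflow lands a Kafka topic as an Iceberg table, a unified query in Dremio might look like the following sketch (the table and column names are hypothetical, not from any specific deployment):

```sql
-- Join fresh clickstream events (Iceberg table produced by Tableflow)
-- with historical order data already in the lakehouse.
SELECT o.customer_id,
       COUNT(c.event_id)  AS clicks_last_hour,
       SUM(o.order_total) AS lifetime_spend
FROM   clickstream_events c          -- streaming data via Tableflow
JOIN   historical_orders  o          -- historical batch data
  ON   c.customer_id = o.customer_id
WHERE  c.event_time > CURRENT_TIMESTAMP - INTERVAL '1' HOUR
GROUP BY o.customer_id;
```

No ingestion job or copy sits between the Kafka topic and this query; both tables are read in place from the same Iceberg-backed lakehouse.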

Unlock the Power of Dremio & Tableflow Today!

Tableflow is currently in early access, and Dremio is working closely with Confluent to help organizations simplify real-time data analytics. If you’re interested in learning how Dremio and Tableflow can help your team seamlessly integrate operational and analytical data, sign up for a free trial today!
