
6 minute read · October 29, 2024

What’s New in Dremio 25.2: Expanding Lakehouse Catalog Support for Unmatched Flexibility and Governance


Maeve Donovan · Senior Product Marketing Manager, Dremio

The data landscape is evolving at an unprecedented pace, and organizations are constantly seeking ways to maximize the value of their data while maintaining flexibility and control. Dremio 25.2 rises to meet these needs by expanding its support for lakehouse catalogs and metastores across all deployment models: on-premises, cloud, and hybrid. This release makes Dremio the only lakehouse provider capable of delivering this level of architectural flexibility, empowering customers with the freedom to choose the best catalog for their needs and deploy it wherever it’s most effective.

At the heart of this release is Dremio's commitment to openness and interoperability. Built on the open-source Project Nessie, the Dremio Iceberg Data Catalog ensures a flexible, future-proof solution that supports all Iceberg engines. This commitment extends to the integration of managed service catalogs from industry leaders like Snowflake and Databricks. By supporting Snowflake Open Catalog, Snowflake's managed service for Apache Polaris (incubating), and Databricks' Unity Catalog, Dremio ensures a seamless and streamlined analytics experience across a diverse range of data environments.

A Deep Dive into the Key Features

Let's examine the key features of Dremio 25.2 and understand how they address the evolving needs of data-driven organizations.

Dremio Iceberg Data Catalog in Dremio Software (Private Preview)

Dremio's Iceberg Data Catalog, a built-in lakehouse catalog powered by Project Nessie, will be available as a Private Preview feature in Dremio Software version 25.2. This feature eliminates the need for customers to separately procure, provision, and maintain a lakehouse catalog. The Iceberg Data Catalog supports all features available to any Iceberg engine, including Dremio and Spark, and supports the Iceberg REST Catalog API. It also features data governance with role-based access control (RBAC) privileges and a built-in commit log, automated table maintenance through tasks like compaction and garbage collection, and data branching and versioning. With the addition of the Iceberg Data Catalog, Dremio Software can now be deployed via Helm chart as a full-stack lakehouse which includes a query engine, semantic layer, and the new built-in lakehouse catalog.
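Because the catalog supports the Iceberg REST Catalog API, any client that follows that specification should be able to address it with ordinary HTTP requests. The sketch below, which uses only the Python standard library, builds (but does not send) a "list tables" request using the path layout defined in the Iceberg REST specification. The base URL, token, and namespace names are placeholders invented for illustration, bearer-token authentication is an assumption, and the actual endpoint and credentials for a Dremio deployment are not given in this post.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical base URL: the real endpoint depends on your deployment.
CATALOG_URI = "https://catalog.example.com/api/iceberg"

# The Iceberg REST spec joins multi-level namespace parts with the
# unit separator character (0x1F) before URL-encoding them.
NAMESPACE_SEPARATOR = "\x1f"


def encode_namespace(*levels: str) -> str:
    """Encode namespace levels as a single URL path segment."""
    return quote(NAMESPACE_SEPARATOR.join(levels), safe="")


def list_tables_request(
    base_uri: str, token: str, *namespace: str, prefix: str = ""
) -> Request:
    """Build a GET request for /v1/{prefix}/namespaces/{namespace}/tables.

    The optional prefix is whatever the catalog advertises through its
    /v1/config endpoint; it is left out of the path when empty.
    """
    parts = [base_uri.rstrip("/"), "v1"]
    if prefix:
        parts.append(prefix.strip("/"))
    parts += ["namespaces", encode_namespace(*namespace), "tables"]
    return Request(
        "/".join(parts),
        # Assumption: the catalog accepts a bearer token.
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


if __name__ == "__main__":
    request = list_tables_request(CATALOG_URI, "my-token", "sales", "eu")
    print(request.get_method(), request.full_url)
```

In practice an Iceberg client library would issue these calls for you; the point of the sketch is only that a REST-compatible catalog is reachable through a documented, engine-neutral path layout.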

Previously, practitioners had to provision open-source catalogs themselves, and those catalogs lacked robust governance and support coverage. The built-in catalog removes these pain points: Dremio Software now ships a query engine, semantic layer, and lakehouse catalog as one platform, which makes it easier for on-premises customers to adopt a lakehouse architecture.

Learn more in this blog.

Snowflake Open Catalog as a Source (Public Preview)

Snowflake Open Catalog as a Source, in Public Preview in Dremio 25.2, lets Dremio users read Iceberg tables managed by Apache Polaris (incubating) through Snowflake’s cloud-managed service, giving them direct access to Iceberg data within the Snowflake ecosystem. The integration expands the range of Iceberg data that Dremio can reach, allows that data to be queried in place without ETL processes, and keeps governance consistent because queries run against the most current and trusted datasets in Apache Polaris (incubating).

Dremio leverages the best practices of Snowflake Open Catalog, simplifying connection and access to Iceberg tables. It allows users to read data from both internal and external Apache Polaris (incubating) catalogs, including those stored in AWS, Google, and Azure storage locations. Additionally, Dremio can utilize its internal capabilities, such as Reflections, on top of the Iceberg tables in Apache Polaris (incubating). 

Learn more in this blog.

Unity Catalog Service as a Source for Uniform-Enabled Delta Tables (Public Preview)

The public preview of Unity Catalog Service as a Source in Dremio allows users to connect to and read data from Uniform-enabled Delta tables within Databricks' Unity Catalog. This integration leverages the Iceberg REST Spec to facilitate a stable connection for Iceberg clients like Dremio. This enables direct access to Iceberg data without requiring complex and time-consuming ETL pipelines. Unity Catalog offers centralized data governance and cataloging, while Uniform, a Databricks feature, generates Iceberg metadata for Delta tables, making them readable by Iceberg clients. 
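To make the role of the generated Iceberg metadata concrete: under the Iceberg REST specification, a catalog answers a "load table" request with a JSON document containing the table's metadata and, optionally, the location of its metadata file, and an Iceberg client reads the table from there. The sketch below parses such a response using only the Python standard library. The field names follow the Iceberg REST specification, but the sample payload, bucket path, and snapshot ID are made up for illustration and do not come from Unity Catalog or Dremio.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoadedTable:
    """The few fields of a load-table response this sketch cares about."""
    metadata_location: Optional[str]
    format_version: int
    current_snapshot_id: Optional[int]


def parse_load_table_response(body: str) -> LoadedTable:
    """Extract key fields from an Iceberg REST load-table JSON response."""
    payload = json.loads(body)
    metadata = payload["metadata"]  # required by the REST spec
    return LoadedTable(
        metadata_location=payload.get("metadata-location"),
        format_version=metadata["format-version"],
        current_snapshot_id=metadata.get("current-snapshot-id"),
    )


# Made-up sample payload for illustration only.
SAMPLE_RESPONSE = json.dumps({
    "metadata-location": "s3://example-bucket/orders/metadata/v3.metadata.json",
    "metadata": {
        "format-version": 2,
        "table-uuid": "00000000-0000-0000-0000-000000000000",
        "location": "s3://example-bucket/orders",
        "current-snapshot-id": 42,
    },
})

if __name__ == "__main__":
    print(parse_load_table_response(SAMPLE_RESPONSE))
```

The client never needs to know that the underlying table is written as Delta. It only sees Iceberg metadata served through the catalog, which is why no conversion pipeline is involved.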

This integration offers several benefits to Dremio users, including access to a wider range of Iceberg data regardless of its location. By connecting directly to the data source, it reduces the overhead associated with ETL processes, minimizing data latency and potential governance challenges. 

Learn more in this blog.

Dremio’s Commitment to Openness

Dremio is committed to giving customers the freedom to choose the best tools and infrastructure for their needs, reducing fears of vendor lock-in. By offering support for various catalogs, including open-source options like Project Nessie, Dremio ensures a flexible, interoperable, and future-proof solution for all users.

The Impact of Dremio 25.2

Dremio 25.2 is more than just a feature update; it's a strategic move towards creating a truly open, flexible, and unified lakehouse experience. The expanded support for lakehouse catalogs across all deployment models, coupled with integrations with prominent managed service catalogs, empowers organizations to build their lakehouse architecture on their terms.

This release significantly reduces the burden on data engineers by simplifying catalog deployment and maintenance, automating critical tasks, and providing comprehensive governance capabilities.

Dremio's unwavering commitment to open-source technologies like Project Nessie and Apache Iceberg ensures interoperability and future-proof solutions. By giving customers the freedom to choose the best tools and infrastructure for their needs, Dremio helps organizations navigate the ever-evolving data landscape with confidence.

Ready to Get Started?

Enable the business to create and consume data products powered by Apache Iceberg, accelerating AI and analytics initiatives and dramatically reducing costs.