Over the past few years, the industry has increasingly recognized the benefits of adopting a data lakehouse architecture. This approach lowers data infrastructure costs and reduces time-to-insight by consolidating more data workloads onto a single source of truth in the organization’s data lake. It is made possible by data lakehouse table formats such as Apache Iceberg, Apache Hudi, and Delta Lake, which enable database-like tables on a data lake. These formats support ACID (atomicity, consistency, isolation, durability) transactions, schema evolution, and other features that replicate the functionality of data warehouses without the restrictions of a walled garden. Even traditional data warehouse platforms have adapted their data processing tools to work with these tables on data lakes.
Among the three table formats, Apache Iceberg has experienced a surge in popularity, with many companies fully embracing it as the standard format for the data lakehouse. This momentum has been so significant that Databricks, the creator of Apache Iceberg’s main competitor, Delta Lake, acquired Tabular, a company founded by the original creators of Apache Iceberg at Netflix that offers an enterprise catalog service for Iceberg tables. This development, along with Snowflake’s announcement of its open-source catalog, Polaris, signals that Apache Iceberg has achieved overwhelming support, encouraging many companies to build data lakehouses on Apache Iceberg with confidence.
Read the full article via Dataversity.