Dremio Blog: Various Insights
Dremio Blog: Various Insights
Comparison of Data Lake Table Formats (Apache Iceberg, Apache Hudi and Delta Lake)
Apache Iceberg, Apache Hudi, and Delta Lake: A Comparison of Data Lake Table Formats -
Dremio Blog: Various Insights
How to Migrate a Hive Table to an Iceberg Table
Learn how to architect a migration from your existing Hive tables into Apache Iceberg tables to take full advantage of features like Version Rollback, Partition Evolution and more. -
Dremio Blog: Various Insights
Maintaining Iceberg Tables – Compaction, Expiring Snapshots, and More
Learn about the strategies and best practices for maintaining Apache Iceberg tables. -
Dremio Blog: Various Insights
Apache Iceberg Version 0.13.0 Is Released
Apache Iceberg 0.13.0 introduces several new features and integrations with different platforms. -
Dremio Blog: Various Insights
An Introduction to Apache Arrow Flight SQL
About the newly announced Apache Arrow Flight SQL and why it matters. -
Dremio Blog: Various Insights
Will Apache Arrow Flight SQL replace ODBC and JDBC for Analytics/BI workloads?
Apache Arrow Flight SQL brings data access into the modern age. -
Dremio Blog: Various Insights
Project Nessie: Transactional Catalog for Data Lakes with Git-like semantics
Nessie does this with its branching functionality, which tracks changes to multiple copies of the same data, and with version control, which tracks these changes over time so they can be merged back into production safely, consistently, and atomically. -
Dremio Blog: Various Insights
Arrow Flight SQL: A Universal JDBC Driver
The Arrow Flight SQL JDBC driver increases performance and reduces the installation burden on applications and users. -
Dremio Blog: Various Insights
What Is Apache Arrow?
Over the past few decades, databases and data analysis have changed dramatically. With these trends in mind, a clear opportunity emerged for a standard in-memory representation that every engine can use—one that’s modern, takes advantage of all the new performance strategies that are available, and makes sharing of data across platforms seamless and efficient. This […] -
Dremio Blog: Various Insights
Demystifying Cloud Data Lakes: A Comprehensive Guide
A cloud data lake is a cloud-hosted centralized repository that allows you to store all your structured and unstructured data at any scale, typically using an object store such as Amazon S3 or Microsoft Azure Data Lake Storage (ADLS). Its placement in the cloud means it can be interacted with as needed, whether it’s for […] -
Dremio Blog: Various Insights
Azure Storage Types and Use Cases
Azure Storage Types: Azure Storage is a Microsoft-managed cloud service that provides highly available, secure, durable, scalable and redundant storage. Whether it is images, audio, video, logs, configuration files, or sensor data from an IoT array, data needs to be stored in a way that makes it easily accessible for analysis, and […] -
Dremio Blog: Various Insights
What Is Apache Iceberg?
Background on Data Within Data Lake Storage: Data lakes are large repositories that store structured and unstructured data at any scale. They simplify data management by centralizing data and enabling all applications throughout an organization to interact with a shared data repository for all processing, analytics and reporting, significantly improving upon […]
Dremio Blog: Various Insights
Nessie: Git for Data Lakes
The Rise of Data Lake Storage: For decades, organizations relied on relational databases, and later enterprise data warehouses, to organize and store corporate data. These systems provided a strong structural model to organize data as well as data consistency and reliability guarantees. However, these aspects were achieved by vertically integrated technology designs that were isolated […] -
Dremio Blog: Various Insights
What is a Data Lake?
A data lake is a centralized repository that allows you to store all of your structured and unstructured data at any scale. In the past, when disk storage was expensive, and data was costly and time-consuming to gather, enterprises needed to be discerning about what data to collect and store. Organizations would carefully design databases and data […] -
Dremio Blog: Various Insights
Data Lake vs Warehouse: Dremio Insights
While data lakes and data warehouses are conceptually different in terms of their design and implementation, they have at least a few things in common: However, this is usually where the similarities end. Before comparing data warehouses and data lakes, it is useful first to explain what we mean by data warehousing. What Is a Data Warehouse? Data warehouses […]