Alex Merced is a Senior Tech Evangelist for Dremio, a developer, and a seasoned instructor with a rich professional background, having worked with companies such as GenEd Systems, Crossfield Digital, CampusGuard, and General Assembly.
Alex is a co-author of the O’Reilly book “Apache Iceberg: The Definitive Guide.” With a deep understanding of the subject matter, Alex has shared his insights as a speaker at events including Data Day Texas, OSA Con, P99Conf, and Data Council.
Driven by a profound passion for technology, Alex has been instrumental in disseminating his knowledge through various platforms. His tech content can be found in blogs, videos, and his podcasts, Datanation and Web Dev 101.
Moreover, Alex Merced has made contributions to the JavaScript and Python communities by developing a range of libraries. Notable examples include SencilloDB, CoquitoJS, and dremio-simple-query, among others.
As 2024 comes to a close, it’s clear that this year has been remarkable for the data lakehouse and the growing momentum driving its adoption. In this blog, I’ll reflect on some of the most exciting developments in the data lakehouse space, focusing on the new possibilities unlocked by tools like Apache Iceberg and Dremio. […]
Welcome to the 2024 Football Playoffs Hackathon powered by Dremio. Teams from across the globe will apply their analytics prowess to predict the playoff outcomes. Each team must analyze the current stats provided to support their selections with detailed insights. Judging criteria will include the accuracy of predictions, the quality of analysis, the clarity of visual presentation, […]
Dremio is a cutting-edge Lakehouse Platform designed to make data more accessible and actionable. With Apache Iceberg tables as first-class citizens, Dremio offers a powerful combination of data virtualization and unification capabilities. This means you can seamlessly combine data from databases, data warehouses, data lakes, and lakehouses into a single, governed platform. Dremio’s built-in semantic […]
Organizations face a common challenge: ensuring consistent and reliable data insights across multiple departments, tools, and teams. As data becomes increasingly central to decision-making, the need for a unified view—one everyone in the organization can rely on—has never been more critical. This is where a universal semantic layer comes into play. By creating a standardized […]
Modern enterprises are increasingly adopting data mesh architecture to keep up with demand for accessible, consistent data. Unlike traditional, centralized data models, data mesh prioritizes a decentralized approach, allowing individual teams to own and manage their own data domains. This structure enables organizations to achieve greater agility, faster access to data, and enhanced scalability. For […]
Organizations need rapid access to insights from their data to stay competitive. However, the complexity of managing data from diverse sources often slows down this process. Traditional methods like ETL (Extract, Transform, Load) are effective but can create delays due to data replication and movement. To overcome these challenges, data virtualization tools provide a robust […]
An Iceberg Data Lakehouse—a unified system that combines the scalability of data lakes with the analytical power of data warehouses—has emerged as a powerful solution to modern data requirements for performance, accessibility, and cost. However, what makes this architecture effective is the strategic use of metadata to optimize performance, ensure data consistency, and enhance governance. […]
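To make the metadata’s role concrete, here is a minimal PyIceberg sketch of the kind of metadata an Iceberg table carries; the catalog URI and the sales.orders table below are illustrative placeholders, not details from the post:

```python
# Inspecting Apache Iceberg table metadata with PyIceberg -- a minimal sketch.
# The REST catalog endpoint and the "sales.orders" table are hypothetical;
# swap in your own catalog settings.
from pyiceberg.catalog import load_catalog

catalog = load_catalog(
    "demo",
    **{"type": "rest", "uri": "http://localhost:8181"},  # placeholder endpoint
)
table = catalog.load_table("sales.orders")

# The metadata behind lakehouse optimization lives on the table object:
print(table.schema())            # column definitions with stable field IDs
print(table.spec())              # partition spec used to prune data files
print(table.current_snapshot())  # snapshot ID, timestamp, manifest list

# Snapshots are retained until expired, enabling time travel and audits.
for snapshot in table.metadata.snapshots:
    print(snapshot.snapshot_id, snapshot.timestamp_ms)
```

Query engines lean on exactly this schema, partition, and snapshot metadata to skip irrelevant files and serve consistent reads.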
The demand for quick, actionable insights is higher than ever. Businesses are moving beyond traditional data warehouses to adopt lakehouses and other flexible data architectures that better support real-time analytics, BI, and AI applications. Dremio is at the forefront of this shift, providing a robust, high-performance hybrid lakehouse platform that enables fast, scalable analytics in […]
The rise of data lakehouses is transforming the way organizations manage, analyze, and leverage their data. Lakehouse architecture offers a flexible, scalable solution that bridges the gap between traditional data warehouses and data lakes. Apache Iceberg, an open table format designed to deliver reliable, high-performance analytics on large datasets, is at the heart of this […]
As organizations increasingly adopt hybrid data architectures, they often face challenges in accessing and analyzing data stored across cloud and on-premises environments. Databricks’ Unity Catalog offers a unified metastore that centralizes data management for cloud-based Delta Lake tables, enabling streamlined access to cloud data. At the same time, many companies retain valuable data in on-premises […]
Organizations often have a blend of cloud and on-premises data sources, creating a need for tools that can seamlessly bridge these environments. Dremio has introduced a new connector for Polaris catalogs managed by Snowflake’s “Open Catalog” service. Designed for Iceberg tables, Polaris provides an open-source catalog solution for flexible data access and interoperability across cloud […]
Dremio has just rolled out version 25.2, and it’s bringing a feature many users have been eagerly waiting for – full dark mode across the entire platform. Whether you’re working late into the night or simply prefer the aesthetics and reduced eye strain that dark mode offers, this update brings a refreshing new way to […]
Organizations often have data distributed across cloud and on-premises environments, which poses significant integration challenges. Cloud-based platforms like Snowflake offer scalable, high-performance data warehousing capabilities, while on-premises systems like HDFS and Hive often store large volumes of legacy or sensitive data. Traditionally, analyzing data from these environments together would require complex data movement and transformation […]
Organizations are striving to build architectures that manage massive volumes of data and maximize the insights drawn from it. Traditional data architectures, however, often fail to handle the scale and complexity required. The modern answer lies in the data lakehouse, a hybrid approach combining the best aspects of data lakes and data warehouses. This blog […]
Flexibility and simplicity in managing metadata catalogs and storage solutions are key to efficient data platform management. Nessie’s REST Catalog Implementation brings this flexibility by centralizing table management across multiple environments in the cloud and on-prem, while PyIceberg provides an accessible Python implementation for interacting with Iceberg tables. In this blog, we’ll walk through setting […]
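As a taste of that setup, here is a minimal sketch of pointing PyIceberg at Nessie’s Iceberg REST endpoint; the port, branch, and warehouse bucket are assumptions for a default local Nessie rather than values from the post:

```python
# Connecting PyIceberg to Nessie's Iceberg REST catalog -- a minimal sketch.
# Assumes Nessie on its default local port (19120) exposing the Iceberg REST
# API under /iceberg, with "main" as the branch; adjust for your deployment.
from pyiceberg.catalog import load_catalog

catalog = load_catalog(
    "nessie",
    **{
        "type": "rest",
        "uri": "http://localhost:19120/iceberg/main",  # branch in the path
        "warehouse": "s3://my-warehouse/",             # hypothetical bucket
    },
)

# Browse what the catalog is tracking (namespaces must already exist).
print(catalog.list_namespaces())
print(catalog.list_tables("examples"))  # "examples" is a placeholder namespace
```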
Join the Dremio/dbt community in the dbt Slack’s #db-dremio channel to meet other Dremio-dbt users and seek support. Version control is a key aspect of modern data management, ensuring the smooth and reliable evolution of both your data and the code that generates insights from it. While code versioning has […]
The Apache Polaris (incubating) lakehouse catalog is the next step in the world of open lakehouses built on top of open community-run standards. While many other lakehouse catalogs are vendor-controlled or don’t enable full read-and-write support for Iceberg lakehouses, Polaris takes it a step further by being a community-run project integrating seamlessly with Apache Iceberg […]
Join the Dremio/dbt community in the dbt Slack’s #db-dremio channel to meet other Dremio-dbt users and seek support. The ability to transform raw data into actionable insights is critical. As organizations scale, they need efficient ways to standardize, organize, and govern data transformations. This is where dbt (data build tool) […]
Operational analytic capabilities are foundational to delivering the personalized experiences that customers expect. While first- and third-party market data are often the natural starting point, organizations are increasingly discovering that second-party data—the information exchanged securely and confidentially through partnerships and other strategic vendor relationships—is the differentiator that elevates the customer experience to new […]
Join the Dremio/dbt community in the dbt Slack’s #db-dremio channel to meet other Dremio-dbt users and seek support. Maintaining a well-structured and version-controlled semantic layer is crucial for ensuring consistent and reliable data models. With Dremio’s robust semantic layer, organizations can achieve unified, self-service access to data, making analytics more […]
Efficiency and reliability are paramount when dealing with the orchestration of data pipelines. Whether you’re managing simple tasks or complex workflows across multiple systems, orchestration tools can make or break your data strategy. This blog will explore how orchestration plays a crucial role in automating and managing data processes, using tools like CRON and Apache […]
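For a flavor of what an orchestrator adds over a bare CRON entry, here is a minimal Apache Airflow DAG sketch; the pipeline name and task bodies are placeholders:

```python
# A minimal Apache Airflow DAG: one daily pipeline with two dependent tasks.
# Task bodies are placeholders -- swap in real extract/load logic.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    print("pulling new records from the source system")


def load():
    print("writing curated records to the lakehouse")


with DAG(
    dag_id="daily_lakehouse_refresh",  # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # the same cadence a CRON entry would express
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)

    # Unlike CRON, dependencies, retries, and backfills are first-class here.
    extract_task >> load_task
```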
Introduction to the Hybrid Data Lakehouse: Organizations are increasingly challenged to manage, store, and analyze vast amounts of data. Traditional data architectures, while effective in the past, are no longer sufficient to meet the demands of modern data workloads, which require flexibility, scalability, and performance. This is where the concept of the hybrid data lakehouse comes into […]
In this tutorial, you’ll learn how to use Dremio’s Reflections to accelerate query performance. We’ll walk through the process of setting up a Dremio environment using Docker, connecting to sample datasets, running a complex query, and then using Reflections to significantly improve the query’s performance. Step 1: Spin Up a Dremio […]
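As a preview of the querying step, here is a minimal sketch of running SQL against a local Dremio over Arrow Flight with pyarrow; the credentials and dataset path are assumptions, and the Reflections themselves are then defined on the dataset inside Dremio:

```python
# Querying a local Dremio instance over Arrow Flight -- a minimal sketch.
# Assumes Dremio's default Flight port (32010) and a user created during
# setup; the sample dataset path below is a placeholder.
from pyarrow import flight

client = flight.FlightClient("grpc+tcp://localhost:32010")

# Dremio exchanges basic credentials for a bearer token header.
token = client.authenticate_basic_token("username", "password")
options = flight.FlightCallOptions(headers=[token])

query = 'SELECT * FROM Samples."samples.dremio.com"."NYC-taxi-trips" LIMIT 10'
info = client.get_flight_info(flight.FlightDescriptor.for_command(query), options)
reader = client.do_get(info.endpoints[0].ticket, options)
print(reader.read_all().to_pandas())
```

Running the same query before and after enabling a Reflection is a simple way to see the acceleration.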
Designing an optimal partitioning strategy for your data is often one of the most challenging aspects of building a scalable data platform. In traditional systems, data engineers frequently partition data by multiple columns, such as date and another frequently queried field. However, this can result in too many small files or partitions, which ultimately leads […]
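One way Iceberg avoids the small-file trap is hidden partitioning through transforms; here is a minimal PyIceberg sketch that partitions on day(event_ts) rather than several raw columns (the catalog settings, table name, and schema are placeholders):

```python
# Creating an Iceberg table partitioned by a transform -- a minimal sketch.
# With hidden partitioning, queries filter on event_ts directly and the
# day() transform handles file pruning; no derived partition column needed.
from pyiceberg.catalog import load_catalog
from pyiceberg.partitioning import PartitionField, PartitionSpec
from pyiceberg.schema import Schema
from pyiceberg.transforms import DayTransform
from pyiceberg.types import NestedField, StringType, TimestampType

schema = Schema(
    NestedField(1, "event_ts", TimestampType(), required=True),
    NestedField(2, "user_id", StringType(), required=False),
)

# day(event_ts) is coarse enough to avoid a flood of tiny files.
spec = PartitionSpec(
    PartitionField(source_id=1, field_id=1000,
                   transform=DayTransform(), name="event_day")
)

catalog = load_catalog("demo", **{"type": "rest", "uri": "http://localhost:8181"})
catalog.create_table("events.clicks", schema=schema, partition_spec=spec)
```

Because the spec is metadata, it can also evolve later without rewriting existing data.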
Change Data Capture (CDC) is a design pattern used in databases and data processing to track and capture data changes—such as insertions, updates, and deletions—in real time. Instead of periodically extracting entire datasets, CDC focuses on capturing only the data that has changed since the last update. This approach is crucial in modern data architectures, where […]
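To illustrate the pattern at its simplest, here is a polling-based CDC sketch using a high-water mark; the database, table, and column names are placeholders, and production systems usually read the transaction log instead so deletes are captured too:

```python
# Change Data Capture via a high-water mark -- a minimal polling sketch,
# not a log-based CDC implementation. Assumes a source table with an
# updated_at column; all names here are hypothetical.
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect("source.db")       # placeholder source database
last_sync = "2024-01-01T00:00:00+00:00"   # watermark persisted from last run

# Pull only rows changed since the last sync, not the entire table.
rows = conn.execute(
    "SELECT id, payload, updated_at FROM orders WHERE updated_at > ?",
    (last_sync,),
).fetchall()

for row in rows:
    print("changed row:", row)  # hand off to the downstream pipeline

# Advance the watermark so the next run sees only newer changes.
last_sync = datetime.now(timezone.utc).isoformat()
```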
Enable the business to create and consume data products powered by Apache Iceberg, accelerating AI and analytics initiatives and dramatically reducing costs.