Gnarly Data Waves

Workshop: 1 | March 18, 2024

Getting Started with Dremio: Build a Data Lakehouse on your Laptop

Want to experience Data Lakehouse architecture? Join us and build a data lakehouse on your laptop in this exciting workshop.

Ready to revolutionize your data management approach and learn how to maximize your environment with Dremio?

Watch Alex Merced guide you step by step through building a lakehouse on your laptop with Dremio, Nessie, and MinIO. This is a great opportunity to try out many of the best features Dremio offers.

You’ll learn how to:

– Read and write Apache Iceberg tables on your object storage, cataloged by Nessie

– Create views in the semantic layer

– And much more

GDW Community Edition Workshop Description:

In this hands-on workshop, participants will build their own data lakehouse platform on their laptops. The workshop introduces three pivotal tools in the data lakehouse architecture, Dremio, Nessie, and Apache Iceberg, and guides participants through setting up and using each one. Together, these tools combine the flexibility of data lakes with the efficiency and ease of use of data warehouses, with the aim of making data management simpler and cheaper.

You will start by setting up a Docker environment to run all the necessary services: a notebook server, Nessie for catalog tracking with Git-like versioning, MinIO as an S3-compatible storage layer, and Dremio as the core lakehouse platform. The workshop then provides a practical, step-by-step guide to federating data sources, organizing and documenting data, and running queries with Dremio; tracking table changes and branching with Nessie; and creating, querying, and managing Apache Iceberg tables for an ACID-compliant data lakehouse.
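As a rough illustration of that service layout, the sketch below builds a Compose definition in Python and prints it as JSON (which Docker Compose can read, since JSON is valid YAML). It is not the workshop's actual file: the image tags, the `lakehouse` network name, and the `admin`/`password` credentials are assumptions chosen for illustration, the ports are the commonly documented defaults for each service, and the notebook server is left out because its image is not named here.

```python
import json


def build_compose() -> dict:
    """Return a minimal Compose definition for Dremio, Nessie, and MinIO.

    Assumptions (not taken from the workshop): image tags, container
    names, the 'lakehouse' network, and the placeholder credentials.
    """
    return {
        "services": {
            "dremio": {
                "image": "dremio/dremio-oss:latest",
                "container_name": "dremio",
                # 9047: web UI, 31010: ODBC/JDBC, 32010: Arrow Flight
                "ports": ["9047:9047", "31010:31010", "32010:32010"],
                "networks": ["lakehouse"],
            },
            "nessie": {
                "image": "projectnessie/nessie:latest",
                "container_name": "nessie",
                # 19120: Nessie REST API
                "ports": ["19120:19120"],
                "networks": ["lakehouse"],
            },
            "minio": {
                "image": "minio/minio:latest",
                "container_name": "minio",
                "environment": {
                    "MINIO_ROOT_USER": "admin",         # placeholder
                    "MINIO_ROOT_PASSWORD": "password",  # placeholder
                },
                # 9000: S3 API, 9001: web console
                "command": 'server /data --console-address ":9001"',
                "ports": ["9000:9000", "9001:9001"],
                "networks": ["lakehouse"],
            },
        },
        # A shared network lets the containers reach each other by name.
        "networks": {"lakehouse": {}},
    }


if __name__ == "__main__":
    print(json.dumps(build_compose(), indent=2))
```

Saving the printed output as `docker-compose.json` and running `docker compose -f docker-compose.json up` should start the three containers, assuming Docker is installed and the images can be pulled.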

The workshop requires Docker to be installed on your laptop. You will be taken through creating a docker-compose file to spin up the required services, configuring Dremio to connect with Nessie and MinIO, and finally executing SQL queries to manipulate and query data in your lakehouse.
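To give a feel for the SQL involved, here is a minimal sketch that generates the kind of statements you might paste into Dremio's SQL editor: create an Iceberg table in a Nessie catalog, insert a row, work on a branch, and merge it back. The source name `nessie`, the table `names`, the branch `dev`, and the sample rows are all hypothetical, and the exact branching syntax should be checked against the Dremio SQL reference for your version.

```python
def lakehouse_statements(
    source: str = "nessie",  # hypothetical name of the Nessie source in Dremio
    table: str = "names",    # hypothetical table name
    branch: str = "dev",     # hypothetical working branch
) -> list[str]:
    """Return example SQL statements in the order they would be run."""
    fq_table = f"{source}.{table}"
    return [
        # Create an Iceberg table in the Nessie catalog and add a row on main.
        f"CREATE TABLE {fq_table} (id INT, name VARCHAR)",
        f"INSERT INTO {fq_table} VALUES (1, 'example row on main')",
        # Create a branch and switch to it; changes there are isolated from main.
        f"CREATE BRANCH {branch} IN {source}",
        f"USE BRANCH {branch} IN {source}",
        f"INSERT INTO {fq_table} VALUES (2, 'example row on {branch}')",
        # Compare the table as seen from each branch.
        f"SELECT * FROM {fq_table} AT BRANCH main",
        f"SELECT * FROM {fq_table} AT BRANCH {branch}",
        # Publish the branch's changes to main.
        f"MERGE BRANCH {branch} INTO main IN {source}",
    ]


if __name__ == "__main__":
    for statement in lakehouse_statements():
        print(statement + ";")
```

Until the merge runs, the query against `main` should not see the row inserted on the branch, which is the Git-like isolation Nessie is meant to provide.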

This immersive session aims not just to educate but to give attendees the knowledge and tools needed to experiment with and implement their own data lakehouse solutions. By the end of the workshop, participants will have a functional data lakehouse environment on their laptops, so they can explore further and apply what they have learned to real-world scenarios. Whether you’re looking to improve your data management strategies or are curious about the data lakehouse architecture, this workshop will provide a solid foundation and practical experience.

Additional Resources:

  1. [GitHub] Laptop Lakehouse Tutorial – nessie_dremio.md
  2. [GitHub] Dremio Cloud Quality & Validations Examples – dremiocloudquality.md
  3. Laptop Lakehouse Tutorial
  4. Dremio Kubernetes and Helm Chart Directions
  5. Dremio REST API Reference
  6. Setting up a Dremio/Nessie Lakehouse on your Laptop for Evaluation in less than 10 minutes
  7. No Code Setup of a Data Lakehouse on your Laptop with Dremio & Minio using Docker Desktop
  8. Using DBT to manage Semantic Layer
  9. Using Dremio with Python

