Gnarly Data Waves

Episode 38 | October 24, 2023

Building a Data Science Platform on Apache Iceberg and Nessie

Discover the future of data science and machine learning pipelines with Jacopo Tagliabue of Bauplan Labs in this webinar. Learn why modern data platforms are embracing Apache Iceberg and Nessie, and explore the transformative benefits of Nessie's git-like features for data management.

Join us for an insightful webinar featuring Jacopo Tagliabue of Bauplan Labs as he dives into the world of data science and machine learning pipelines. In this session, you’ll discover the rationale behind Bauplan Labs’ choice of open-source technologies, such as Apache Iceberg table format and Project Nessie transactional data catalog, for their cutting-edge platform. Gain valuable insights into why modern data platforms are increasingly adopting these technologies and how Nessie’s git-like features can revolutionize your data management. Don’t miss out on this opportunity to stay ahead in the world of data science and technology!

About Project Nessie – Introducing Nessie as a Dremio Source

Learn:

– Why Modern Data Platforms are being built on Apache Iceberg

– Why Modern Data Platforms are being built on Nessie

Speakers

Alex Merced

Alex Merced is a Senior Tech Evangelist for Dremio, a developer, and a seasoned instructor with a rich professional background. He has worked with companies like GenEd Systems, Crossfield Digital, CampusGuard, and General Assembly.

Alex is a co-author of the O’Reilly book “Apache Iceberg: The Definitive Guide.” With a deep understanding of the subject matter, Alex has shared his insights as a speaker at events including Data Day Texas, OSA Con, P99Conf, and Data Council.

Driven by a profound passion for technology, Alex has been instrumental in disseminating his knowledge through various platforms. His tech content can be found in blogs, videos, and his podcasts, Datanation and Web Dev 101.

Moreover, Alex Merced has made contributions to the JavaScript and Python communities by developing a range of libraries. Notable examples include SencilloDB, CoquitoJS, and dremio-simple-query, among others.

Jacopo Tagliabue

Jacopo Tagliabue is the founder of Bauplan Labs. Educated in several acronyms across the globe (UNISR, SFI, MIT), he was co-founder and CTO of Tooso, which was proudly serving predictions to millions of shoppers before being acquired by Coveo (TSX:CVO).

He led Coveo’s A.I. and MLOps roadmap from scale-up to IPO, and built out Coveo Labs, an agile, applied R&D practice rooted in world-class collaborations (Stanford, Bocconi, Outerbounds, Uber, Microsoft, NVIDIA), open source, and open science.

He talks *a lot*, and is often invited to do so by folks in industry (BBC, Walmart, Pinterest, eBay, Meta, Farfetch) and academia (SIRIP, CiE, KDD, Stanford, Harvard). He is currently an Adj. Professor of ML at NYU, which is mostly notable because it is the only job he has ever had that his parents (sort of) understand.

His A.I. work has been featured several times in the general press and presented in business and academic venues (including WWW, RecSys, and NAACL, where it won best paper at NAACL21).

In previous lives, he managed to do scienc-y things for a professional basketball team, simulate a pre-Columbian civilization, and give an academic talk on videogames (among other improbable “achievements”).
