Panel: Open Data Architecture

The Open Data Architecture panel closes Subsurface LIVE Summer 2021 with a lively discussion of the state of the cloud data lake, featuring some of the most influential creators of and contributors to key open source data lake software. Moderator and Gartner analyst Sanjeev Mohan opens the panel with highlights of recent cloud data lake industry trends. He then asks the open source panelists questions such as:

  1. Why data lakes have not, until now, delivered the fast data turnaround that was expected;
  2. What missteps have been made along the way, and what lessons have been learned from them;
  3. What each contributor's journey has been, and why they have succeeded.

Topics Covered

Apache Arrow Flight
Dremio Subsurface for Apache Arrow
Dremio Subsurface for Apache Parquet
In-Memory Formats
Metastores
Subsurface: Nessie Project Insights
Table Formats
Unlocking Potential with Apache Iceberg

Speakers

Sanjeev Mohan

Sanjeev Mohan is an established thought leader in the areas of cloud, big data and analytics. He researches and advises on changing trends and technologies in modern cloud data architectures. He started his data and analytics journey at Oracle, where he worked on emerging technologies. Until recently, he was a Gartner vice president known for his prolific and detailed research, and for directing the data and analytics agenda. Now a Principal at SanjMo, he provides advisory and consulting services covering modern data architectures, governance and operations. He regularly presents on topics pertaining to end-to-end data pipelines and is excited to help businesses discover what their data can do for them.

Ryan Blue

Ryan Blue is the co-creator of Apache Iceberg, and he works on open source data infrastructure. He is also an Avro, Parquet, and Spark committer.

Ryan Murray

Ryan Murray is an open source Engineering Lead at Dremio. He previously worked in the financial services industry in roles ranging from bond trader to data engineering lead. Ryan holds a PhD in theoretical physics and is an active open source contributor who dislikes it when data isn’t accessible in an organisation. He is passionate about making customers successful and self-sufficient, and still dreams of one day winning the Stanley Cup.

Julien Le Dem

Julien Le Dem is the Chief Architect of Astronomer and Co-Founder of Datakin. He co-created Apache Parquet and is involved in several open source projects, including OpenLineage, Marquez (LF AI & Data), Apache Arrow, Apache Iceberg, and others. Previously, he was a senior principal at WeWork; a principal architect at Dremio; a tech lead for Twitter’s data processing tools, where he obtained a two-character Twitter handle (@J_); and a principal engineer and tech lead working on content platforms at Yahoo, where he received his Hadoop initiation. His French accent makes his talks particularly attractive.

Wes McKinney

Wes McKinney is a software developer and entrepreneur focusing on analytical computing. He created the Python pandas project and is a co-creator of Apache Arrow. He authored two editions of the reference book, Python for Data Analysis. Wes is a member of The Apache Software Foundation and also a PMC member for Apache Parquet. He is now the CTO and co-founder of Voltron Data, a new startup working on accelerated computing technologies powered by Apache Arrow.
