In this one-of-a-kind article, Dremio CEO Tomer Shiran explains the key elements of an IoT data pipeline.
This is a great takeaway:
“The basic pipeline scenario works well in small IoT environments where only hundreds of rows of measurements are captured daily, but in large industrial environments, where hundreds of gigabytes of data are generated every hour, the story is different. In such large-scale scenarios, we need to worry about how many copies of data there are, who has access to them, what managed services are required to maintain them and so on. Additionally, the traditional methods of accessing, curating, securing and accelerating data break down at scale.
Fortunately, there is an alternative: a governed, self-service environment where users can find and interact directly with their data, thus speeding time to insights and raising overall analytical productivity. Data lake engines provide ways for users to use standard SQL queries to search the data they want to work with without having to wait for IT to point them in the right direction. Additionally, data lake engines provide security measures that allow enterprises to have full control over what is happening to the data and who can access it, thus increasing trust in their data.”
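To make that self-service idea a bit more concrete, here is a minimal sketch of an analyst querying IoT measurements directly in a data lake with standard SQL. It assumes a data lake engine reachable through an ODBC DSN named "datalake"; the DSN, schema, table, and column names are hypothetical placeholders, not part of the article.

```python
# Minimal sketch: querying IoT measurements in a data lake with standard SQL.
# Assumes a data lake engine exposed via an ODBC DSN named "datalake";
# the schema, table, and column names below are hypothetical.
import pyodbc

conn = pyodbc.connect("DSN=datalake", autocommit=True)
cursor = conn.cursor()

# Analysts query the raw measurements in place, without waiting for IT
# to build and hand over a curated extract first.
cursor.execute("""
    SELECT device_id,
           AVG(temperature) AS avg_temp,
           COUNT(*)         AS readings
    FROM   iot.sensor_readings
    WHERE  reading_time >= CURRENT_DATE - INTERVAL '1' DAY
    GROUP BY device_id
    ORDER BY avg_temp DESC
""")

for device_id, avg_temp, readings in cursor.fetchall():
    print(device_id, avg_temp, readings)

conn.close()
```

The point of the sketch is the access pattern, not the specific driver: the user works against the governed data lake directly, and the engine enforces who can see which tables.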