4 minute read · November 27, 2023

New Array Functions in Dremio v24.3

Albert Vernon · Senior Product Manager, Dremio

Introduction

Data types in Dremio fall into two categories: primitive types such as INT and VARCHAR that hold single values, and semi-structured types like LIST, STRUCT, and MAP that hold complex values.

Arrays are variable-length lists whose elements all share a single type. They are indexed by non-negative integers and are useful for holding sparse data.
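As a rough illustration of integer indexing, the query below reads the first element of an array column. The table `sensor_readings` and its columns `device_id` and `readings` are hypothetical, and the sketch assumes positions start at 0, which is consistent with the non-negative indexes described above. Check the documentation for your Dremio version before relying on it.

```sql
-- Hypothetical table: sensor_readings(device_id, readings),
-- where readings is an array (LIST) of numeric values.
SELECT
  device_id,
  readings[0] AS first_reading  -- assumes the first position is index 0
FROM sensor_readings;
```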

Note: LIST and ARRAY are synonyms in Dremio, so you will see references to both in documentation, error messages, and function names.

Dremio Software v24.3 and the December 2023 update of Dremio Cloud introduce the functions below for manipulating array data.

New Array Functions

| Signature | Description |
| --- | --- |
| array_agg(expr) | Returns an array consisting of all values in expr. |
| array_append(A, E) | Returns a new array with E at the end of A. |
| array_distinct(A) | Returns a new array with only the distinct elements from A. |
| arrays_overlap(X, Y) | Returns whether X and Y have any elements in common. |
| array_prepend(E, A) | Returns a new array with E at the beginning of A. |
| array_to_string(A, S) | Returns A converted to a string by casting all values to strings and concatenating them, using S to separate the elements. |
| set_union(X, Y, ...) | Returns an array of the distinct values found across all of the input arrays. |

Examples

SELECT ARRAY_AGG(x) FROM (VALUES (1), (2), (3)) AS foo (x);
-- [1,2,3]
SELECT ARRAY_APPEND(ARRAY[1, 2], 3);
-- [1,2,3]
SELECT ARRAY_DISTINCT(ARRAY[1, 1, 2, 2, 3]);
-- [2,3,1] (the result does not necessarily follow the input order)
SELECT ARRAYS_OVERLAP(ARRAY[1, 2, 3], ARRAY[3, 4, 5]);
-- true
SELECT ARRAY_PREPEND(1, ARRAY[2, 3]);
-- [1,2,3]
SELECT ARRAY_TO_STRING(ARRAY[1, 2, 3], ':');
-- 1:2:3
SELECT SET_UNION(ARRAY[1, 1, 2], ARRAY[3, 3, 4]);
-- [2,3,4,1] (the result does not necessarily follow the input order)
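The functions also nest, since any function that returns an array can feed another. The statements below are a small sketch built only from the functions listed above. The results in the comments follow from the descriptions rather than from a captured run, and the inline VALUES table with the names foo, grp, and x is illustrative.

```sql
-- Prepend 0 to [1,2], then append 3 to the result.
SELECT ARRAY_APPEND(ARRAY_PREPEND(0, ARRAY[1, 2]), 3);
-- [0,1,2,3]

-- The two arrays share no element.
SELECT ARRAYS_OVERLAP(ARRAY[1, 2], ARRAY[3, 4]);
-- false

-- Build one array per group instead of one array for the whole input.
SELECT grp, ARRAY_AGG(x) AS xs
FROM (VALUES ('a', 1), ('a', 2), ('b', 3)) AS foo (grp, x)
GROUP BY grp;
```

The grouped query should return one row per distinct grp value. No output is shown for it because the order of elements inside each aggregated array is not stated in this article.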

Coming Soon

The following function is planned for Dremio Software v25.0 and the February update of Dremio Cloud:

| Signature | Description |
| --- | --- |
| array_frequency(A) | Returns a map where the keys are the unique elements in A, and the values are how many times the key appears. |
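Based only on the signature above, a call might look like the following sketch. The function was not yet released when this was written, so the statement is untested and the output formatting is not shown.

```sql
-- Sketch of the planned function. In the input, 1 appears twice,
-- 2 appears once, and 3 appears three times, so by the description
-- the resulting map should pair 1 with 2, 2 with 1, and 3 with 3.
SELECT ARRAY_FREQUENCY(ARRAY[1, 1, 2, 3, 3, 3]);
```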
