4 minute read · October 12, 2023

Tabular User-Defined Functions Unveiled

Albert Vernon

Albert Vernon · Senior Product Manager, Dremio

Introduction

User-defined functions (UDFs) are callable routines that make it easier for you to write and reuse SQL logic across queries. In addition, UDFs let you extend the capabilities of Dremio SQL, provide a layer of abstraction to simplify query construction, and encapsulate business logic. UDFs can also serve as row and column policies for access control.

Dremio Cloud and Dremio Software v24.1 introduce tabular UDFs, which take zero or more input parameters, run the specified query, and return a set of rows.

Note: Tabular UDFs are sometimes known as table-valued UDFs in other query engines.

The remainder of this article describes how to create and use tabular user-defined functions in Dremio.

Creating a Tabular User-Defined Function

You create a tabular UDF with the following syntax:

CREATE FUNCTION
    function_name ( [ function_parameter [, ...] ] )
    RETURNS { TABLE ( column_spec [, ...] ) }
    RETURN { query } 
function_parameter:
    parameter_name data_type
column_spec:
    column_name data_type

Unlike scalar UDFs, the RETURNS clause of a tabular UDF uses the TABLE keyword to list the names and data types of one or more columns.

To learn more, see the documentation for Dremio Cloud and Dremio Software.

Below is an example that returns only the red-colored entries from a table named fruits. Note the usage of the TABLE keyword in the FROM clause.

CREATE TABLE $scratch.fruits (name, color)
    AS VALUES ('strawberry', 'red'), ('banana', 'yellow'), ('raspberry', 'red');
CREATE FUNCTION red_fruits()
    RETURNS TABLE (name VARCHAR, color VARCHAR)
    RETURN SELECT * FROM $scratch.fruits WHERE color = 'red';
SELECT name FROM TABLE (red_fruits());
-- strawberry
-- raspberry

Parameterized Views

Tabular UDFs are typically used like parameterized views, which refers to passing a value to a view to control what data the view returns. Although Dremio does not have specific syntax for parameterized views, you can get the same outcome with a tabular UDF that takes one or more parameters and then uses those parameters in the function body.

By modifying the red_fruits example above to take a parameter named color, the UDF passes the parameter to the WHERE clause of the query, which provides the same behavior as a parameterized view. Note the usage of the fully qualified identifier fruits.color in the WHERE clause to distinguish between the column name and the parameter name.

CREATE TABLE $scratch.fruits (name, color)
    AS VALUES ('strawberry', 'red'), ('banana', 'yellow'), ('raspberry', 'red');
CREATE FUNCTION find_fruits(color VARCHAR)
    RETURNS TABLE (name VARCHAR, color VARCHAR)
    RETURN SELECT * FROM $scratch.fruits WHERE fruits.color = color;
SELECT name FROM TABLE (find_fruits('red'));
-- strawberry
-- raspberry
SELECT name FROM TABLE (find_fruits('yellow'));
-- banana

Get Started with Dremio Cloud – It’s Free!

Dremio Cloud: The easy and open, fully managed data lakehouse platform.

Sign Up Now

Everything you need to build, automate, and query your data lakehouse in production.

Ready to Get Started?

Enable the business to create and consume data products powered by Apache Iceberg, accelerating AI and analytics initiatives and dramatically reducing costs.