
Running UDFs

Run a UDF and get results back.

Calling a UDF

UDFs are callable objects. Call them directly, like regular Python functions:

my_udf(
    *args,               # positional arguments
    engine=None,         # "remote" (default), "local", or instance type
    cache_max_age=None,  # max cache age, e.g. "48h", "10s"
    cache=True,          # set False to disable caching
    **kwargs,            # keyword arguments
)

See the Udf API reference for full details.

Loading a UDF

Use fused.load() to get a UDF object, then call it directly:

| Method | Syntax | Use case |
| --- | --- | --- |
| Your UDF | fused.load("my_udf") | UDFs you created |
| Team UDF | fused.load("team/my_udf") | Shared team UDFs |
| Public UDF | fused.load("UDF_Name") | Public UDFs (free) |
| Token | fused.load("fsh_***") | Share UDF without exposing code |
| Git commit | fused.load("github.com/.../tree/{hash}/") | Production stability |

# Load and call
my_udf = fused.load("my_udf")
result = my_udf(name="hello")

# Pin to a commit hash for production
commit_hash = "bdfb4d0"
my_udf = fused.load(f"https://github.com/fusedio/udfs/tree/{commit_hash}/public/My_UDF/")
result = my_udf()

Avoid pointing to the main branch: your UDF will change whenever others push to it.
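
One way to make pinning harder to get wrong is to build the URL from a validated commit hash. The sketch below is a hypothetical helper (`pinned_udf_url` is not part of Fused). It assumes the GitHub URL layout shown above and uses a simple heuristic, a 7 to 40 character lowercase hex string, to reject branch names such as "main".

```python
import re

# Hypothetical helper, not part of the Fused API. It only formats a URL in
# the layout shown above; it does not check that the commit exists.
_COMMIT_RE = re.compile(r"[0-9a-f]{7,40}")


def pinned_udf_url(repo: str, commit_hash: str, udf_path: str) -> str:
    """Return a GitHub tree URL pinned to `commit_hash`.

    Raises ValueError if `commit_hash` does not look like a commit hash,
    which rules out branch names such as "main".
    """
    if not _COMMIT_RE.fullmatch(commit_hash):
        raise ValueError(f"not a commit hash: {commit_hash!r}")
    return f"https://github.com/{repo}/tree/{commit_hash}/{udf_path.strip('/')}/"


url = pinned_udf_url("fusedio/udfs", "bdfb4d0", "public/My_UDF")
print(url)
```

The resulting string can then be passed to fused.load() in the same way as the f-string in the example above.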

Parameters

engine

Controls where your UDF runs:

| Value | Where it runs | RAM | Time limit |
| --- | --- | --- | --- |
| "remote" / "realtime" (default) | Serverless instance | ~10 GB | 120 s |
| "local" | Current process | | |
| "small" | Dedicated machine | 2 GB | None |
| "medium" | Dedicated machine | 64 GB | None |
| "large" | Dedicated machine | 512 GB | None |
| AWS/GCP type (e.g. "m5.4xlarge") | Dedicated machine | Varies | None |

result = my_udf(engine="medium")  # 64 GB RAM, no time limit
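
The table can be read as a decision rule: stay on the serverless engine while a job fits in roughly 10 GB and 120 s, otherwise move to the smallest dedicated machine with enough RAM. The following is a minimal sketch of that rule. `pick_engine` is a hypothetical helper, not part of Fused, and the limits are copied from the table rather than queried from the service.

```python
# Limits copied from the engine table; the remote RAM figure is approximate.
REMOTE_RAM_GB = 10
REMOTE_TIME_LIMIT_S = 120
DEDICATED_ENGINES = [("small", 2), ("medium", 64), ("large", 512)]  # (name, RAM in GB)


def pick_engine(ram_gb: float, runtime_s: float) -> str:
    """Hypothetical helper: pick an engine name from rough resource estimates."""
    if ram_gb <= REMOTE_RAM_GB and runtime_s <= REMOTE_TIME_LIMIT_S:
        return "remote"
    # Dedicated machines have no time limit, so only RAM matters here.
    for name, limit_gb in DEDICATED_ENGINES:
        if ram_gb <= limit_gb:
            return name
    raise ValueError("no named engine is large enough; pass an explicit instance type")


print(pick_engine(ram_gb=4, runtime_s=30))    # remote
print(pick_engine(ram_gb=40, runtime_s=30))   # medium
```

The returned name could be passed as engine=... in a call such as the one above. Treat the estimates as rough and leave headroom.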

What "local" means depends on where the call is made:

| Context | What happens |
| --- | --- |
| Inside a UDF | Shares that UDF's compute (120 s, ~10 GB) |
| Inside a batch job | Shares the batch instance resources |
| On your laptop | Runs on local machine |

Default engine

Set a default engine when defining a UDF:

@fused.udf(engine="large")
def my_udf():
    ...

In Workbench, UDFs with a batch engine will prompt for confirmation before running.

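All AWS instance types

| Instance Type | vCPUs | Memory (GB) |
| --- | --- | --- |
| t3.small | 2 | 2 |
| t3.medium | 2 | 4 |
| t3.large | 2 | 8 |
| t3.xlarge | 4 | 16 |
| t3.2xlarge | 8 | 32 |
| m5.large | 2 | 8 |
| m5.xlarge | 4 | 16 |
| m5.2xlarge | 8 | 32 |
| m5.4xlarge | 16 | 64 |
| m5.8xlarge | 32 | 128 |
| m5.12xlarge | 48 | 192 |
| m5.16xlarge | 64 | 256 |
| r5.large | 2 | 16 |
| r5.xlarge | 4 | 32 |
| r5.2xlarge | 8 | 64 |
| r5.4xlarge | 16 | 128 |
| r5.8xlarge | 32 | 256 |
| r5.12xlarge | 48 | 384 |
| r5.16xlarge | 64 | 512 |

All GCP instance types

| Instance Type | vCPUs | Memory (GB) |
| --- | --- | --- |
| c2-standard-4 | 4 | 16 |
| c2-standard-8 | 8 | 32 |
| c2-standard-16 | 16 | 64 |
| c2-standard-30 | 30 | 120 |
| c2-standard-60 | 60 | 240 |
| m3-ultramem-32 | 32 | 976 |
| m3-ultramem-64 | 64 | 1,952 |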
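
When a job needs a specific vCPU and memory combination, the instance tables can be searched programmatically. The sketch below hard-codes the AWS rows from the table above and returns the instance with the least memory that satisfies both minimums. `smallest_instance` is a hypothetical helper; "smallest" here means least memory, not lowest price, since pricing is not part of this page.

```python
# (vCPUs, memory in GB) per instance type, copied from the AWS table.
AWS_INSTANCES = {
    "t3.small": (2, 2), "t3.medium": (2, 4), "t3.large": (2, 8),
    "t3.xlarge": (4, 16), "t3.2xlarge": (8, 32),
    "m5.large": (2, 8), "m5.xlarge": (4, 16), "m5.2xlarge": (8, 32),
    "m5.4xlarge": (16, 64), "m5.8xlarge": (32, 128),
    "m5.12xlarge": (48, 192), "m5.16xlarge": (64, 256),
    "r5.large": (2, 16), "r5.xlarge": (4, 32), "r5.2xlarge": (8, 64),
    "r5.4xlarge": (16, 128), "r5.8xlarge": (32, 256),
    "r5.12xlarge": (48, 384), "r5.16xlarge": (64, 512),
}


def smallest_instance(min_vcpus: int, min_memory_gb: int, instances=AWS_INSTANCES) -> str:
    """Hypothetical helper: least-memory instance meeting both minimums.

    Ties are broken by vCPU count, then alphabetically by name.
    """
    candidates = [
        (memory, vcpus, name)
        for name, (vcpus, memory) in instances.items()
        if vcpus >= min_vcpus and memory >= min_memory_gb
    ]
    if not candidates:
        raise ValueError("no listed instance satisfies the request")
    return min(candidates)[2]


print(smallest_instance(min_vcpus=16, min_memory_gb=64))   # m5.4xlarge
print(smallest_instance(min_vcpus=4, min_memory_gb=100))   # r5.4xlarge
```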

cache_max_age

Control how long results are cached. UDFs are cached for 90 days by default.

| Value | Meaning |
| --- | --- |
| None (default) | Follow @fused.udf() setting (90 days) |
| "0s" | No caching |
| "10s", "48h", "1d" | Cache for specified duration |

result = my_udf(cache_max_age="1h")  # Cache for 1 hour
result = my_udf(cache=False)  # Disable caching
Tip: Set a default cache duration in the UDF decorator:

@fused.udf(cache_max_age="1d")
def my_udf():
    ...

Learn more about caching.
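
As a mental model for the duration strings, the sketch below converts values such as "48h" to seconds and checks whether a cached result of a given age would still be usable. It is not Fused's parser. `duration_to_seconds` and `is_fresh` are hypothetical names, and the unit set is an assumption: "s", "h" and "d" appear in the examples above, while "m" for minutes is added here as a guess.

```python
import re

# Assumed units. Only "s", "h" and "d" appear on this page; "m" is a guess.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}
_DURATION_RE = re.compile(r"(\d+)([smhd])")


def duration_to_seconds(value: str) -> int:
    """Convert a string like "48h" or "10s" to a number of seconds."""
    match = _DURATION_RE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]


def is_fresh(age_seconds: float, cache_max_age: str) -> bool:
    """True if a result cached `age_seconds` ago is younger than the max age."""
    return age_seconds < duration_to_seconds(cache_max_age)


print(duration_to_seconds("48h"))   # 172800
print(is_fresh(0, "0s"))            # False: "0s" never reuses a cached result
```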

Passing arguments

Pass UDF parameters as keyword arguments:

import fused

@fused.udf
def my_udf(name: str, count: int = 1):
    import pandas as pd
    return pd.DataFrame({"name": [name] * count})

result = my_udf(name="hello", count=3)
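
Because parameters are passed by keyword, a misspelled or missing name is an easy mistake. As a sketch, the standard-library inspect module can check keyword arguments against a function signature and fill in defaults before anything is sent for execution. `my_udf_body` is a plain-function stand-in with the same signature as the UDF above, and `bind_arguments` is a hypothetical helper. Whether Fused UDF objects expose a compatible signature is not confirmed here.

```python
import inspect


def my_udf_body(name: str, count: int = 1):
    """Plain-function stand-in with the same parameters as the UDF above."""
    return [name] * count


def bind_arguments(func, **kwargs) -> dict:
    """Hypothetical helper: resolve kwargs against func's signature.

    Raises TypeError for a missing required parameter or an unknown name.
    """
    bound = inspect.signature(func).bind(**kwargs)
    bound.apply_defaults()  # fill in defaults such as count=1
    return dict(bound.arguments)


print(bind_arguments(my_udf_body, name="hello"))  # {'name': 'hello', 'count': 1}
```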

Reserved parameters

The reserved parameters x, y, z, and bounds control how Fused structures the bounds object for tile UDFs. Pass either a tile index (x, y, z) or a bounds value, as shown below.

With x, y, z

overture_udf = fused.load("UDF_Overture_Maps_Example")
result = overture_udf(x=5241, y=12662, z=15)

With bounds as GeoDataFrame

import geopandas as gpd
bounds = gpd.GeoDataFrame.from_features({...})
result = overture_udf(bounds=bounds)

With bounds as bbox list

# [min_x, min_y, max_x, max_y]
result = overture_udf(bounds=[-122.349, 37.781, -122.341, 37.818])
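
To see how a tile index relates to a bbox list, the sketch below converts x, y, z to [min_x, min_y, max_x, max_y] in degrees. It assumes the standard XYZ ("slippy map") Web Mercator tiling scheme. This page does not state how Fused derives bounds from a tile index, so treat the scheme as an assumption. `tile_to_bbox` is a hypothetical helper.

```python
import math


def tile_to_bbox(x: int, y: int, z: int) -> list:
    """Return [min_x, min_y, max_x, max_y] in degrees for an XYZ tile.

    Assumes the standard slippy-map scheme: y grows southward and the
    zoom level z splits the world into 2**z by 2**z tiles.
    """
    n = 2 ** z

    def lon(col: int) -> float:
        return col / n * 360.0 - 180.0

    def lat(row: int) -> float:
        return math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * row / n))))

    # Row y is the tile's northern edge, row y + 1 its southern edge.
    return [lon(x), lat(y + 1), lon(x + 1), lat(y)]


bbox = tile_to_bbox(x=5241, y=12662, z=15)
print(bbox)
```

For the tile used in the example above, this gives a small box around longitude -122.41 and latitude 37.80, in the same area as the bbox list example.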

[Legacy]: fused.run()

The fused.run() function still works, but calling the UDF directly is preferred:

# Legacy
result = fused.run(my_udf, name="hello")

# Preferred
result = my_udf(name="hello")