
Run UDFs in Python

Run a UDF and get results back.

Signature

fused.run(
    udf,
    engine='remote',
    instance_type='realtime',
    cache_max_age=None,
    max_retry=0,
)

Parameters

udf — Ways to reference a UDF

Method | Syntax | Use case
Your UDF | fused.run("my_udf") | UDFs you created
Teammate's UDF | fused.run("teammate@fused.io/my_udf") | UDFs from your team
Team UDF | fused.run("team/my_udf") | Shared team UDFs
Public UDF | fused.run("UDF_Name") | Public UDFs (free)
Token | fused.run("fsh_***") | Share UDF without exposing code
Git commit | fused.run("github.com/.../tree/{hash}/") | Production stability

Pin to a commit hash for production:

commit_hash = "bdfb4d0"
udf = fused.load(f"https://github.com/fusedio/udfs/tree/{commit_hash}/public/My_UDF/")
fused.run(udf)

Avoid pointing to the main branch: your UDF will change whenever others push to it.
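
To make the pinning advice concrete, here is a minimal sketch of a helper that builds a commit-pinned URL and rejects branch names such as main. The function name pinned_udf_url and the hash check are illustrative assumptions, not part of the fused package; the resulting URL would then be passed to fused.load as in the snippet above.

```python
import re

# Base of the repository used in the example above.
BASE = "https://github.com/fusedio/udfs/tree"

def pinned_udf_url(commit_hash: str, path: str) -> str:
    """Hypothetical helper: build a GitHub tree URL pinned to a commit hash."""
    # Accept only short or full hex hashes (7 to 40 characters), so that
    # branch names like "main" are rejected before they reach production.
    if not re.fullmatch(r"[0-9a-f]{7,40}", commit_hash):
        raise ValueError(f"expected a commit hash, got {commit_hash!r}")
    return f"{BASE}/{commit_hash}/{path.strip('/')}/"

url = pinned_udf_url("bdfb4d0", "public/My_UDF")
print(url)
```

A branch name made up only of hex characters would still pass this check, so treat it as a guard against obvious mistakes rather than a guarantee.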

engine

Engine | Where it runs | Use case
remote (default) | Spins up new serverless instance | Standard usage
local | Current process | Run in existing compute

local contexts:

Context | What happens
Inside a UDF | Shares that UDF's compute (120s, ~4 GB)
Inside a batch job | Shares the batch instance resources
On your laptop | Runs on local machine

instance_type

Type | RAM | Time limit | Startup
realtime (default) | ~4 GB | 120s | ~5s
small | 2 GB | None | ~30s
medium | 64 GB | None | ~30s
large | 512 GB | None | ~30s

Default instance type

Set a default instance type when defining a UDF:

@fused.udf(instance_type="large")
def udf():
    ...

In Workbench, UDFs with a non-realtime instance type will prompt for confirmation before running as a batch job.
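
As a rough illustration of how the instance_type table might guide a choice, the sketch below picks the first listed type that satisfies a memory and runtime estimate. The helper pick_instance_type is hypothetical and not part of fused; it also treats the approximate "~4 GB" realtime figure as a hard 4 GB limit, which is an assumption.

```python
# RAM figures copied from the instance_type table (batch types only).
BATCH_RAM_GB = {"small": 2, "medium": 64, "large": 512}

# Limits listed for realtime; "~4 GB" is treated as exactly 4 here (assumption).
REALTIME_RAM_GB = 4
REALTIME_SECONDS = 120

def pick_instance_type(required_gb: float, expected_seconds: float) -> str:
    """Hypothetical helper: choose an instance_type from the table above."""
    # Prefer realtime when the job fits its RAM and time limits,
    # since it has the shortest startup (~5s vs ~30s).
    if required_gb <= REALTIME_RAM_GB and expected_seconds <= REALTIME_SECONDS:
        return "realtime"
    # Otherwise take the smallest batch type with enough memory.
    for name, ram_gb in BATCH_RAM_GB.items():
        if ram_gb >= required_gb:
            return name
    raise ValueError(f"no listed instance type offers {required_gb} GB")

print(pick_instance_type(1, 10))    # quick, small job
print(pick_instance_type(1, 600))   # small but long-running job
print(pick_instance_type(100, 10))  # memory-heavy job
```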

All AWS instance types

Instance Type | vCPUs | Memory (GB)
m5.large | 2 | 8
m5.xlarge | 4 | 16
m5.2xlarge | 8 | 32
m5.4xlarge | 16 | 64
m5.8xlarge | 32 | 128
m5.12xlarge | 48 | 192
m5.16xlarge | 64 | 256
r5.large | 2 | 16
r5.xlarge | 4 | 32
r5.2xlarge | 8 | 64
r5.4xlarge | 16 | 128
r5.8xlarge | 32 | 256
r5.12xlarge | 48 | 384
r5.16xlarge | 64 | 512
t3.small | 2 | 2
t3.medium | 2 | 4
t3.large | 2 | 8
t3.xlarge | 4 | 16
t3.2xlarge | 8 | 32

All GCP instance types

Instance Type | vCPUs | Memory (GB)
c2-standard-4 | 4 | 16
c2-standard-8 | 8 | 32
c2-standard-16 | 16 | 64
c2-standard-30 | 30 | 120
c2-standard-60 | 60 | 240
m3-ultramem-32 | 32 | 976
m3-ultramem-64 | 64 | 1,952

cache_max_age

Control how long results are cached. UDFs are cached for 90 days by default.

Value | Meaning
None (default) | Follow @fused.udf() setting (90 days)
"0s" | No caching
"10s", "48h", "1d" | Cache for specified duration

Tip: set a default cache duration in the UDF decorator:

@fused.udf(cache_max_age="1d")
def udf():
    ...

See Caching for more details on how caching works.
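
To compare cache durations written in the string form shown above, the following sketch converts them to seconds. It is not Fused's own parser: the helper duration_to_seconds is hypothetical, and it only handles the "s", "h" and "d" suffixes that appear in the table; any other units Fused may accept are not covered.

```python
import re

# Only the suffixes that appear in the table above are handled (assumption).
UNIT_SECONDS = {"s": 1, "h": 3600, "d": 86400}

def duration_to_seconds(value: str) -> int:
    """Hypothetical helper: convert strings like "48h" to a number of seconds."""
    match = re.fullmatch(r"(\d+)([shd])", value.strip())
    if match is None:
        raise ValueError(f"unsupported duration string: {value!r}")
    amount, unit = match.groups()
    return int(amount) * UNIT_SECONDS[unit]

for text in ["0s", "10s", "48h", "1d", "90d"]:
    print(text, "=", duration_to_seconds(text), "seconds")
```

Under this reading, "48h" and "2d" describe the same lifetime, and "0s" disables caching because the cached result expires immediately.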

sync

Value | Behavior
True (default) | Blocking call, returns result
False | Returns coroutine for async execution

# Async example
import asyncio

async def run_parallel(dates):
    # With sync=False each call returns a coroutine; gather awaits them concurrently
    tasks = [fused.run("my_udf", date=d, sync=False) for d in dates]
    return await asyncio.gather(*tasks)

Note: sync=False only works with engine='remote' and saved UDFs.
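
When fanning out many async calls, it can help to cap how many run at once. The sketch below shows one way to do that with asyncio.Semaphore. So that it runs without a Fused account, it uses fake_run, a hypothetical stand-in for fused.run("my_udf", sync=False, ...); the concurrency limit of 4 is likewise an arbitrary assumption.

```python
import asyncio

async def fake_run(udf_name: str, **params):
    """Hypothetical stand-in for fused.run(udf_name, sync=False, **params)."""
    await asyncio.sleep(0.01)  # simulate a remote call
    return {"udf": udf_name, **params}

async def run_parallel(dates, limit: int = 4):
    # The semaphore lets at most `limit` calls be in flight at the same time.
    semaphore = asyncio.Semaphore(limit)

    async def run_one(date):
        async with semaphore:
            return await fake_run("my_udf", date=date)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(run_one(d) for d in dates))

dates = ["2024-01-01", "2024-01-02", "2024-01-03"]
results = asyncio.run(run_parallel(dates))
print(results)
```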

max_retry

Number of retries on failure. Default: 0
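
One plausible reading of max_retry is "one initial attempt plus up to max_retry further attempts". The sketch below illustrates that reading with a plain Python wrapper; run_with_retry is a hypothetical helper, and the exact retry behavior inside fused.run (which errors are retried, any delay between attempts) is not described here.

```python
def run_with_retry(call, max_retry: int = 0):
    """Hypothetical helper: call once, then retry up to max_retry more times."""
    last_error = None
    for _ in range(max_retry + 1):
        try:
            return call()
        except Exception as error:  # a real client may retry only some errors
            last_error = error
    raise last_error

# A deliberately flaky function that succeeds on its third call.
attempts = {"count": 0}

def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

result = run_with_retry(flaky, max_retry=2)
print(result, "after", attempts["count"], "attempts")
```

With the default max_retry=0 the same call would raise on the first failure.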

Passing arguments

Pass UDF parameters as keyword arguments:

@fused.udf
def udf(name: str, count: int = 1):
    import pandas as pd
    return pd.DataFrame({"name": [name] * count})

fused.run(udf, name="hello", count=3)

Reserved parameters

These parameters control how Fused structures the bounds object for tile UDFs.

With x, y, z

fused.run("UDF_Overture_Maps_Example", x=5241, y=12662, z=15)

With bounds as GeoDataFrame

import geopandas as gpd
bounds = gpd.GeoDataFrame.from_features({...})
fused.run("UDF_Overture_Maps_Example", bounds=bounds)

With bounds as bbox list

# [min_x, min_y, max_x, max_y]
fused.run("UDF_Overture_Maps_Example", bounds=[-122.349, 37.781, -122.341, 37.818])
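
To relate the two forms, the sketch below converts an x, y, z tile index into a [min_x, min_y, max_x, max_y] list in degrees. It assumes the standard Web Mercator XYZ ("slippy map") tiling scheme, which this page does not state explicitly; tile_to_bounds is a hypothetical helper, not a fused function.

```python
import math

def tile_to_bounds(x: int, y: int, z: int) -> list[float]:
    """Hypothetical helper: XYZ tile index to [min_x, min_y, max_x, max_y]."""
    n = 2 ** z  # number of tiles along each axis at zoom z

    def lon(col: float) -> float:
        return col / n * 360.0 - 180.0

    def lat(row: float) -> float:
        # Inverse Web Mercator; rows count downward from the north edge.
        return math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * row / n))))

    # The tile's south edge is row y + 1 and its north edge is row y.
    return [lon(x), lat(y + 1), lon(x + 1), lat(y)]

bounds = tile_to_bounds(5241, 12662, 15)
print(bounds)
```

Under that assumption, the tile from the x, y, z example falls in the San Francisco area, the same region as the bbox list example.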

See also