Write UDFs
Follow these steps to write a User Defined Function (UDF).
- Decorate a function with @fused.udf
- Declare the function logic
- Optionally cache parts of the function
- Set typed parameters to dynamically run based on inputs
- Import utility modules to keep your code organized
- Return a vector table or raster
- Save the UDF
@fused.udf decorator
First, decorate a Python function with @fused.udf to tell Fused to treat it as a UDF.
Function declaration
Next, structure the UDF's code. Declare import statements within the function body, express operations to load and transform data, and define a return statement. This UDF is called udf and returns a pd.DataFrame object.
@fused.udf  # <- Fused decorator
def udf(name: str = "Fused"):  # <- Function declaration
    import pandas as pd
    return pd.DataFrame({'message': [f'Hello {name}!']})
The UDF Builder in Workbench imports the fused module automatically. To write UDFs outside of Workbench, install the Fused Python SDK with pip install fused and import it with import fused.
Placing import statements within a function body (known as "local imports") is not common Python practice, but there are specific reasons to do so when constructing UDFs. UDFs are distributed to servers as self-contained units, and each unit must import every module it needs for execution. Because UDFs may be executed across many servers (tens, hundreds, or thousands), any time lost to importing unused modules is multiplied.
An exception to this convention is for modules used for function annotation, which need to be imported outside of the function being annotated.
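As a minimal sketch of this convention, the plain-Python function below (no Fused decorator; days_between is a hypothetical name) imports the module its annotations refer to at the top of the file and keeps the module used only in the body as a local import.

```python
import datetime as dt  # referenced by the annotations, so imported at module level


def days_between(start: dt.date, end: dt.date) -> int:
    # Local import: the module is loaded only when the function actually runs.
    import math

    seconds = (end - start).total_seconds()
    return math.floor(seconds / 86400)


print(days_between(dt.date(2024, 1, 1), dt.date(2024, 1, 31)))
```

The same layout applies to a decorated UDF: typing-related imports sit above the decorator, everything else inside the function body.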
@fused.cache decorator
Use the @fused.cache decorator to persist a function's output across runs so UDFs run faster.
@fused.udf  # <- Fused decorator
def udf(bounds: fused.types.Bounds = None, name: str = "Fused"):
    import pandas as pd

    @fused.cache  # <- Cache decorator
    def structure_output(name):
        return pd.DataFrame({'message': [f'Hello {name}!']})

    df = structure_output(name)
    return df
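The effect of persisting output across runs can be illustrated without Fused. The sketch below is a stand-in and not how @fused.cache is implemented: disk_cache, CACHE_DIR, and the pickle-based key are assumptions made for illustration. It stores a function's return value on disk, keyed by the function name and arguments, so a repeated call with the same arguments skips the function body.

```python
import functools
import hashlib
import pickle
import tempfile
from pathlib import Path

# Hypothetical cache location, chosen only for this sketch.
CACHE_DIR = Path(tempfile.gettempdir()) / "udf_cache_sketch"


def disk_cache(func):
    """Persist func's return value on disk, keyed by its name and arguments."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        key_bytes = pickle.dumps((func.__qualname__, args, sorted(kwargs.items())))
        path = CACHE_DIR / (hashlib.sha256(key_bytes).hexdigest() + ".pkl")
        if path.exists():
            # Cache hit: return the stored result without running func.
            return pickle.loads(path.read_bytes())
        result = func(*args, **kwargs)
        CACHE_DIR.mkdir(parents=True, exist_ok=True)
        path.write_bytes(pickle.dumps(result))
        return result

    return wrapper


calls = []  # records each time the function body actually runs


@disk_cache
def structure_output(name):
    calls.append(name)
    return {"message": [f"Hello {name}!"]}


first = structure_output("Fused")
second = structure_output("Fused")  # served from disk, body not executed again
```

Because the result lives on disk, a later run of the same script also finds it, which is the behavior the cache decorator is described as providing.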
Typed parameters
UDFs resolve input parameters to the types specified in their function annotations.
This example shows the bounds parameter typed as fused.types.Bounds and name as a string.
@fused.udf
def udf(
    bounds: fused.types.Bounds = None,  # <- Typed parameters
    name: str = "Fused"
):
    ...
To write a UDF that runs as both File and Tile, set bounds as the first parameter with None as its default value. The UDF can then be invoked as File (when bounds isn't passed) and as Tile (when it is). For example:
@fused.udf
def udf(bounds: fused.types.Bounds = None):
    ...
    return ...
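One way to structure the body of such a function is to branch on whether bounds was passed. The sketch below is plain Python without the Fused decorator, and it assumes that bounds arrives as four numbers in (xmin, ymin, xmax, ymax) order; the returned dictionaries are placeholders for real output.

```python
def udf(bounds=None, name="Fused"):
    if bounds is None:
        # File-style call: no extent was passed, so process the whole dataset.
        return {"mode": "file", "message": f"Hello {name}!"}

    # Tile-style call: restrict the work to the requested extent.
    # Assumption: bounds is (xmin, ymin, xmax, ymax).
    xmin, ymin, xmax, ymax = bounds
    return {"mode": "tile", "width": xmax - xmin, "height": ymax - ymin}


print(udf())
print(udf(bounds=[0, 0, 2, 1]))
```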
Supported types
Fused supports the native Python types int, float, bool, list, and dict. Parameters without a specified type are handled as strings by default.
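To make the idea of resolving inputs from annotations concrete, here is a minimal sketch of how raw string values (such as URL query parameters) could be converted using a function's annotations. It is not Fused's implementation; coerce_params and its parsing rules for bool, list, and dict are assumptions made for illustration.

```python
import inspect
import json


def coerce_params(func, raw):
    """Convert raw string values to the types annotated on func's parameters."""
    signature = inspect.signature(func)
    resolved = {}
    for name, value in raw.items():
        annotation = signature.parameters[name].annotation
        if annotation is inspect.Parameter.empty or annotation is str:
            # Untyped parameters stay as strings.
            resolved[name] = value
        elif annotation is bool:
            resolved[name] = value.lower() in ("true", "1")
        elif annotation in (list, dict):
            # Assumption: containers are passed as JSON text.
            resolved[name] = json.loads(value)
        else:
            # int, float, and similar types accept a string directly.
            resolved[name] = annotation(value)
    return resolved


def udf(zoom: int = 10, scale: float = 1.0, debug: bool = False,
        tags: list = None, label="x"):
    return zoom, scale, debug, tags, label


raw = {"zoom": "12", "scale": "0.5", "debug": "true", "tags": "[1, 2]", "label": "7"}
resolved = coerce_params(udf, raw)
```

Note that label, which has no annotation, keeps the string "7" rather than becoming an integer.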
The UDF Builder runs the UDF as a Map Tile if the first parameter is typed as fused.types.Bounds.
pd.DataFrame as JSON
Pass tables and geometries as serialized UDF parameters in HTTPS calls. Serialized JSON and GeoJSON parameters can be cast to a pd.DataFrame or gpd.GeoDataFrame. Note that although import statements are otherwise declared within the UDF function body, libraries used for typing must be imported at the top of the file.
import geopandas as gpd
import pandas as pd

@fused.udf
def udf(
    gdf: gpd.GeoDataFrame = None,
    df: pd.DataFrame = None
):
    ...
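The serialization round trip can be sketched with the standard library alone. The endpoint URL below is hypothetical and the parameter names df and gdf simply mirror the signature above; the point is only that a table travels as JSON records and a geometry table as a GeoJSON FeatureCollection inside the query string.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

# A small table as JSON records and a geometry table as GeoJSON.
records = [{"name": "a", "value": 1}, {"name": "b", "value": 2}]
feature_collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"name": "a"},
            "geometry": {"type": "Point", "coordinates": [-122.42, 37.8]},
        }
    ],
}

# Caller side: serialize each parameter and URL-encode it.
query = urlencode({"df": json.dumps(records), "gdf": json.dumps(feature_collection)})
url = f"https://example.com/run?{query}"  # hypothetical endpoint

# Receiving side: decode the query string back into Python objects.
params = parse_qs(urlsplit(url).query)
decoded_records = json.loads(params["df"][0])
decoded_features = json.loads(params["gdf"][0])
```

From here, pd.DataFrame(decoded_records) and gpd.GeoDataFrame.from_features(decoded_features["features"]) would produce the corresponding tables.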
Reserved parameters
When running a UDF with fused.run, it's possible to specify the map tile Fused will use to structure the bounds object by using the following reserved parameters.
With x, y, z parameters
fused.run("UDF_Overture_Maps_Example", x=5241, y=12662, z=15)
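To see which area a tile index refers to, the sketch below converts x, y, z into a longitude/latitude bounding box. It assumes the standard XYZ ("slippy map") Web Mercator tiling scheme, which is an assumption about how these indices are interpreted and not something stated above; tile_bounds is a hypothetical helper and not part of the Fused SDK.

```python
import math


def tile_bounds(x, y, z):
    """Return (west, south, east, north) in degrees for an XYZ tile."""
    n = 2 ** z  # number of tiles along each axis at this zoom level

    def lon(col):
        return col / n * 360.0 - 180.0

    def lat(row):
        return math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * row / n))))

    # Row numbers grow southward, so row y is the north edge of the tile.
    return lon(x), lat(y + 1), lon(x + 1), lat(y)


west, south, east, north = tile_bounds(5241, 12662, 15)
print(west, south, east, north)
```

Under that assumption, the tile in the call above covers a small area around longitude -122.42 and latitude 37.80.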