
Get started with Fused!

Use the Fused Python SDK to run a workflow in your favorite IDE, Jupyter Notebook, or Python environment.

Fast Track ⏱️

Understand Fused in 5 interactive minutes!

This quick tutorial introduces key concepts and basic features of Fused to help you get started with your workflows!

Concepts

Fused, as the name suggests, is a tool to fuse any data, with any tool, at any scale. At its core is the User Defined Function (UDF). UDFs are the building blocks of composable geospatial operations. They are Python functions that Fused enhances with the ability to spatially filter datasets of any size and run in a serverless cloud, among other benefits you'll discover as you go deeper into the ecosystem.

In this tutorial, you'll first install the Fused Python SDK and progressively write and run UDFs.

Install the Fused Python SDK.

# !pip install fused -q

Your first UDF

UDFs are Python functions that get superpowers with the @fused.udf decorator.

As a starting point, this cell shows how to declare a UDF and run it in your local environment with fused.run. Try running and modifying the UDF's code.

import fused

@fused.udf
def udf():
    import pandas as pd
    return pd.DataFrame({"answer": [42]})

fused.run(udf, engine='local')
Out:

42
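
If decorators are new to you, the pattern can be sketched with the standard library alone. The names `toy_udf` and `toy_run` below are hypothetical stand-ins invented for this illustration; they are not part of the Fused SDK and say nothing about how `@fused.udf` or `fused.run` work internally. The sketch only shows the general idea: a decorator tags a plain function, and a runner calls it with parameters.

```python
import functools


def toy_udf(func):
    """Hypothetical decorator: tags a plain function so a runner can recognize it."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    # Metadata a runner could inspect (illustrative only).
    wrapper.is_udf = True
    wrapper.udf_name = func.__name__
    return wrapper


def toy_run(udf, **params):
    """Hypothetical runner: refuses untagged functions, then calls the UDF."""
    if not getattr(udf, "is_udf", False):
        raise TypeError("expected a function decorated with @toy_udf")
    return udf(**params)


@toy_udf
def answer(multiplier=1):
    return {"answer": [42 * multiplier]}


result = toy_run(answer)
doubled = toy_run(answer, multiplier=2)
print(result, doubled)
```

The decorated function stays an ordinary Python function, which is why you can keep editing and rerunning it like any other code.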

Run a community UDF

It's also possible to load UDFs stored in a different environment, which makes it easy to reproduce and reuse code.

This cell uses fused.load to import a UDF from the UDF catalog GitHub repo.

import fused

udf = fused.load("https://github.com/fusedio/udfs/tree/main/public/DuckDB_NYC_Example")
gdf = fused.run(udf)
gdf.head()
        lng      lat   cnt                    geometry  fused_index
0  -73.9697  40.8007   161  POINT (-73.96967 40.80067)            0
1   -73.982  40.7573  1281  POINT (-73.98200 40.75733)            1
2  -73.9873   40.725  1452  POINT (-73.98733 40.72500)            2
3   -73.998  40.7203   696  POINT (-73.99800 40.72033)            3
4  -74.0053  40.7407  4268  POINT (-74.00533 40.74067)            4

Read data

Now, we need some data.

Fused runs any Python code, so you can leverage popular open source libraries like GeoPandas to interact with data.

This UDF shows an easy way to create a GeoDataFrame with census data hosted in a public catalog.

@fused.udf
def udf():
    import geopandas as gpd

    # Shape file as zip
    url = "https://www2.census.gov/geo/tiger/TIGER_RD18/STATE/11_DISTRICT_OF_COLUMBIA/11/tl_rd22_11_bg.zip"
    gdf = gpd.read_file(url)
    return gdf


gdf = fused.run(udf=udf, engine="local")
gdf.head()
   STATEFP  COUNTYFP  TRACTCE  BLKGRPCE         GEOID       NAMELSAD  MTFCC  FUNCSTAT    ALAND   AWATER  INTPTLAT  INTPTLON  geometry
0       11         1    10800         1  110010108001  Block Group 1  G5030         S   112810        0   38.9006  -77.0475  POLYGON ((-77.05014 38.90033, -77.05013 38.900...
1       11         1    10900         2  110010109002  Block Group 2  G5030         S  2270174  2933566   38.8132  -77.0238  POLYGON ((-77.03919 38.80050, -77.03913 38.800...
2       11         1     7401         1  110010074011  Block Group 1  G5030         S  1029053   200980   38.8668  -76.9949  POLYGON ((-77.00540 38.86879, -77.00341 38.870...
3       11         1     7403         1  110010074031  Block Group 1  G5030         S   126738        0   38.8481  -76.9774  POLYGON ((-76.98127 38.84662, -76.98098 38.846...
4       11         1     7404         1  110010074041  Block Group 1  G5030         S   360630        0   38.8515  -76.9785  POLYGON ((-76.98334 38.85337, -76.98277 38.853...

Basic operations

You can use Fused to conduct basic geospatial operations on any size dataset. Here are some examples.

@fused.udf
def udf():
    import geopandas as gpd

    # Shape file as zip
    url = "https://www2.census.gov/geo/tiger/TIGER_RD18/STATE/11_DISTRICT_OF_COLUMBIA/11/tl_rd22_11_bg.zip"
    gdf = gpd.read_file(url)

    # Reproject Coordinate Reference System
    gdf = gdf.to_crs(epsg=3857)

    # Calculate areas
    gdf["area"] = gdf.area

    # Get centroid
    gdf["centroid"] = gdf.centroid

    # Calculate buffer (distance is in CRS units: meters in EPSG:3857)
    gdf["buffered"] = gdf.buffer(0.0001)

    # Calculate convex hull
    gdf["convex_hull"] = gdf.convex_hull

    return gdf[['GEOID', 'area', 'centroid', 'buffered', 'convex_hull']].head()


gdf = fused.run(udf=udf, engine="local")
gdf.head()
          GEOID         area     centroid                                           buffered                                        convex_hull
0  110010108001       186531  POINT (...)  POLYGON ((-8577181.905 4707405.025, -8577181.9...  POLYGON ((-8577181.348 4707296.602, -8577181.9...
1  110010109002   8.5829e+06  POINT (...)  POLYGON ((-8575963.513 4693134.492, -8575963.5...  POLYGON ((-8575942.919 4691870.298, -8575963.5...
2  110010074011  2.03183e+06  POINT (...)  POLYGON ((-8572201.582 4702894.684, -8572201.5...  POLYGON ((-8571463.757 4701282.178, -8571905.0...
3  110010074031       209252  POINT (...)  POLYGON ((-8569516.111 4699724.528, -8569516.1...  POLYGON ((-8569447.649 4699551.296, -8569522.6...
4  110010074041       595478  POINT (...)  POLYGON ((-8569745.986 4700689.222, -8569745.9...  POLYGON ((-8569408.688 4699765.406, -8569745.9...
gdf.buffered.plot()
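
One thing worth knowing when reading the area column: areas computed in EPSG:3857 (Web Mercator) are inflated relative to ground area, by roughly 1 / cos²(latitude). The sketch below checks this against numbers already shown in this tutorial, using the ALAND, AWATER, and INTPTLAT values from the census table and the area values from the output above. The helper name `mercator_area_factor` is made up for this example, and the spherical approximation is an assumption, so expect close agreement rather than an exact match.

```python
import math


def mercator_area_factor(lat_deg):
    """Approximate area inflation of Web Mercator at a given latitude (spherical model)."""
    return 1.0 / math.cos(math.radians(lat_deg)) ** 2


# (GEOID, ALAND + AWATER in square meters, INTPTLAT, area reported in EPSG:3857)
rows = [
    ("110010108001", 112810 + 0, 38.9006, 186531),
    ("110010109002", 2270174 + 2933566, 38.8132, 8.5829e6),
]

estimates = []
for geoid, ground_area, lat, mercator_area in rows:
    estimate = ground_area * mercator_area_factor(lat)
    estimates.append(estimate)
    rel_diff = abs(estimate - mercator_area) / mercator_area
    print(f"{geoid}: estimated {estimate:,.0f} vs reported {mercator_area:,.0f} "
          f"(relative difference {rel_diff:.2%})")
```

If you need ground-accurate areas, a common choice is to reproject to an equal-area or local projected CRS before calling `.area`.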

Spatial filtering ⚡

At this point, you might be asking: why would I use Fused to run basic spatial operations?

Fused gives your Python functions the ability to spatially filter datasets. It uses the bbox parameter to perform a spatial query and return only the portion of the data needed for the analysis.

When you run a UDF in parallel and on a serverless realtime engine, you suddenly have the ability to run simple Python code over datasets of any size. This takes you from code to map, instantly.
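
To make the idea of a bounding-box query concrete, here is a minimal standard-library sketch. `BBox` and `filter_points` are hypothetical helpers written for this illustration, not part of the Fused SDK, and this is not how Fused implements spatial filtering. The sample points are the lng/lat pairs from the DuckDB_NYC_Example output earlier in this tutorial, and the box coordinates are an arbitrary choice.

```python
from typing import List, NamedTuple, Tuple


class BBox(NamedTuple):
    """Axis-aligned bounding box in lng/lat degrees (hypothetical helper)."""
    min_lng: float
    min_lat: float
    max_lng: float
    max_lat: float


def filter_points(points: List[Tuple[float, float]], bbox: BBox) -> List[Tuple[float, float]]:
    """Keep only the (lng, lat) points that fall inside the bounding box."""
    return [
        (lng, lat)
        for lng, lat in points
        if bbox.min_lng <= lng <= bbox.max_lng and bbox.min_lat <= lat <= bbox.max_lat
    ]


# (lng, lat) pairs taken from the geometry column of the earlier example output.
points = [
    (-73.96967, 40.80067),
    (-73.98200, 40.75733),
    (-73.98733, 40.72500),
    (-73.99800, 40.72033),
    (-74.00533, 40.74067),
]

# Arbitrary box chosen for this example.
box = BBox(min_lng=-74.01, min_lat=40.70, max_lng=-73.98, max_lat=40.73)

selected = filter_points(points, box)
print(selected)
```

Only the points inside the box are returned, so any downstream work scales with the size of the box rather than the size of the full dataset.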

@fused.udf
def udf(bbox):
    table_path = "s3://fused-asset/infra/building_msft_us"
    utils = fused.load("https://github.com/fusedio/udfs/tree/f928ee1/public/common/").utils
    gdf = utils.table_to_tile(bbox, table=table_path)

    # Calculate buffer
    return gdf.buffer(0.0001)

gdf = fused.run(udf=udf, x=9648, y=12320, z=15, engine="local")
# gdf = fused.run(udf=udf, x=9648, y=12320, z=15, engine="realtime") # Running on the realtime engine requires authentication
gdf.head()
                                              geometry  fused_index
235  POLYGON ((-73.99783 40.70978, -73.99785 40.709...            0
239  POLYGON ((-73.99870 40.71098, -73.99864 40.711...            1
250  POLYGON ((-74.00152 40.70789, -74.00182 40.708...            2
253  POLYGON ((-74.00438 40.70812, -74.00438 40.708...            3
254  POLYGON ((-74.00429 40.70665, -74.00391 40.706...            4
gdf.plot()
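
The x, y, and z arguments passed to fused.run look like standard XYZ ("slippy map") tile indices. Assuming that is the case, the sketch below converts a tile index to a lng/lat bounding box with the usual Web Mercator tile formulas. `tile_bounds` is a hypothetical helper written for this example, not a Fused function. For x=9648, y=12320, z=15 it gives a box over Lower Manhattan, which is consistent with the coordinates in the output above; buildings that straddle the tile edge can extend slightly beyond it.

```python
import math


def tile_bounds(x, y, z):
    """Return (west, south, east, north) in degrees for an XYZ tile (hypothetical helper)."""
    n = 2 ** z  # number of tiles along each axis at zoom level z

    def lng(col):
        return col / n * 360.0 - 180.0

    def lat(row):
        # Inverse Web Mercator: row 0 is the northern edge of the map.
        return math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * row / n))))

    return lng(x), lat(y + 1), lng(x + 1), lat(y)


west, south, east, north = tile_bounds(9648, 12320, 15)
print(f"west={west:.5f}, south={south:.5f}, east={east:.5f}, north={north:.5f}")
```

Each step up in zoom level halves the width and height of the tile, so higher z values request smaller pieces of the dataset.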

What's next?

Congratulations! You're off to a great start with Fused. 🎉

With Fused you can do much more than what has been introduced so far: load data from Google Earth Engine and DuckDB, run operations on rasters and vectors, export data to Lonboard and Streamlit, and more. Head over to the Python SDK documentation to learn what's possible, and join the community on Discord.

Once you feel ready, we invite you to share the UDFs you create with the community!

Welcome aboard! 🚢