Load & Export Data

Common examples for loading and saving data in Fused.

Load Data

pandas

@fused.udf
def udf(path: str = "s3://fused-sample/demo_data/housing_2024.csv"):
    import pandas as pd

    return pd.read_csv(path)

duckdb

@fused.udf
def udf(path: str = "s3://fused-sample/demo_data/housing_2024.parquet"):
    import duckdb

    conn = duckdb.connect()
    result = conn.execute(f"""
        SELECT *
        FROM '{path}'
        LIMIT 10
    """).df()

    return result

From other UDFs

@fused.udf
def udf(bounds: fused.types.Bounds):
    overture_udf = fused.load('https://github.com/fusedio/udfs/tree/main/public/Overture_Maps_Example/')
    buildings = fused.run(overture_udf, bounds=bounds, theme='buildings', overture_type='building')

    return buildings

Download data to shared Fused mount

@fused.udf
def udf(url='https://www2.census.gov/geo/tiger/TIGER_RD18/STATE/11_DISTRICT_OF_COLUMBIA/11/tl_rd22_11_bg.zip'):
    out_path = fused.download(url=url, file_path='out.zip')
    return str(out_path)

Files will be written to /mount/tmp/, where any other UDF can then access them.

Read more about fused.download() here
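Since the example above downloads a .zip archive, a later step usually needs to see what is inside it. The following is a minimal sketch using only the Python standard library; list_archive is a hypothetical helper name, and the temporary demo archive stands in for the path that fused.download() returns inside a UDF.

```python
import tempfile
import zipfile
from pathlib import Path


def list_archive(zip_path: str) -> list[str]:
    """Return the names of all members stored in a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return archive.namelist()


# Demo with a throwaway archive; in a UDF, pass the downloaded file's path instead.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "out.zip"
    with zipfile.ZipFile(demo_path, "w") as archive:
        archive.writestr("example.txt", "hello")
    print(list_archive(str(demo_path)))
```

Listing the members first makes it easier to pick the file to read without extracting the whole archive.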

Snowflake

Store SNOWFLAKE_USER and SNOWFLAKE_PASSWORD in the Fused secrets management UI first.

@fused.udf
def udf(query: str = 'SELECT CURRENT_VERSION()'):
    import snowflake.connector
    import pandas as pd

    try:
        conn = snowflake.connector.connect(
            user=fused.secret('SNOWFLAKE_USER'),
            password=fused.secret('SNOWFLAKE_PASSWORD'),
            account='your_account_identifier',
            warehouse='your_warehouse',
            database='your_database',
            schema='your_schema'
        )

        # Execute query and return as DataFrame
        cursor = conn.cursor()
        cursor.execute(query)

        # Use pandas to read directly from cursor
        df = cursor.fetch_pandas_all()

        cursor.close()
        conn.close()

        return df

    except Exception as e:
        print(f"Snowflake connection failed: {e}")
        raise

Read more about Snowflake's authentication
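In the UDF above, the cursor and connection are closed only when the query succeeds. One way to guarantee cleanup on errors as well is contextlib.closing, which calls close() on exit for any object that has that method. The sketch below is an illustration rather than Fused's or Snowflake's recommended pattern: it uses the standard library's sqlite3 as a stand-in because a Snowflake connection needs real credentials, and run_query is a hypothetical helper name.

```python
import sqlite3
from contextlib import closing


def run_query(query: str) -> list[tuple]:
    """Run a query and always close the cursor and connection afterwards."""
    # sqlite3 in-memory database is a stand-in for snowflake.connector.connect(...)
    with closing(sqlite3.connect(":memory:")) as conn:
        with closing(conn.cursor()) as cursor:
            cursor.execute(query)
            return cursor.fetchall()


print(run_query("SELECT 1 + 1"))
```

Both close() calls run even if execute() raises, so a failed query does not leave a connection open.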

Export Data

Fused managed storage: fd://

df.to_parquet("fd://my-dataset/data.parquet")

Read more about fd:// S3 Bucket

Fused mount disk: /mnt/cache

df.to_parquet("/mnt/cache/data.parquet")

Read more about /mnt/cache mount disk

AWS S3: s3://

df.to_parquet("s3://my-bucket/data.parquet")

Google Cloud Storage: gcs://

df.to_parquet("gcs://my-bucket/data.parquet")

Use as API (No file saving required)

You can call your UDFs directly as HTTPS APIs, so there is no need to save the data to a file first.

Fused creates a Shared Token for your UDF the first time you save it. Set the format query parameter on the resulting HTTPS endpoint to choose the output format:

https://fused.io/.../run/file?format=json

Tabular data downloads:

?format=csv      # CSV download
?format=geojson  # GeoJSON download
?format=parquet  # Parquet download
?format=json     # JSON download
?format=mvt      # Mapbox Vector Tile download

Image data downloads:

?format=png      # PNG image
?format=tiff     # GeoTIFF download
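When the endpoint URL is assembled in code, the format parameter can be set without hand-editing the query string. The sketch below uses only the standard library; with_format is a hypothetical helper and the example.com URL is a placeholder for your own Shared Token endpoint.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_format(endpoint: str, fmt: str) -> str:
    """Return the endpoint URL with its `format` query parameter set to fmt."""
    parts = urlsplit(endpoint)
    query = dict(parse_qsl(parts.query))
    query["format"] = fmt  # overrides any format already present
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL: substitute the endpoint of your own UDF.
ENDPOINT = "https://example.com/run/file"
print(with_format(ENDPOINT, "csv"))
```

Other query parameters already on the URL are preserved, so the same helper works for endpoints that carry UDF arguments.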

Integrations

Call your UDFs from other tools after creating a Shared Token for your UDF:

DuckDB

select * from read_parquet('https://fused.io/.../run/file?');

Curl

curl -L -XGET 'https://fused.io/.../run/file?'
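The same GET request can be made from Python with the standard library alone. This is a minimal sketch: build_request and fetch_bytes are hypothetical helper names, the example.com URL is a placeholder for your own endpoint, and urlopen follows redirects by default, much like curl -L.

```python
import urllib.request


def build_request(endpoint: str) -> urllib.request.Request:
    """Create a GET request for a UDF endpoint without sending it."""
    return urllib.request.Request(endpoint, method="GET")


def fetch_bytes(endpoint: str, timeout: float = 30.0) -> bytes:
    """Send the request and return the raw response body."""
    with urllib.request.urlopen(build_request(endpoint), timeout=timeout) as response:
        return response.read()


# Placeholder URL: substitute the endpoint of your own UDF before calling fetch_bytes.
request = build_request("https://example.com/run/file?format=json")
print(request.get_method(), request.full_url)
```

Returning raw bytes keeps the helper independent of the chosen format; decode or parse the body according to the format parameter you requested.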

Google Sheets

Make sure the sheet's sharing setting is "Viewer" for "Anyone with the link":

=IMPORTDATA("https://fused.io/.../run/file?")

Notion

  • Use an /embed block with the UDF endpoint: 'https://fused.io/.../run/file?'