Writing Data

Common examples for writing tabular data in Fused.

Geospatial data?

For geospatial formats (GeoParquet, GeoTIFF, etc.), see Writing Geospatial Data.

CSV

import fused

@fused.udf
def udf():
    import pandas as pd

    df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

    # Write to Fused managed storage; index=False omits the row index column
    df.to_csv("fd://my-data/output.csv", index=False)

    return "Saved!"
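
The `index=False` argument in the CSV example keeps the DataFrame's row index out of the file. A minimal sketch of the difference, rendering the CSV to a string locally instead of writing to `fd://` so that it runs without Fused (the variable names are illustrative only):

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_without_index = df.to_csv(index=False)
csv_with_index = df.to_csv()

print(csv_without_index)  # header row is "col1,col2"
print(csv_with_index)     # header row gains a leading empty column for the index
```

Unless the index carries meaning (for example, a timestamp), leaving it out usually gives a cleaner file for downstream readers.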

Parquet

Parquet is the recommended format for tabular data: it is columnar, compressed, and fast.

import fused

@fused.udf
def udf(path: str = "s3://fused-sample/demo_data/housing_2024.csv"):
    import pandas as pd

    df = pd.read_csv(path)

    # Derive price per unit area, rounded to 2 decimal places
    df['price_per_area'] = (df['price'] / df['area']).round(2)

    # Save to your Fused bucket
    username = fused.api.whoami()['handle']
    output_path = f"fd://{username}/housing_processed.parquet"
    df.to_parquet(output_path)

    return f"File saved to {output_path}"

JSON

import fused

@fused.udf
def udf():
    import pandas as pd

    df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

    # Write to Fused managed storage using the default JSON layout
    df.to_json("fd://my-data/output.json")

    return "Saved!"
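
For a DataFrame, `to_json` defaults to a column-oriented layout, which is not always what downstream consumers expect. A small sketch of the alternatives, rendered to strings locally so it runs without Fused; `orient` and `lines` are standard pandas parameters, and which layout suits a given pipeline is a judgment call:

```python
import json

import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

# Default layout: one object per column, keyed by row index.
by_column = df.to_json()

# One object per row, collected in a JSON array.
by_row = df.to_json(orient='records')

# Newline-delimited JSON: one row object per line.
json_lines = df.to_json(orient='records', lines=True)

print(json.loads(by_column))
print(json.loads(by_row))
print(json_lines)
```

Any of these calls accepts a path such as `"fd://my-data/output.json"` as its first argument, as in the JSON example above.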

Where to Save Files

Storage          Path Format               Best For
Fused S3 bucket  fd://path/file.parquet    Persistent storage, sharing
Mounted disk     /mnt/cache/file.parquet   Temporary files, caching
Your own S3      s3://bucket/file.parquet  Enterprise integration

See Cloud Storage for details on connecting your own buckets.
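
The three storage options differ only in their path prefix, so the destination can be selected with a parameter. A minimal sketch: `build_output_path` and `STORAGE_PREFIXES` are hypothetical helpers written for illustration, not part of the Fused API, and `bucket` is a placeholder bucket name.

```python
# Hypothetical mapping from a storage kind to the path prefixes listed above.
STORAGE_PREFIXES = {
    'fused': 'fd://',        # Fused S3 bucket: persistent storage, sharing
    'disk': '/mnt/cache/',   # Mounted disk: temporary files, caching
    's3': 's3://',           # Your own S3 bucket
}


def build_output_path(storage: str, relative_path: str) -> str:
    """Join a storage prefix and a relative path into a full output path."""
    if storage not in STORAGE_PREFIXES:
        raise ValueError(
            f"Unknown storage {storage!r}; expected one of {sorted(STORAGE_PREFIXES)}"
        )
    # Strip leading slashes so the prefix and path are joined exactly once.
    return STORAGE_PREFIXES[storage] + relative_path.lstrip('/')


print(build_output_path('fused', 'path/file.parquet'))
print(build_output_path('disk', 'file.parquet'))
print(build_output_path('s3', 'bucket/file.parquet'))
```

The returned string can be passed to `df.to_parquet`, `df.to_csv`, or `df.to_json` in the same way as the literal paths in the examples above.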