Writing files

Common examples for writing tabular data in Fused.

Geospatial data?

For geospatial formats (GeoParquet, GeoTIFF, etc.), see Writing Geospatial Data.

CSV

import fused

@fused.udf
def udf():
    import pandas as pd

    df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

    # index=False keeps the DataFrame's row index out of the file
    df.to_csv("s3://fused-users/fused/aman/my-data/output.csv", index=False)  # change to your own path

    return "Saved!"
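
To see what `index=False` changes, here is a small sketch that writes to a string instead of S3 so it can run anywhere. It relies on `to_csv` returning the CSV text when no path is given; the data is the same toy DataFrame as above.

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})

# With no path argument, to_csv returns the CSV text instead of writing a file.
without_index = df.to_csv(index=False)
with_index = df.to_csv()  # default index=True adds an unnamed first column

print(without_index.splitlines())  # ['col1,col2', '1,3', '2,4']
print(with_index.splitlines())     # [',col1,col2', '0,1,3', '1,2,4']
```

Dropping the index is usually what you want unless the index carries meaningful labels.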

Parquet

Parquet is the recommended format for tabular data: it's columnar, compressed, and fast.

import fused

@fused.udf
def udf():
    import pandas as pd

    # Read the sample CSV and derive a new column
    path = "s3://fused-sample/demo_data/housing_2024.csv"
    df = pd.read_csv(path)
    df['price_per_area'] = (df['price'] / df['area']).round(2)

    # Write the result as Parquet
    output_path = "s3://fused-users/fused/aman/housing_processed.parquet"  # change to your own path
    df.to_parquet(output_path)

    return f"File saved to {output_path}"
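
One practical reason to prefer Parquet is that it stores column types alongside the data, whereas CSV stores only text. The sketch below illustrates the CSV side of that trade-off with an in-memory round trip. The column names and values are made up for illustration and are not taken from the housing dataset.

```python
import io

import pandas as pd

df = pd.DataFrame({
    "price": [250000.0, 410000.0],
    "sold_on": pd.to_datetime(["2024-01-15", "2024-03-02"]),  # made-up sample values
})

# Round-trip through CSV text in memory (no S3 access needed).
restored = pd.read_csv(io.StringIO(df.to_csv(index=False)))

print(df["sold_on"].dtype)        # a datetime64 dtype
print(restored["sold_on"].dtype)  # no longer a datetime dtype: read back as text
```

With CSV you would need to pass `parse_dates=["sold_on"]` to `read_csv` to recover the type. A Parquet file records it in the file's schema.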

JSON

import fused

@fused.udf
def udf():
    import pandas as pd

    df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

    df.to_json("s3://fused-users/fused/aman/my-data/output.json")  # change to your own path

    return "Saved!"
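
The layout of the JSON file depends on the `orient` argument of `to_json`. As a rough sketch, the following compares the default layout (keyed by column) with `orient="records"` (one object per row), which many downstream consumers expect. It returns the JSON as a string rather than writing to S3.

```python
import json

import pandas as pd

df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})

# With no path argument, to_json returns the JSON text.
by_column = json.loads(df.to_json())                  # default orient="columns"
by_record = json.loads(df.to_json(orient="records"))  # one object per row

print(by_column)  # {'col1': {'0': 1, '1': 2}, 'col2': {'0': 3, '1': 4}}
print(by_record)  # [{'col1': 1, 'col2': 3}, {'col1': 2, 'col2': 4}]
```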

Where to Save Files

| Storage | Path Format | Best For |
| --- | --- | --- |
| Fused home directory | s3://fused-users/fused/aman/path/file.parquet | Persistent storage, sharing |
| Mounted disk | /mnt/cache/file.parquet | Temporary files, caching |
| Your own S3 | s3://bucket/file.parquet | Enterprise integration |

See Cloud Storage for details on connecting your own buckets.
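
If a UDF accepts an output path as a parameter, it can help to tell S3 destinations apart from mounted-disk paths before writing. The helper below is a hypothetical sketch using only the standard library. `describe_destination` is not part of the Fused API, and the paths are the examples from the table above.

```python
from urllib.parse import urlparse


def describe_destination(path: str) -> dict:
    """Hypothetical helper: classify an output path as S3 or local."""
    parsed = urlparse(path)
    if parsed.scheme == "s3":
        # s3://bucket/key -> netloc is the bucket, the rest is the object key
        return {"kind": "s3", "bucket": parsed.netloc, "key": parsed.path.lstrip("/")}
    # Anything without an s3:// scheme is treated as a local or mounted path here.
    return {"kind": "local", "bucket": None, "key": path}


print(describe_destination("s3://fused-users/fused/aman/path/file.parquet"))
print(describe_destination("/mnt/cache/file.parquet"))
```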