Writing Data
Common examples for writing tabular data in Fused.
Geospatial data?
For geospatial formats (GeoParquet, GeoTIFF, etc.), see Writing Geospatial Data.
CSV
import fused

@fused.udf
def udf():
    import pandas as pd

    df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
    # Write to Fused managed storage; index=False omits the row index column
    df.to_csv("fd://my-data/output.csv", index=False)
    return "Saved!"
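The effect of `index=False` can be checked locally before writing to an `fd://` path. This is a minimal sketch, assuming a local temporary directory stands in for Fused storage; the directory and file name are placeholders for illustration only.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})

# Assumption: a local temp directory stands in for an fd:// destination.
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "output.csv")

# index=False omits the row index, so only col1 and col2 are written.
df.to_csv(csv_path, index=False)

# Read the file back to confirm the columns round-trip unchanged.
df_back = pd.read_csv(csv_path)
print(list(df_back.columns))
```

Without `index=False`, the file would gain an extra unnamed leading column holding the row index, which then shows up as a data column when the CSV is read back.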
Parquet
Parquet is the recommended format for tabular data: it is columnar, compressed, and fast.
import fused

@fused.udf
def udf(path: str = "s3://fused-sample/demo_data/housing_2024.csv"):
    import pandas as pd

    df = pd.read_csv(path)
    # Derive a new column, rounded to two decimal places
    df['price_per_area'] = (df['price'] / df['area']).round(2)
    # Save to your Fused bucket
    username = fused.api.whoami()['handle']
    output_path = f"fd://{username}/housing_processed.parquet"
    df.to_parquet(output_path)
    return f"File saved to {output_path}"
JSON
import fused

@fused.udf
def udf():
    import pandas as pd

    df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
    # Write to Fused managed storage using pandas' default JSON layout
    df.to_json("fd://my-data/output.json")
    return "Saved!"
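The layout of the JSON that pandas writes depends on the `orient` argument of `to_json`. The following sketch compares the default layout with `orient="records"`; it inspects the JSON text in memory rather than writing to Fused storage, so it makes no assumptions about `fd://` paths.

```python
import json

import pandas as pd

df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})

# With no path argument, to_json returns the JSON text instead of writing a file.
by_column = json.loads(df.to_json())                   # default: keyed by column, then row label
by_record = json.loads(df.to_json(orient="records"))   # one object per row

print(by_column)
print(by_record)
```

The record-oriented layout is often easier for downstream consumers that expect a list of row objects, while the default keeps the row labels.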
Where to Save Files
| Storage | Path Format | Best For |
|---|---|---|
| Fused S3 bucket | fd://path/file.parquet | Persistent storage, sharing |
| Mounted disk | /mnt/cache/file.parquet | Temporary files, caching |
| Your own S3 | s3://bucket/file.parquet | Enterprise integration |
See Cloud Storage for details on connecting your own buckets.
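The three path formats in the table can be told apart by their prefixes. As an illustration, here is a small helper that classifies an output path before writing; `storage_kind` is a hypothetical function written for this page, not part of the Fused API.

```python
def storage_kind(path: str) -> str:
    """Classify a path by the prefixes listed in the table above.

    Hypothetical helper for illustration; not part of Fused.
    """
    if path.startswith("fd://"):
        return "Fused S3 bucket"
    if path.startswith("/mnt/cache/"):
        return "Mounted disk"
    if path.startswith("s3://"):
        return "Your own S3"
    return "unknown"


for example in ("fd://path/file.parquet", "/mnt/cache/file.parquet", "s3://bucket/file.parquet"):
    print(example, "->", storage_kind(example))
```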