File systems

Fused provides two file systems to make files accessible to all UDFs: an S3 bucket and a disk. Access is scoped at the organization level.

fd:// S3 bucket

Fused provisions a private S3 bucket namespace for your organization. It's ideal for large-scale, cloud-native, or globally accessible datasets, such as ingested tables, GeoTIFFs, and files that need to be read outside of Fused.

Use the File explorer to browse the bucket and see its full path.

Fused utility functions can reference the bucket with the fd:// alias. For example, to ingest a file and write the output to the bucket:

import fused

# Ingest the zipped shapefile and write the output to the organization's bucket.
job = fused.ingest(
    input="https://www2.census.gov/geo/tiger/TIGER_RD18/STATE/06_CALIFORNIA/06/tl_rd22_06_bg.zip",
    output="fd://census/ca_bg_2022/",
).run_remote()

/mnt/cache disk

/mnt/cache is the path to a mounted disk that stores files shared between UDFs. This is where @fused.cache and fused.download write data. It's ideal for files that UDFs need to read with low latency, downloaded files, the output of cached functions, access keys, .env files, and ML model weights.

UDFs can interact with the disk as they would with a local file system. For example, to list the files in the directory:

import fused

@fused.udf
def udf():
    import os

    # Print every entry at the top level of the shared disk.
    for each in os.listdir('/mnt/cache/'):
        print(each)
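
Because the disk behaves like a local file system, standard Python file APIs should be enough to write a file in one UDF and read it back in another. The following is a minimal sketch using only the standard library. The helper names save_shared_text and load_shared_text, the root parameter, and the example file path are hypothetical and not part of Fused. The sketch assumes the disk is mounted at /mnt/cache as described above.

```python
from pathlib import Path

# Assumption: the shared disk is mounted here, as described above.
SHARED_ROOT = Path("/mnt/cache")


def save_shared_text(relative_path: str, text: str, root: Path = SHARED_ROOT) -> Path:
    """Write text to a file under root, creating parent directories if needed."""
    target = root / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")
    return target


def load_shared_text(relative_path: str, root: Path = SHARED_ROOT) -> str:
    """Read back a text file that was previously written under root."""
    return (root / relative_path).read_text(encoding="utf-8")
```

Inside a UDF, calling save_shared_text("my_project/notes.txt", "hello") with the default root would place the file on the shared disk, and another UDF in the same organization could then call load_shared_text("my_project/notes.txt"). Passing a different root, such as a temporary directory, makes the helpers easy to try outside of Fused.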