Skip to main content

fused.api

Module Functions

The following functions can be called directly from the fused.api module:

import fused.api

fused.api.function_name()

whoami

whoami()

Returns information on the currently logged in user


delete

delete(path: str, max_deletion_depth: int | Literal['unlimited'] = 3) -> bool

Delete the files at the path.

Parameters:

  • path (str) – Directory or file to delete, like fd://my-old-table/
  • max_deletion_depth (int | Literal['unlimited']) – If set (defaults to 3), the maximum depth the operation will recurse to. This option is to help avoid accidentally deleting more data that intended. Pass "unlimited" for unlimited.

Examples:

fused.api.delete("fd://bucket-name/deprecated_table/")

list

list(path: str, *, details: bool = False) -> list[str] | list[ListDetails]

List the files at the path.

Parameters:

  • path (str) – Parent directory URL, like fd://bucket-name/
  • details (bool) – If True, return additional metadata about each record.

Returns:

  • list[str] | list[ListDetails] – A list of paths as URLs, or as metadata objects.

Examples:

fused.api.list("fd://bucket-name/")

get

get(path: str) -> bytes

Download the contents at the path to memory.

Parameters:

  • path (str) – URL to a file, like fd://bucket-name/file.parquet

Returns:

  • bytes – bytes of the file

Examples:

fused.api.get("fd://bucket-name/file.parquet")

download

download(path: str, local_path: str | Path) -> None

Download the contents at the path to disk.

Parameters:

  • path (str) – URL to a file, like fd://bucket-name/file.parquet
  • local_path (str | Path) – Path to a local file.

upload

upload(
local_path: str | Path | bytes | BinaryIO | pd.DataFrame | gpd.GeoDataFrame,
remote_path: str,
timeout: float | None = None,
) -> None

Upload local file to S3.

Parameters:

  • local_path (str | Path | bytes | BinaryIO | pd.DataFrame | gpd.GeoDataFrame) – Either a path to a local file (str, Path), a (Geo)DataFrame (which will get uploaded as Parquet file), or the contents to upload. Any string will be treated as a Path, if you wish to upload the contents of the string, first encode it: s.encode("utf-8")
  • remote_path (str) – URL to upload to, like fd://new-file.txt
  • timeout (float | None) – Optional timeout in seconds for the upload (will default to OPTIONS.request_timeout if not specified).

Examples:

To upload a local json file to your Fused-managed S3 bucket:

fused.api.upload("my_file.json", "fd://my_bucket/my_file.json")

sign_url

sign_url(path: str) -> str

Create a signed URL to access the path.

This function may not check that the file represented by the path exists.

Parameters:

  • path (str) – URL to a file, like fd://bucket-name/file.parquet

Returns:

  • str – HTTPS URL to access the file using signed access.

Examples:

fused.api.sign_url("fd://bucket-name/table_directory/file.parquet")

sign_url_prefix

sign_url_prefix(path: str) -> dict[str, str]

Create signed URLs to access all blobs under the path.

Parameters:

  • path (str) – URL to a prefix, like fd://bucket-name/some_directory/

Returns:

  • dict[str, str] – Dictionary mapping from blob store key to signed HTTPS URL.

Examples:

fused.api.sign_url_prefix("fd://bucket-name/table_directory/")

get_udfs

get_udfs(
n: int | None = None,
*,
skip: int = 0,
by: Literal["name", "id", "slug"] = "name",
whose: Literal["self", "public", "community", "team"] = "self"
) -> dict

Fetches a list of UDFs.

Parameters:

  • n (int | None) – The total number of UDFs to fetch. Defaults to All.
  • skip (int) – The number of UDFs to skip before starting to collect the result set. Defaults to 0.
  • by (Literal['name', 'id', 'slug']) – The attribute by which to sort the UDFs. Can be "name", "id", or "slug". Defaults to "name".
  • whose (Literal['self', 'public', 'community', 'team']) – Specifies whose UDFs to fetch. Can be "self" for the user's own UDFs or "public" for UDFs available publicly or "community" for all community UDFs. Defaults to "self".

Returns:

  • dict – A list of UDFs.

Examples:

Fetch UDFs under the user account:

fused.api.get_udfs()

job_get_logs

job_get_logs(job: CoerceableToJobId, since_ms: int | None = None) -> list[Any]

Fetch logs for a job

Parameters:

  • job (CoerceableToJobId) – the identifier of a job or a RunResponse object.
  • since_ms (int | None) – Timestamp, in milliseconds since epoch, to get logs for. Defaults to None for all logs.

Returns:

  • list[Any] – Log messages for the given job.

job_print_logs

job_print_logs(
job: CoerceableToJobId, since_ms: int | None = None, file: IO | None = None
) -> None

Fetch and print logs for a job

Parameters:

  • job (CoerceableToJobId) – the identifier of a job or a RunResponse object.
  • since_ms (int | None) – Timestamp, in milliseconds since epoch, to get logs for. Defaults to None for all logs.
  • file (IO | None) – Where to print logs to. Defaults to sys.stdout.

Returns:

  • None – None

job_tail_logs

job_tail_logs(
job: CoerceableToJobId,
refresh_seconds: float = 1,
sample_logs: bool = True,
timeout: float | None = None,
)

Continuously print logs for a job

Parameters:

  • job (CoerceableToJobId) – the identifier of a job or a RunResponse object.
  • refresh_seconds (float) – how frequently, in seconds, to check for new logs. Defaults to 1.
  • sample_logs (bool) – if true, print out only a sample of logs. Defaults to True.
  • timeout (float | None) – if not None, how long to continue tailing logs for. Defaults to None for indefinite.

job_get_status

job_get_status(job: CoerceableToJobId) -> RunResponse

Fetch the status of a running job

Parameters:

  • job (CoerceableToJobId) – the identifier of a job or a RunResponse object.

Returns:

  • RunResponse – The status of the given job.

job_cancel

job_cancel(job: CoerceableToJobId) -> RunResponse

Cancel an existing job

Parameters:

  • job (CoerceableToJobId) – the identifier of a job or a RunResponse object.

Returns:

  • RunResponse – A new job object.

job_get_exec_time

job_get_exec_time(job: CoerceableToJobId) -> timedelta

Determine the execution time of this job, using the logs.

Returns:

  • timedelta – Time the job took. If the job is in progress, time from first to last log message is returned.

job_wait_for_job

job_wait_for_job(
job: CoerceableToJobId,
poll_interval_seconds: float = 5,
timeout: float | None = None,
) -> RunResponse

Block the Python kernel until this job has finished

Parameters:

  • poll_interval_seconds (float) – How often (in seconds) to poll for status updates. Defaults to 5.
  • timeout (float | None) – The length of time in seconds to wait for the job. Defaults to None.

Raises:

  • TimeoutError – if waiting for the job timed out.

Returns:

  • RunResponse – The status of the given job.

access_token

access_token() -> str

Get an access token for the Fused service.

Returns the Bearer token for the authenticated user. Use this for authenticating API requests outside of the Fused SDK.

Returns:

  • str – The access token string.

logout

logout()

Log out the current user.

Deletes the credentials saved to disk and resets the global Fused API.


team_info

team_info() -> dict

Get information about the current user's team.

Returns:

  • dict – Team information including name, members, and settings.

schedule_udf

schedule_udf(
udf: BaseUdf | str,
minute: list[int] | int,
hour: list[int] | int,
day_of_month: list[int] | int | None = None,
month: list[int] | int | None = None,
day_of_week: list[int] | int | None = None,
udf_args: dict[str, Any] | None = None,
enabled: bool = True,
**kwargs
) -> CronJob

Schedule a UDF to run on a cron schedule.

Parameters:

  • udf (BaseUdf | str) – The UDF to schedule, either as a UDF object or name.
  • minute (list[int] | int) – Minute(s) to run (0-59).
  • hour (list[int] | int) – Hour(s) to run (0-23).
  • day_of_month (list[int] | int | None) – Day(s) of month to run (1-31).
  • month (list[int] | int | None) – Month(s) to run (1-12).
  • day_of_week (list[int] | int | None) – Day(s) of week to run (0-6, where 0 is Sunday).
  • udf_args (dict[str, Any] | None) – Arguments to pass to the UDF.
  • enabled (bool) – Whether the schedule is enabled. Defaults to True.

Returns:

  • CronJob – The created cron job object.

Example:

# Run a UDF every day at 8:00 AM
fused.api.schedule_udf(
udf="my_udf",
minute=0,
hour=8
)

# Run every Monday and Friday at noon
fused.api.schedule_udf(
udf=my_udf,
minute=0,
hour=12,
day_of_week=[1, 5]
)

schedule_list

schedule_list()

List all cron jobs scheduled for the current user.

Returns:

  • List of CronJob objects.

get_apps

get_apps(
n: int | None = None,
*,
skip: int = 0,
by: Literal['name', 'id', 'slug'] = 'name',
whose: Literal['self', 'public'] = 'self'
) -> dict

Get apps for the current user or public apps.

Parameters:

  • n (int | None) – Maximum number of apps to return. None for all.
  • skip (int) – Number of apps to skip.
  • by (Literal['name', 'id', 'slug']) – Sort order.
  • whose (Literal['self', 'public']) – Whose apps to get.

Returns:

  • dict – Dictionary of apps.

resolve

resolve(path: str) -> str

Resolve a path from fd:// to the full S3 URI.

Parameters:

  • path (str) – The path to resolve (e.g., fd://my-bucket/data/).

Returns:

  • str – The resolved S3 path.

Example:

s3_path = fused.api.resolve("fd://my-data/output.parquet")
# Returns: "s3://fused-users/my-team/my-data/output.parquet"

enable_gcs

enable_gcs()

Save GCS credentials from AWS secret manager into a temporary local file and set its path to the environment variable.

Use this to enable Google Cloud Storage access in your UDFs.


job_get_results

job_get_results(job: CoerceableToJobId) -> list[Any]

Get the results of a completed job.

Parameters:

  • job (CoerceableToJobId) – The identifier of a job or a RunResponse object.

Returns:

  • list[Any] – List of results from the job.

job_wait_for_results

job_wait_for_results(
job: CoerceableToJobId,
poll_interval_seconds: float = 5,
timeout: float | None = None
) -> list[UdfEvaluationResult]

Block until a job completes and return its results.

Combines job_wait_for_job and job_get_results into a single call.

Parameters:

  • job (CoerceableToJobId) – The identifier of a job or a RunResponse object.
  • poll_interval_seconds (float) – How often to poll for status. Defaults to 5.
  • timeout (float | None) – Maximum time to wait in seconds. None for no timeout.

Returns:

  • list[UdfEvaluationResult] – List of UDF evaluation results.

FusedAPI Class Methods

The following methods require creating a FusedAPI instance first:

from fused.api import FusedAPI
api = FusedAPI()
api.method_name()

FusedAPI

FusedAPI(
*,
base_url: str | None = None,
shared_udf_base_url: str | None = None,
set_global_api: bool = True,
credentials_needed: bool = True
)

API for running jobs in the Fused service.

Create a FusedAPI instance.

Other Parameters:

  • base_url (str | None) – The Fused instance to send requests to. Defaults to https://www.fused.io/server/v1.
  • shared_udf_base_url (str | None) – The shared UDF instance to send requests to. Defaults to https://www.udf.ai.
  • set_global_api (bool) – Set this as the global API object. Defaults to True.
  • credentials_needed (bool) – If True, automatically attempt to log in. Defaults to True.

create_udf_access_token

create_udf_access_token(
udf_email_or_name_or_id: str | None = None,
/,
udf_name: str | None = None,
*,
udf_email: str | None = None,
udf_id: str | None = None,
client_id: str | Ellipsis | None = ...,
public_read: bool | None = None,
access_scope: str | None = None,
cache: bool = True,
metadata_json: dict[str, Any] | None = None,
enabled: bool = True,
) -> UdfAccessToken

Create a token for running a UDF. The token allows anyone who has it to run the UDF, with the parameters they choose. The UDF will run under your environment.

The token does not allow running any other UDF on your account.

Parameters:

  • udf_email_or_name_or_id (str | None) – A UDF ID, email address (for use with udf_name), or UDF name.
  • udf_name (str | None) – The name of the UDF to create the token for.

Other Parameters:

  • udf_email (str | None) – The email of the user owning the UDF, or, if udf_name is None, the name of the UDF.
  • udf_id (str | None) – The backend ID of the UDF to create the token for.
  • client_id (str | Ellipsis | None) – If specified, overrides which realtime environment to run the UDF under.
  • cache (bool) – If True, UDF tiles will be cached.
  • metadata_json (dict[str, Any] | None) – Additional metadata to serve as part of the tiles metadata.json.
  • enabled (bool) – If True, the token can be used.

Usage: from fused.api import FusedAPI; api = FusedAPI(); api.create_udf_access_token()


upload

upload(
path: str,
data: bytes | BinaryIO,
client_id: str | None = None,
timeout: float | None = None,
) -> None

Upload a binary blob to a cloud location

Usage: from fused.api import FusedAPI; api = FusedAPI(); api.upload()


start_job

start_job(
config: JobConfig | JobStepConfig,
*,
instance_type: WHITELISTED_INSTANCE_TYPES | None = None,
region: str | None = None,
disk_size_gb: int | None = None,
additional_env: Sequence[str] | None = ("FUSED_CREDENTIAL_PROVIDER=ec2",),
image_name: str | None = None,
send_status_email: bool | None = None,
cache_max_age: int | None = None
) -> RunResponse

Execute an operation

Parameters:

  • config (JobConfig | JobStepConfig) – the configuration object to run in the job.

Other Parameters:

  • instance_type (WHITELISTED_INSTANCE_TYPES | None) – The AWS EC2 instance type to use for the job. Acceptable strings are "m5.large", "m5.xlarge", "m5.2xlarge", "m5.4xlarge", "r5.large", "r5.xlarge", "r5.2xlarge", "r5.4xlarge". Defaults to None.
  • region (str | None) – The AWS region in which to run. Defaults to None.
  • disk_size_gb (int | None) – The disk size to specify for the job. Defaults to None.
  • additional_env (Sequence[str] | None) – Any additional environment variables to be passed into the job, each in the form KEY=value. Defaults to None.
  • image_name (str | None) – Custom image name to run. Defaults to None for default image.
  • send_status_email (bool | None) – Whether to send a status email to the user when the job is complete.

Usage: from fused.api import FusedAPI; api = FusedAPI(); api.start_job()


get_jobs

get_jobs(
n: int = 5,
*,
skip: int = 0,
per_request: int = 25,
max_requests: int | None = 1
) -> Jobs

Get the job history.

Parameters:

  • n (int) – The number of jobs to fetch. Defaults to 5.

Other Parameters:

  • skip (int) – Where in the job history to begin. Defaults to 0, which retrieves the most recent job.
  • per_request (int) – Number of jobs per request to fetch. Defaults to 25.
  • max_requests (int | None) – Maximum number of requests to make. May be None to fetch all jobs. Defaults to 1.

Returns:

  • Jobs – The job history.

Usage: from fused.api import FusedAPI; api = FusedAPI(); api.get_jobs()


get_status

get_status(job: CoerceableToJobId) -> RunResponse

Fetch the status of a running job

Parameters:

  • job (CoerceableToJobId) – the identifier of a job or a RunResponse object.

Returns:

  • RunResponse – The status of the given job.

Usage: from fused.api import FusedAPI; api = FusedAPI(); api.get_status()


get_logs

get_logs(job: CoerceableToJobId, since_ms: int | None = None) -> list[Any]

Fetch logs for a job

Parameters:

  • job (CoerceableToJobId) – the identifier of a job or a RunResponse object.
  • since_ms (int | None) – Timestamp, in milliseconds since epoch, to get logs for. Defaults to None for all logs.

Returns:

  • list[Any] – Log messages for the given job.

Usage: from fused.api import FusedAPI; api = FusedAPI(); api.get_logs()


tail_logs

tail_logs(
job: CoerceableToJobId,
refresh_seconds: float = 1,
sample_logs: bool = False,
timeout: float | None = None,
)

Continuously print logs for a job

Parameters:

  • job (CoerceableToJobId) – the identifier of a job or a RunResponse object.
  • refresh_seconds (float) – how frequently, in seconds, to check for new logs. Defaults to 1.
  • sample_logs (bool) – if true, print out only a sample of logs. Defaults to False.
  • timeout (float | None) – if not None, how long to continue tailing logs for. Defaults to None for indefinite.

Usage: from fused.api import FusedAPI; api = FusedAPI(); api.tail_logs()


wait_for_job

wait_for_job(
job: CoerceableToJobId,
poll_interval_seconds: float = 5,
timeout: float | None = None,
) -> RunResponse

Block the Python kernel until the given job has finished

Parameters:

  • job (CoerceableToJobId) – the identifier of a job or a RunResponse object.
  • poll_interval_seconds (float) – How often (in seconds) to poll for status updates. Defaults to 5.
  • timeout (float | None) – The length of time in seconds to wait for the job. Defaults to None.

Raises:

  • TimeoutError – if waiting for the job timed out.

Returns:

  • RunResponse – The status of the given job.

Usage: from fused.api import FusedAPI; api = FusedAPI(); api.wait_for_job()


cancel_job

cancel_job(job: CoerceableToJobId) -> RunResponse

Cancel an existing job

Parameters:

  • job (CoerceableToJobId) – the identifier of a job or a RunResponse object.

Returns:

  • RunResponse – A new job object.

Usage: from fused.api import FusedAPI; api = FusedAPI(); api.cancel_job()


auth_token

auth_token() -> str

Returns the current user's Fused environment (team) auth token

Usage: from fused.api import FusedAPI; api = FusedAPI(); api.auth_token()