find_dataset

Find the dataset that contains a specific location URL.

fused.find_dataset(
    location: str,
    base_url: str | None = None,
) -> dict[str, Any]

The lookup uses hierarchical prefix matching: if the exact path isn't registered, it tries progressively shorter path prefixes until it finds the dataset that contains the location.
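The idea of prefix matching can be illustrated with a small local sketch. This is not the actual implementation behind find_dataset, which runs against the API; the helper names candidate_prefixes and find_containing and the in-memory registry dict are hypothetical, and the sketch assumes registered locations are stored without a trailing slash.

```python
from typing import Any, Optional


def candidate_prefixes(location: str) -> list[str]:
    """Return the location followed by progressively shorter path prefixes."""
    scheme, sep, rest = location.partition("://")
    if not sep:
        # No scheme present: treat the whole string as a path.
        scheme, rest = "", location
    parts = rest.rstrip("/").split("/")
    head = scheme + sep
    # Longest candidate first, shortest (the bucket or host) last.
    return [head + "/".join(parts[:i]) for i in range(len(parts), 0, -1)]


def find_containing(
    location: str, registry: dict[str, dict[str, Any]]
) -> Optional[dict[str, Any]]:
    """Return the first registered dataset whose location is a prefix match."""
    for prefix in candidate_prefixes(location):
        if prefix in registry:
            return registry[prefix]
    return None


# Hypothetical registry with a single dataset registered at a directory.
registry = {"s3://bucket/data": {"id": "ds1", "location": "s3://bucket/data"}}
match = find_containing("s3://bucket/data/year=2024/file.parquet", registry)
print(match)
```

Trying the longest prefix first means the most specific registered dataset wins when datasets are nested.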

Parameters

  • location (str) – Full URL to search for (s3://, gs://, http://, etc.). Can be a file, partition, or directory path.
  • base_url (str | None) – Base URL of the API. If None, the current environment's URL is used.

Returns

  • dict – Dataset dict with keys: id, location, description, storage_type, owner, public, created_at, updated_at, etc.

Raises

  • requests.HTTPError – If no dataset is found (404) or the request fails.
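Because a missing dataset surfaces as an exception rather than a None return, callers may want to tell "not found" apart from other failures. A minimal sketch follows, assuming the raised requests.HTTPError carries the HTTP response. The wrapper name find_or_none is hypothetical, and the lookup function is passed in as an argument so the sketch does not depend on any particular environment.

```python
from typing import Any, Callable, Optional

import requests


def find_or_none(
    location: str, finder: Callable[[str], dict[str, Any]]
) -> Optional[dict[str, Any]]:
    """Call finder(location); return None on a 404, re-raise anything else."""
    try:
        return finder(location)
    except requests.HTTPError as err:
        # Compare against None explicitly: a Response with an error status is falsy.
        status = err.response.status_code if err.response is not None else None
        if status == 404:
            return None
        raise
```

In practice the finder would be fused.find_dataset, for example find_or_none("s3://bucket/data/year=2024/file.parquet", fused.find_dataset).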

Example

import fused

# Find dataset containing a file
dataset = fused.find_dataset("s3://bucket/data/year=2024/file.parquet")
print(f"Found dataset: {dataset['location']}")

See also