find_dataset
Find the dataset that contains a specific location URL.
fused.find_dataset(
    location: str,
    base_url: str | None = None
) -> dict[str, Any]
Uses hierarchical prefix matching: if the exact path isn't registered, searches progressively shorter prefixes to find the containing dataset.
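The lookup order described above can be sketched in a few lines. This is only an illustration of progressively shorter prefix matching, not the service's actual implementation: the in-memory `REGISTRY` and the helpers `candidate_prefixes` and `find_containing` are hypothetical names introduced here, and the real matching happens behind the API.

```python
from typing import Optional

# Hypothetical in-memory registry keyed by registered location.
# The real registry lives behind the API; this only illustrates the lookup order.
REGISTRY = {
    "s3://bucket/data": {"id": "ds-1", "location": "s3://bucket/data"},
}


def candidate_prefixes(location: str) -> list[str]:
    """Return the full location followed by progressively shorter parent paths.

    Assumes `location` is a URL with a scheme (e.g. "s3://...").
    """
    scheme, sep, rest = location.partition("://")
    parts = [p for p in rest.split("/") if p]
    # Longest candidate first, shortest (just the bucket/host) last.
    return [scheme + sep + "/".join(parts[:n]) for n in range(len(parts), 0, -1)]


def find_containing(location: str, registry: dict) -> Optional[dict]:
    """Return the first registered dataset whose location is a prefix of `location`."""
    for prefix in candidate_prefixes(location):
        if prefix in registry:
            return registry[prefix]
    return None
```

With this registry, `find_containing("s3://bucket/data/year=2024/file.parquet", REGISTRY)` tries the full file path first, then `s3://bucket/data/year=2024`, and matches at `s3://bucket/data`.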
Parameters
- location (str) – Full URL to search for (s3://, gs://, http://, etc.). Can be a file, partition, or directory path.
- base_url (str | None) – Base URL for API. If None, uses current environment.
Returns
dict – Dataset dict with keys: id, location, description, storage_type, owner, public, created_at, updated_at, etc.
Raises
requests.HTTPError – If dataset not found (404) or request fails.
Example
import fused
# Find dataset containing a file
dataset = fused.find_dataset("s3://bucket/data/year=2024/file.parquet")
print(f"Found dataset: {dataset['location']}")
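Because a missing dataset and a failed request both raise requests.HTTPError, callers may want to tell the two apart. The wrapper below is one possible sketch. `find_dataset_or_none` is a hypothetical helper, not part of fused, and it takes the lookup function as an argument so it can be exercised without network access. It assumes the raised error exposes the failed response as `exc.response`, as requests.HTTPError does.

```python
from typing import Any, Callable, Optional


def find_dataset_or_none(
    location: str,
    finder: Callable[[str], dict[str, Any]],
) -> Optional[dict[str, Any]]:
    """Return the containing dataset, or None when the lookup reports a 404.

    `finder` is expected to be fused.find_dataset (or a stand-in for testing).
    """
    try:
        return finder(location)
    except Exception as exc:
        # requests.HTTPError carries the failed response on `exc.response`.
        response = getattr(exc, "response", None)
        if getattr(response, "status_code", None) == 404:
            return None  # no registered dataset contains this location
        raise  # any other failure is propagated unchanged
```

Calling `find_dataset_or_none("s3://bucket/data/year=2024/file.parquet", fused.find_dataset)` would then return either the dataset dict or None, while still raising on other HTTP failures.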
See also
fused.register_dataset – Register a new dataset