register_dataset
Register a dataset for indexed queries.
fused.register_dataset(
dataset_path: str,
base_url: str | None = None
) -> dict[str, Any]
This function registers a directory in your file storage as a dataset, enabling fast geospatial queries using H3 indexing.
Parameters
- dataset_path (str) – Path to the dataset directory. The path should point to a directory containing Parquet files.
- base_url (str | None) – Base URL for the API. If None, uses the current environment.
Returns
dict – Dictionary with registration results:
- dataset_id: ID of the created/updated dataset
- location: Normalized URL of the dataset
- visit_status: Status of the dataset visit (success/timeout/error)
- items_discovered: Total number of items found
- new_items: Number of new items added
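As a minimal sketch of how the returned dictionary might be inspected, the helper below builds a one-line summary from the documented keys. The name `summarize_registration` is hypothetical and not part of `fused`. The sketch assumes `visit_status` holds one of the strings listed above and treats anything other than "success" as incomplete.

```python
from typing import Any


def summarize_registration(result: dict[str, Any]) -> str:
    """Build a one-line summary from a register_dataset result dict.

    Hypothetical helper, not part of fused. It relies only on the keys
    documented under Returns.
    """
    status = result.get("visit_status")
    if status != "success":
        # "timeout" and "error" are the other documented statuses
        return f"Dataset {result.get('dataset_id')}: visit ended with status {status!r}"
    return (
        f"Dataset {result['dataset_id']} at {result['location']}: "
        f"{result['items_discovered']} items found, {result['new_items']} new"
    )
```

In practice the argument would be the value returned by `fused.register_dataset(...)`; the counts are only reported when the visit succeeded, since a timed-out or failed visit may not have discovered every item.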
Raises
requests.HTTPError – If the API request fails.
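One way to handle this exception is sketched below. The wrapper `try_register` is a hypothetical helper, not part of `fused`. It takes the registration function as an argument so the error path can be exercised without a live API, and it assumes the raised `requests.HTTPError` may or may not carry a response object.

```python
import requests


def try_register(register, dataset_path):
    """Call register(dataset_path) and return (result, error_message).

    Hypothetical helper, not part of fused. Exactly one element of the
    returned pair is None.
    """
    try:
        return register(dataset_path), None
    except requests.HTTPError as exc:
        # exc.response is None when the error was raised without a response attached
        status = exc.response.status_code if exc.response is not None else "unknown"
        return None, f"registration failed (HTTP status: {status}): {exc}"
```

A call such as `result, err = try_register(fused.register_dataset, "s3://my-bucket/my-data/buildings/")` would then let the caller branch on `err` instead of wrapping every registration in its own try block.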
Example
import fused
# Register a dataset from your storage
result = fused.register_dataset("s3://my-bucket/my-data/buildings/")
print(f"Registered dataset with ID: {result['dataset_id']}")
print(f"Found {result['items_discovered']} files")
Note
- Regular users can use any storage paths they have access to
- Datasets are registered as private (only accessible to your team)
- Files are automatically queued for metadata extraction
See also
fused.find_dataset – Find a registered dataset