JobPool
JobPool
The JobPool
class is used to manage, inspect and retrieve results from
submitted jobs from fused.submit()
.
cancel
cancel(wait: bool = False)
Cancel any pending (not running) tasks.
Note it will not be possible to retry on the same JobPool later.
retry
retry()
Rerun any tasks in error or timeout states. Tasks are rerun in the same pool.
total_time
total_time(since_retry: bool = False) -> timedelta
Returns how long the entire job took.
If only partial results are available, returns based on the last task to have been completed.
times
times() -> list[Optional[timedelta]]
Time taken for each task.
Incomplete tasks will be reported as None.
done
done() -> bool
True if all tasks have finished, regardless of success or failure.
all_succeeded
all_succeeded() -> bool
True if all tasks finished with success
any_failed
any_failed() -> bool
True if any task finished with an error
any_succeeded
any_succeeded() -> bool
True if any task finished with success
arg_df
arg_df()
The arguments passed to runs as a DataFrame
status
status()
Return a Series indexed by status of task counts
wait
wait()
Wait until all jobs are finished
Use fused.options.show.enable_tqdm to enable/disable tqdm. Use pool._wait_sleep to set if sleep should occur while waiting.
tail
tail(stop_on_exception = False)
Wait until all jobs are finished, printing statuses as they become available.
This is useful for interactively watching for the state of the pool.
Use pool._wait_sleep to set if sleep should occur while waiting.
results
results(return_exceptions = False) -> List[Any]
Retrieve all results of the job.
Results are ordered by the order of the args list.
results_now
results_now(return_exceptions = False) -> Dict[int, Any]
Retrieve the results that are currently done.
Results are indexed by position in the args list.
df
df(
status_column: Optional[str] = "status",
result_column: Optional[str] = "result",
time_column: Optional[str] = "time",
logs_column: Optional[str] = "logs",
exception_column: Optional[str] = None,
include_exceptions: bool = True,
)
Get a DataFrame of results as they are currently.
The DataFrame will have columns for each argument passed, and columns for:
status
, result
, time
, logs
and optionally exception
.
get_status_df
get_status_df()
get_results_df
get_results_df(ignore_exceptions = False)
errors
errors() -> Dict[int, Exception]
Retrieve the results that are currently done and are errors.
Results are indexed by position in the args list.
first_error
first_error() -> Optional[Exception]
Retrieve the first (by order of arguments) error result, or None.
logs
logs() -> list[Optional[str]]
Logs for each task.
Incomplete tasks will be reported as None.
first_log
first_log() -> Optional[str]
Retrieve the first (by order of arguments) logs, or None.
success
success() -> Dict[int, Any]
Retrieve the results that are currently done and are successful.
Results are indexed by position in the args list.
pending
pending() -> Dict[int, Any]
Retrieve the arguments that are currently pending and not yet submitted.
running
running() -> Dict[int, Any]
Retrieve the results that are currently running.
cancelled
cancelled() -> Dict[int, Any]
Retrieve the arguments that were cancelled and not run.
collect
collect(ignore_exceptions = False, flatten = True)
Collect all results into a DataFrame