Cache decorator
Caching stores the results of slow function calls so that each call only needs to run once. Cached objects persist across reruns, which makes UDFs faster.
Basic
To cache a function, decorate it with @fused.cache.
import fused

@fused.udf
def udf():
    import pandas as pd

    # Results are cached per combination of function code and arguments.
    @fused.cache
    def load_data(i):
        return pd.DataFrame({'id': [i]})

    df_first = load_data(i=1)
    df_second = load_data(i=2)
    return pd.concat([df_first, df_second])
The first time Fused sees a given combination of function code and parameters, it runs the function and stores the return value in a cache. The next time the function is called with the same code and parameters, Fused skips running it and returns the cached value.
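The lookup described above can be illustrated with a minimal sketch. It is not the Fused implementation: `simple_cache` is a hypothetical in-memory stand-in, and using the function's bytecode as a proxy for "function code" is an assumption made to keep the example short.

```python
import functools
import hashlib
import pickle

def simple_cache(fn):
    """Hypothetical code-and-argument keyed cache (not the Fused implementation)."""
    store = {}  # in-memory only; a persistent cache would write to disk instead

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        # Key = hash of the function's bytecode plus its pickled arguments.
        payload = fn.__code__.co_code + pickle.dumps((args, sorted(kwargs.items())))
        key = hashlib.sha256(payload).hexdigest()
        if key not in store:
            store[key] = fn(*args, **kwargs)  # cache miss: run the function
        return store[key]  # return the stored value

    wrapper.store = store  # exposed for inspection in this demo
    return wrapper

calls = []

@simple_cache
def load_data(i):
    calls.append(i)  # record each real execution
    return {"id": [i]}

load_data(i=1)
load_data(i=1)  # same code and arguments: served from the cache
load_data(i=2)  # new argument: runs again
```

Here `load_data` executes once per distinct argument, so `calls` records only two real executions for the three calls.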
Advanced
Pass bbox to make the output unique to each Tile.
import fused

@fused.udf
def udf(bbox: fused.types.TileGDF = None):

    # bbox is part of the arguments, so each Tile gets its own cache entry.
    @fused.cache
    def fn(bbox):
        return bbox

    return fn(bbox)
Set a custom cache directory with the optional path parameter.
import fused

@fused.udf
def udf():
    import pandas as pd

    # Store cached results under a custom directory.
    @fused.cache(path='optional_cache_dir')
    def fn():
        return pd.DataFrame()

    return fn()
Reset the cache by running the function once with reset=True.
import fused

@fused.udf
def udf():
    import pandas as pd

    # reset=True forces the function to run again and refresh the cached value.
    @fused.cache(reset=True)
    def fn():
        return pd.DataFrame()

    return fn()
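To make the effect of a cache directory and a reset concrete, here is a rough file-backed sketch. `disk_cache` is a hypothetical helper whose `path` and `reset` arguments only mimic the options described above. How Fused actually lays out or invalidates its cache is not shown here.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path

def disk_cache(path, reset=False):
    """Hypothetical file-backed cache; not the Fused implementation."""
    cache_dir = Path(path)
    cache_dir.mkdir(parents=True, exist_ok=True)

    def decorator(fn):
        def wrapper(*args, **kwargs):
            # One file per combination of function bytecode and arguments.
            payload = fn.__code__.co_code + pickle.dumps((args, sorted(kwargs.items())))
            entry = cache_dir / (hashlib.sha256(payload).hexdigest() + ".pkl")
            if entry.exists() and not reset:
                return pickle.loads(entry.read_bytes())  # reuse the stored value
            result = fn(*args, **kwargs)  # cache miss or forced reset: recompute
            entry.write_bytes(pickle.dumps(result))  # overwrite any stale entry
            return result
        return wrapper
    return decorator

runs = []

def fn():
    runs.append("ran")  # record each real execution
    return 42

cache_dir = tempfile.mkdtemp()  # throwaway directory for the demo
first = disk_cache(cache_dir)(fn)()              # miss: runs fn
second = disk_cache(cache_dir)(fn)()             # hit: read from disk
third = disk_cache(cache_dir, reset=True)(fn)()  # reset: runs fn again
```

In this sketch a reset simply bypasses the stored file and overwrites it, so later calls without `reset=True` pick up the refreshed value.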