Skip to main content

Cache decorator

Caching stores the result of slow function calls so they only need to run once. This persists objects across reruns and makes UDFs faster.

Basic

To cache a function decorate it with @fused.cache.

@fused.udf
def udf():
import pandas as pd

@fused.cache
def load_data(i):
return pd.DataFrame({'id': [i]})

df_first = load_data(i=1)
df_second = load_data(i=2)
return pd.concat([df_first, df_second])

The first time Fused sees the function code and parameters, Fused runs the function and stores the return value in a cache. The next time the function is called with the same parameters and code, Fused skips running the function and returns the cached value.

Advanced

Pass bbox to make the output unique to each Tile.

@fused.udf
def udf(bbox: fused.types.TileGDF=None):

@fused.cache
def fn(bbox):
return bbox

return fn(bbox)

Set a custom cache directory with the optional path parameter.

@fused.udf
def udf():
import pandas as pd

@fused.cache(path='optional_cache_dir')
def fn():
return pd.DataFrame()

return fn()

Reset the cache by running the function once with reset=True.

@fused.udf
def udf():
import pandas as pd

@fused.cache(reset=True)
def fn():
return pd.DataFrame()

return fn()