Raster to H3: A Deep Dive

Converting raster data to the H3 grid system bins pixel values into hexagonal cells, so that a raster can be aggregated, compared, and joined with other datasets by cell index. The conversion described here uses DuckDB to aggregate NumPy arrays by H3 index, which keeps raster analysis efficient even for large inputs that are processed in chunks.


Earth Observation imagery analysis

  • Agricultural parcels and field-level data
  • Global environmental
  • Land cover and land use change detection

Implementing Raster to H3

Implementation steps

  1. Load and chunk the raster into manageable parts
  2. Optionally coarsen the data to reduce resolution and speed up processing
  3. Convert each pixel to a point (latitude, longitude) and assign it to the H3 cell that contains it
  4. Aggregate the data by H3 indices
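
Steps 1 and 2 can be sketched in plain Python. This is a minimal illustration, not the code behind the utilities used in the example UDF. The helper names chunk_window and coarsen are hypothetical, the row-by-row chunk numbering is an assumption, and the 40000 x 40000 raster size is made up for the example.

```python
def chunk_window(width, height, chunk_id, x_chunks, y_chunks):
    """Return (col_start, col_stop, row_start, row_stop) for one chunk.

    Chunks are numbered row by row, so chunk_id 0 is the top-left chunk.
    This numbering is an assumption made for illustration.
    """
    if not 0 <= chunk_id < x_chunks * y_chunks:
        raise ValueError("chunk_id out of range")
    row, col = divmod(chunk_id, x_chunks)
    # Integer arithmetic spreads any remainder pixels across the chunks.
    col_start = col * width // x_chunks
    col_stop = (col + 1) * width // x_chunks
    row_start = row * height // y_chunks
    row_stop = (row + 1) * height // y_chunks
    return col_start, col_stop, row_start, row_stop


def coarsen(grid, factor):
    """Average non-overlapping factor x factor blocks of a 2D list.

    Trailing rows and columns that do not fill a whole block are dropped.
    """
    n_rows = len(grid) // factor
    n_cols = len(grid[0]) // factor
    out = []
    for r in range(n_rows):
        out_row = []
        for c in range(n_cols):
            block = [
                grid[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


# Hypothetical 40000 x 40000 raster split into 20 x 40 chunks.
window = chunk_window(40000, 40000, chunk_id=0, x_chunks=20, y_chunks=40)

# Coarsen a small 4 x 4 grid by a factor of 2.
grid = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
    [3, 3, 4, 4],
]
coarse = coarsen(grid, 2)
```

Coarsening by a factor of 2 leaves a quarter of the pixels, so the binning and aggregation steps have a quarter of the points to process, at the cost of spatial detail.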

Example UDF

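import fused

@fused.udf
def udf(
    tiff_path: str = "s3://fused-asset/gfc2020/JRC_GFC2020_V1_S10_W40.tif",
    chunk_id: int = 0,
    x_chunks: int = 20,
    y_chunks: int = 40,
    h3_size: int = 6,  # H3 resolution used in the query; this default is an assumption
):
    import geopandas as gpd
    import pandas as pd
    from shapely.geometry import box

    utils = fused.load("").utils

    # Read one chunk of the GeoTIFF as a table of points (lat, lng, data)
    df_tiff = utils.chunked_tiff_to_points(tiff_path, i=chunk_id, x_chunks=x_chunks, y_chunks=y_chunks)

    # Bin each point to an H3 cell and collect the pixel values per cell
    qr = f"""
        SELECT
            h3_latlng_to_cell(lat, lng, {h3_size}) AS hex,
            AVG(lat) AS lat,
            AVG(lng) AS lng,
            ARRAY_AGG(data) AS agg_data
        FROM df_tiff
        GROUP BY 1
    """

    df = utils.run_query(qr, return_arrow=True)
    df = df.to_pandas()
    # Sum the pixel values collected for each cell
    df["agg_data"] = df["agg_data"].map(lambda x: pd.Series(x).sum())
    # Convert the integer H3 index to its hexadecimal string form
    df["hex"] = df["hex"].map(lambda x: hex(x)[2:])
    # Scale each cell's sum to 0-100 relative to the largest sum in the chunk
    df["metric"] = df.agg_data / df.agg_data.max() * 100
    gdf = utils.df_to_gdf(df)
    return gdf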
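
The last steps of the UDF (summing each cell's values, converting the integer index to a hexadecimal string, and scaling to a 0-100 metric) can be tried without Fused or DuckDB. The sketch below mirrors those steps in plain Python. The function name summarize and the input rows are hypothetical, and the integer cell indices are illustrative values only.

```python
# Hypothetical query result: one row per cell, with the cell index as an
# integer and the list of pixel values that were binned into that cell.
rows = [
    {"hex": 0x8928308280FFFFF, "agg_data": [1, 0, 1, 1]},
    {"hex": 0x8928308280BFFFF, "agg_data": [1, 1, 1, 1, 1, 1]},
]


def summarize(rows):
    """Sum each cell's values and scale the sums to a 0-100 metric."""
    totals = [sum(r["agg_data"]) for r in rows]
    max_total = max(totals) if totals else 0
    out = []
    for r, total in zip(rows, totals):
        out.append({
            # format(n, "x") gives the same string as hex(n)[2:]
            # for non-negative integers.
            "hex": format(r["hex"], "x"),
            "agg_data": total,
            # Guard against division by zero when every cell sums to 0.
            "metric": total / max_total * 100 if max_total else 0.0,
        })
    return out


result = summarize(rows)
```

Because the metric is relative to the largest sum in the input, values computed for different chunks are not directly comparable unless they are rescaled against a shared maximum.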

Demo app [beta]