Skip to main content

Point in Polygon: A Guide With Overture & NSI

Point in polygon operations ask whether a given set of points in a map falls within a polygon. This technique is commonly used to enrich data from one dataset with information from another.

This walkthrough shows how to enrich Overture Maps building polygons with hazard metadata from the National Structure Inventory (NSI) using point in polygon analysis. The Overture Maps buildings footprints is a Polygon dataset while NSI is a Point dataset with metadata related to natural and man-made hazards. The two can be joined to visualize buildings at risk from hazards.

Furthermore, features in the Overture dataset are assigned Global Entity Reference System (GERS) IDs, which is a universal reference framework to match features across datasets. Enriching features with Overture GERS IDS is a common operation that enables joining datasets.

Building Polygon objects colored based on NSI values.

Applications

  • Disaster risk assessment (e.g. flood or fire)
  • Insurance risk modeling
  • Urban planning and zoning
  • Emergency response and evacuation planning

Implementing point in polygon analysis

Implementation steps

  1. Load data for each dataset as a GeoDataFrame for a spatially filtered subregion.
  2. Use the GeoPandas method sjoin to determine which points lie within each polygon.
  3. Structure the output table.

Example UDF

This UDF loads the Overture and NSI datasets (with pre-defined subsets of columns), performs a spatial join to enrich the building polygons with hazard metadata, and returns the enriched GeoDataFrame. It can be used within a larger analysis workflow to enrich building polygons to calculate risk indices.

@fused.udf
def udf(bbox: fused.types.TileGDF=None):
# Import utility functions
utils = fused.load("https://github.com/fusedio/udfs/blob/main/public/Nsi_Overture/").utils

# Load both datasets
gdf_overture = utils.load_overture_gdf(bbox, overture_type = "building")
gdf_nsi = utils.load_nsi_gdf(bbox)

# Spatial join - keeps the polygon geometry
gdf_enriched_buildings = gdf_overture.sjoin(gdf_nsi)

# Drop cases where two points fall within one polygon
gdf_enriched_buildings.drop_duplicates(subset='id', keep='first', inplace=True, ignore_index=True)
return gdf_enriched_buildings

The UDF would return a GeoDataFrame with the Overture building geometry and the selected NSI columns:

┌─────────┬──────────────────────┬─────────────────────────────────────────────────────────────────────────────────────────┬─────────────────────────────────────────────────────────┬───────────┬─────────┬────────────┬────────────┬────────────┬───────────┐
│ column0 │ id │ geometry │ names │ st_damcat │ occtype │ med_yr_blt │ val_struct │ val_cont │ val_vehic │
│ int64 │ varchar │ varchar │ varchar │ varchar │ varchar │ int64 │ double │ double │ double │
├─────────┼──────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────┼───────────┼─────────┼────────────┼────────────┼────────────┼───────────┤
│ 0 │ 08b489e34622dfff02… │ POLYGON ((-97.7532121 30.2694096, -97.7532879 30.2692591, -97.7533541 30.2691264, -97… │ {'primary': 'The Bowie', 'common': None, 'rules': None} │ COM │ COM4 │ 2003 │ 963089.123 │ 963089.123 │ 27000.0 │
│ 1 │ 08b489e346229fff02… │ POLYGON ((-97.7530941 30.2695412, -97.7530164 30.2695137, -97.7529916 30.269566, -97.… │ │ COM │ COM8 │ 2003 │ 228292.24 │ 228292.24 │ 45000.0 │
└─────────┴──────────────────────┴─────────────────────────────────────────────────────────────────────────────────────────┴─────────────────────────────────────────────────────────┴───────────┴─────────┴────────────┴────────────┴────────────┴───────────┘

Conclusion

Point-in-polygon is a powerful technique to enrich datasets. By combining the detailed building footprints from Overture Maps with the comprehensive hazard and value information from the National Structure Inventory, we can generate rich, informative risk maps and indices.

However, it is important to consider the limitations and potential complexities of the data, such as quality and completeness, temporal alignment between datasets, scale and resolution differences, and cases where multiple points fall within a single polygon.

Demo app [beta]