OSMGraphCLIP-MS-L10

A pretrained location encoder from the OSMGraphCLIP framework. It maps geographic coordinates (longitude, latitude) to dense vector embeddings that capture the semantic character of a location — its land use, built environment, road network, and landscape context — learned from freely available OpenStreetMap data.

This is the MS-L10 variant: multiscale spherical-harmonic bands with Legendre polynomial degree 10.

Model description

OSMGraphCLIP trains a CLIP-style contrastive model that aligns two views of a location:

Graph encoder (OSMHeteroGAT): processes a heterogeneous OSM graph of points, lines, and polygons (roads, buildings, land use, POIs) centered at the location, using SBERT (Sentence-BERT) text embeddings as node features.
Location encoder: maps geographic coordinates through spherical-harmonic positional encodings and a SIREN network.
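To give a feel for what "Legendre polynomial degree 10" refers to, the sketch below evaluates the Legendre polynomials P_0 through P_10 at sin(latitude) using the standard three-term recurrence. These are, up to normalization, the longitude-independent (zonal) spherical-harmonic terms. It is an illustration only: the function names are hypothetical, and the encoder's actual basis, normalization, and multiscale banding are not reproduced here.

```python
import math


def legendre_values(x, max_degree):
    """Return [P_0(x), ..., P_max_degree(x)] via Bonnet's recurrence.

    (n + 1) * P_{n+1}(x) = (2n + 1) * x * P_n(x) - n * P_{n-1}(x)
    """
    values = [1.0, x]  # P_0 and P_1
    for n in range(1, max_degree):
        nxt = ((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1)
        values.append(nxt)
    return values[: max_degree + 1]


def zonal_features(lat_deg, max_degree=10):
    """Hypothetical helper: zonal (order-0) terms for a latitude in degrees.

    Spherical harmonics use the cosine of the colatitude, which equals
    the sine of the latitude.
    """
    return legendre_values(math.sin(math.radians(lat_deg)), max_degree)


features = zonal_features(52.52)  # latitude of Berlin
print(len(features))  # 11 values: degrees 0 through 10
```

A full spherical-harmonic encoding would also include the longitude-dependent terms; those are omitted to keep the sketch short.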

A symmetric cross-entropy loss aligns matching graph–coordinate pairs in a shared embedding space. The graph encoder is used only during training: after training, the location encoder alone is sufficient for inference, so no OSM data is needed at query time.
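The contrastive objective can be sketched in plain Python as follows. This is a minimal, dependency-free illustration of a CLIP-style symmetric cross-entropy, not the project's training code: the function names are hypothetical, the embeddings are assumed to be L2-normalized, and the temperature of 0.07 is an assumed value rather than one taken from the paper.

```python
import math


def _log_softmax(row):
    """Numerically stable log-softmax of a list of floats."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(v - m) for v in row))
    return [v - log_z for v in row]


def symmetric_cross_entropy(graph_emb, loc_emb, temperature=0.07):
    """CLIP-style loss for N matching (graph, location) embedding pairs.

    Row i of graph_emb is assumed to match row i of loc_emb.
    """
    n = len(graph_emb)
    # logits[i][j] = <graph_i, location_j> / temperature
    logits = [
        [sum(a * b for a, b in zip(g, l)) / temperature for l in loc_emb]
        for g in graph_emb
    ]
    logits_t = [list(col) for col in zip(*logits)]
    # Cross-entropy with the diagonal as the target, in both directions.
    graph_to_loc = -sum(_log_softmax(logits[i])[i] for i in range(n)) / n
    loc_to_graph = -sum(_log_softmax(logits_t[i])[i] for i in range(n)) / n
    return 0.5 * (graph_to_loc + loc_to_graph)


graph = [[1.0, 0.0], [0.0, 1.0]]
aligned = symmetric_cross_entropy(graph, [[1.0, 0.0], [0.0, 1.0]])
swapped = symmetric_cross_entropy(graph, [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)  # matching pairs give the lower loss
```

In practice this would be computed on batched tensors with an autodiff framework; the list-based version is only meant to make the two directions of the loss explicit.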

Other pretrained variants (MS-L40, A-L40, A-L10) are available in the GitHub repository.

Intended uses

Geographic/geospatial representation learning
Downstream prediction tasks: climate, ecology, socioeconomics, public health, land cover, biodiversity, wildfire forecasting
Location-conditioned retrieval or similarity search
Any task that benefits from a semantically rich, globally consistent coordinate embedding
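As an illustration of similarity search, the sketch below ranks candidate locations by cosine similarity to a query embedding. The three-dimensional vectors and the site names are made-up placeholders, not model outputs; in real use each vector would be an embedding produced by the location encoder.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, candidates, k=2):
    """Return the k (name, score) pairs most similar to the query."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Placeholder embeddings (hypothetical values, for illustration only).
candidates = {
    "site_a": [1.0, 0.0, 0.0],
    "site_b": [0.6, 0.8, 0.0],
    "site_c": [0.0, 1.0, 0.0],
}
query = [1.0, 0.05, 0.0]

for name, score in top_k(query, candidates):
    print(name, round(score, 3))
```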

Training data

Approximately 200,000 globally diverse locations sampled from:

satclip_locations.csv: primary location set
h3_locations.csv: H3-sampled, globally uniform locations

For each location, an OSM graph was fetched and used as the graph encoder's input during training.

How to use

Install the package from the GitHub repository:

pip install git+https://github.com/d-michail/osmgraphclip.git

Python API:

import torch
from osmgraphclip.load import get_osmgraphclip_from_hf

# Load the location encoder (no OSM data needed at inference)
location_encoder = get_osmgraphclip_from_hf("osmgraphclip-ms-l10", device="cpu")

# coords: tensor of shape (N, 2) in (lon, lat) order
coords = torch.tensor([[13.40, 52.52]])   # Berlin
embedding = location_encoder(coords)      # (N, D)

Command-line:

python infer.py --hf-model osmgraphclip-ms-l10 --lat 52.52 --lon 13.40

Note: in the Python API, each row of the coordinate tensor must be in (longitude, latitude) order, not the more common (latitude, longitude).
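Because swapping the two values is an easy mistake, one option is to build the input rows through a small helper that takes named arguments and checks ranges. The helper below is a hypothetical sketch, not part of the osmgraphclip package, and it assumes longitudes in [-180, 180] and latitudes in [-90, 90].

```python
def to_lon_lat(lat, lon):
    """Return a [lon, lat] row after checking both values are in range.

    Hypothetical helper; the accepted ranges are an assumption.
    """
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    return [float(lon), float(lat)]


places = {
    "Berlin": {"lat": 52.52, "lon": 13.40},
    "Sydney": {"lat": -33.87, "lon": 151.21},
}
rows = [to_lon_lat(**p) for p in places.values()]
print(rows)  # each row is [lon, lat]
```

The resulting list can be wrapped with torch.tensor(rows) to obtain the (N, 2) input described above. A range check only catches swaps where a value exceeds the latitude bounds; a pair such as Berlin's, where both values are below 90, would pass even if swapped, so named arguments remain the main safeguard.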

Evaluation

On a suite of downstream geospatial tasks (climate, ecology, socioeconomics, public health, land cover, biodiversity, wildfire forecasting), OSMGraphCLIP performs competitively with or surpasses satellite-imagery baselines. It shows particular strength on socioeconomic and public health tasks, where OSM's semantic annotations of the human-built environment offer an advantage over pixel-based approaches. Qualitative analysis shows that the learned embeddings coherently organise geographic space, recovering biome boundaries and urban-to-rural gradients.

Citation

@article{michail2026osmgraphclip,
  title   = {OSMGraphCLIP: Learning Global Location Representations from OpenStreetMap Graphs},
  author  = {Michail, Dimitrios and Saka, Eleni and Giannopoulos, Ioannis and Papoutsis, Ioannis},
  journal = {arXiv preprint arXiv:2606.08046},
  year    = {2026}
}
