OSMGraphCLIP-MS-L10
A pretrained location encoder from the OSMGraphCLIP framework. It maps geographic coordinates (longitude, latitude) to dense vector embeddings that capture the semantic character of a location — its land use, built environment, road network, and landscape context — learned from freely available OpenStreetMap data.
This is the MS-L10 variant: multiscale spherical-harmonic bands with Legendre polynomial degree 10.
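For intuition about what a Legendre degree means here: spherical harmonics are built from (associated) Legendre polynomials evaluated at the sine of the latitude, and a higher maximum degree allows finer spatial detail. The sketch below evaluates the plain Legendre polynomials P_0 through P_10 with Bonnet's recurrence. It is illustrative only and does not reproduce the model's actual positional encoding; the function name `legendre_up_to` is hypothetical.

```python
import math


def legendre_up_to(max_degree: int, x: float) -> list[float]:
    """Return [P_0(x), ..., P_max_degree(x)] using Bonnet's recurrence."""
    if max_degree < 0:
        raise ValueError("max_degree must be non-negative")
    values = [1.0]  # P_0(x) = 1
    if max_degree >= 1:
        values.append(x)  # P_1(x) = x
    for n in range(1, max_degree):
        # (n + 1) * P_{n+1}(x) = (2n + 1) * x * P_n(x) - n * P_{n-1}(x)
        nxt = ((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1)
        values.append(nxt)
    return values


# Evaluate at the sine of Berlin's latitude (equivalently, cosine of colatitude).
lat_deg = 52.52
x = math.sin(math.radians(lat_deg))
features = legendre_up_to(10, x)  # 11 values, one per degree 0..10
```

A degree-10 expansion therefore yields 11 polynomial values per latitude before any longitude-dependent terms are added.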
Model description
OSMGraphCLIP trains a CLIP-style contrastive model that aligns two views of a location:
- Graph encoder (OSMHeteroGAT): processes a heterogeneous OSM graph (points, lines, polygons — roads, buildings, land use, POIs) centered at the location, using SBERT node features.
- Location encoder: maps geographic coordinates through spherical-harmonic positional encodings and a SIREN network.
A symmetric cross-entropy loss pulls matching graph–coordinate pairs together in a shared embedding space. The graph encoder is used only during training: afterwards, the location encoder alone is sufficient for inference, so no OSM data is needed at query time.
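The symmetric cross-entropy objective can be sketched as follows. This is a minimal, dependency-free illustration of a CLIP-style loss, not the repository's training code: the function names are hypothetical, the temperature of 0.07 is an assumed value, and embeddings are assumed to be L2-normalised.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax of a sequence of floats."""
    m = max(row)
    log_sum = m + math.log(sum(math.exp(v - m) for v in row))
    return [v - log_sum for v in row]


def symmetric_cross_entropy(graph_emb, loc_emb, temperature=0.07):
    """CLIP-style loss for N matched (graph, location) embedding pairs.

    graph_emb[i] and loc_emb[i] are assumed to describe the same location.
    """
    n = len(graph_emb)
    # logits[i][j]: scaled dot product between graph i and location j.
    logits = [
        [sum(g * l for g, l in zip(graph_emb[i], loc_emb[j])) / temperature
         for j in range(n)]
        for i in range(n)
    ]
    # Graph -> location: row i should pick column i.
    loss_g2l = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Location -> graph: column j should pick row j.
    columns = [list(col) for col in zip(*logits)]
    loss_l2g = -sum(log_softmax(columns[j])[j] for j in range(n)) / n
    return 0.5 * (loss_g2l + loss_l2g)


# Toy check: matched pairs give a low loss, shuffled pairs a high one.
graphs = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = symmetric_cross_entropy(graphs, [[1.0, 0.0], [0.0, 1.0]])
shuffled_loss = symmetric_cross_entropy(graphs, [[0.0, 1.0], [1.0, 0.0]])
```

Averaging the two directions is what makes the loss symmetric: neither encoder is treated as the fixed "query" side.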
Other pretrained variants (MS-L40, A-L40, A-L10) are available in the GitHub repository.
Intended uses
- Geographic/geospatial representation learning
- Downstream prediction tasks: climate, ecology, socioeconomics, public health, land cover, biodiversity, wildfire forecasting
- Location-conditioned retrieval or similarity search
- Any task that benefits from a semantically rich, globally consistent coordinate embedding
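As an illustration of the similarity-search use case, the sketch below ranks candidate locations by cosine similarity to a query embedding. The 3-dimensional vectors are made-up stand-ins for real encoder outputs, and the helper names (`cosine_similarity`, `top_k_similar`) are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_similar(query, candidates, k=3):
    """Return the k (name, score) pairs most similar to the query.

    candidates maps a location name to its embedding.
    """
    scored = [(name, cosine_similarity(query, emb))
              for name, emb in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings; in practice these would come from the location encoder.
index = {
    "place_a": [0.9, 0.1, 0.0],
    "place_b": [0.0, 1.0, 0.0],
    "place_c": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]
results = top_k_similar(query, index, k=2)  # place_a ranks first, then place_c
```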
Training data
Approximately 200,000 globally-diverse locations sampled from:
- satclip_locations.csv: primary location set
- h3_locations.csv: H3-sampled globally-uniform locations
For each location, an OSM graph was fetched and used as the graph encoder's input during training.
How to use
Install the package from the GitHub repository:
pip install git+https://github.com/d-michail/osmgraphclip.git
Python API:
import torch
from osmgraphclip.load import get_osmgraphclip_from_hf
# Load the location encoder (no OSM data needed at inference)
location_encoder = get_osmgraphclip_from_hf("osmgraphclip-ms-l10", device="cpu")
# coords: tensor of shape (N, 2) in (lon, lat) order
coords = torch.tensor([[13.40, 52.52]]) # Berlin
embedding = location_encoder(coords) # (N, D)
Command-line:
python infer.py --hf-model osmgraphclip-ms-l10 --lat 52.52 --lon 13.40
Note: the Python API expects coordinates in (longitude, latitude) order, while the command-line tool takes them as separate --lat and --lon flags.
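Since many data sources store coordinates as (latitude, longitude), a small guard can help avoid silently swapped inputs. The helper below is a hypothetical convenience, not part of the osmgraphclip package, and assumes coordinates are in decimal degrees, as the Berlin example suggests.

```python
def to_lon_lat(*, lat: float, lon: float) -> list[float]:
    """Build a [lon, lat] pair from named values, validating degree ranges.

    Keyword-only arguments force callers to name each value explicitly.
    """
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude {lat} is outside [-90, 90]")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude {lon} is outside [-180, 180]")
    return [lon, lat]


berlin = to_lon_lat(lat=52.52, lon=13.40)
# torch.tensor([berlin]) would then have shape (1, 2) in (lon, lat) order.
```

The range check only catches a swap when the mistaken latitude exceeds 90 degrees in magnitude, so naming the arguments remains the main safeguard.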
Evaluation
On a suite of downstream geospatial tasks (climate, ecology, socioeconomics, public health, land cover, biodiversity, wildfire forecasting), OSMGraphCLIP performs competitively with or surpasses satellite-imagery baselines. It shows particular strength on socioeconomic and public health tasks, where OSM's semantic annotations of the human-built environment offer an advantage over pixel-based approaches. Qualitative analysis shows that the learned embeddings coherently organise geographic space, recovering biome boundaries and urban-to-rural gradients.
Citation
@article{michail2026osmgraphclip,
title = {OSMGraphCLIP: Learning Global Location Representations from OpenStreetMap Graphs},
author = {Michail, Dimitrios and Saka, Eleni and Giannopoulos, Ioannis and Papoutsis, Ioannis},
journal = {arXiv preprint arXiv:2606.08046},
year = {2026}
}