metadata
title: Gazet
emoji: 🗺
colorFrom: green
colorTo: blue
sdk: docker
app_port: 7860
Gazet
Lean natural-language geocoder with GIS operations over Overture and Natural Earth parquet datasets.
Gazet is built to be easy to package and minimal to set up, exploring how small the setup for LLM-driven data applications can get. It is designed to work with small language models and parquet files.
The name is inspired by "gazetteer": a geographical dictionary or directory used in conjunction with a map or atlas.
Local setup
Python setup
Install the Python dependencies using uv:
uv sync --extra dev --extra demo
Data preparation
- Download Overture divisions data
- Download the 10m physical layer from Natural Earth
- Unzip the data
- Convert the Natural Earth data to parquet
Example: downloading the Overture divisions data
aws s3 sync s3://overturemaps-us-west-2/release/2026-02-18.0/theme=divisions/type=division_area/ data/overture/divisions_area
Example: unzipping the Natural Earth data and running the conversion script
unzip ~/Downloads/10m_physical.zip -d data/natural_earth
python -m ingest.convert_natural_earth data/natural_earth
Based on Ollama
For now, Gazet relies on Ollama. For remote (cloud) models, ensure you are logged in to Ollama.
Usage
python -m gazet
# then GET http://localhost:8000/search?q=Border%20between%20Loja%20and%20Piura
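The same request can be made from Python. The following is a minimal client sketch using only the standard library; it assumes the server is reachable at localhost:8000 as in the usage line above, and the helper names `build_search_url` and `search` are hypothetical, not part of Gazet.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the API listens on localhost:8000, as in the usage line above.
BASE_URL = "http://localhost:8000"


def build_search_url(query: str, base_url: str = BASE_URL) -> str:
    """Return the /search URL for a natural-language query (hypothetical helper)."""
    # quote_via=quote encodes spaces as %20 rather than '+'.
    params = urllib.parse.urlencode({"q": query}, quote_via=urllib.parse.quote)
    return f"{base_url}/search?{params}"


def search(query: str, base_url: str = BASE_URL, timeout: float = 60.0) -> dict:
    """GET /search and decode the JSON body. Requires a running Gazet server."""
    url = build_search_url(query, base_url)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

With the server running, `search("Border between Loja and Piura")` should return the decoded GeoJSON FeatureCollection as a Python dict. The 60-second timeout is an arbitrary choice for this sketch.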
API + Streamlit demo
uv run uvicorn gazet.api:app --reload # API on :8000
uv run streamlit run gazet_demo.py # demo UI
Modules
| Module | Contents |
|---|---|
| config.py | data paths, model name, SQL schema description |
| schemas.py | SUBTYPES, COUNTRIES, Place, PlacesResult |
| lm.py | DSPy signatures + LM init (extract, write_sql) |
| search.py | fuzzy search against divisions_area / natural_earth |
| sql.py | code-act SQL generation loop |
| export.py | GeoJSON FeatureCollection writer |
| api.py | FastAPI app with /search?q=... returning GeoJSON FeatureCollection |
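To illustrate what fuzzy place-name search means in general, here is a small sketch based on `difflib` from the standard library. It is not the implementation in search.py, which runs against the parquet datasets; the sample rows, the `fuzzy_search` function, and the 0.6 threshold are all assumptions made for this example.

```python
from difflib import SequenceMatcher

# Hypothetical sample rows; the real search runs against parquet data.
PLACES = [
    {"name": "Loja", "country": "EC"},
    {"name": "Piura", "country": "PE"},
    {"name": "Lima", "country": "PE"},
]


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def fuzzy_search(query: str, places: list, limit: int = 3, min_score: float = 0.6) -> list:
    """Return up to `limit` places whose name scores at least `min_score`."""
    scored = [(similarity(query, place["name"]), place) for place in places]
    # Drop weak matches, then rank the rest from best to worst.
    kept = [(score, place) for score, place in scored if score >= min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [place for _, place in kept[:limit]]
```

A misspelled query such as `fuzzy_search("Piurra", PLACES)` still resolves to the Piura row, which is the behaviour a natural-language geocoder needs when the extracted place name does not match the dataset exactly.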
Design notes
- api.py exposes GET /search?q=<query>; returns GeoJSON FeatureCollection and logs intermediate output.
- LM is initialised at import time in lm.py, suitable for a long-lived server process.
- Data lives in data/overture/ and data/natural_earth_geoparquet/ (not tracked in git).
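For readers unfamiliar with the response format, the sketch below shows the shape of a GeoJSON FeatureCollection as defined by RFC 7946. The function name `to_feature_collection` and the sample geometry are hypothetical; the actual writer in export.py may be structured differently.

```python
import json


def to_feature_collection(rows) -> dict:
    """Wrap (geometry, properties) pairs in a GeoJSON FeatureCollection.

    `rows` is an iterable of (geometry_dict, properties_dict) tuples, where
    each geometry is already a GeoJSON geometry object.
    """
    features = [
        {"type": "Feature", "geometry": geometry, "properties": properties}
        for geometry, properties in rows
    ]
    return {"type": "FeatureCollection", "features": features}


# Illustrative input only: the coordinates are approximate placeholders.
sample = to_feature_collection(
    [({"type": "Point", "coordinates": [-79.2, -4.0]}, {"name": "Loja"})]
)
geojson_text = json.dumps(sample)
```

GeoJSON coordinates are ordered longitude first, then latitude, which is worth keeping in mind when inspecting the API output.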
Attributions
Logo icon: search globe by popcornarts from Noun Project (CC BY 3.0)