Text Generation
PEFT
Safetensors
English
code
gis
geospatial
geopandas
shapely
rasterio
osmnx
folium
lora
trl
sft
conversational
Instructions to use RhodWeo/GIS-Coder-7B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use RhodWeo/GIS-Coder-7B with PEFT:
from peft import PeftModel from transformers import AutoModelForCausalLM base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-0.5B-Instruct") model = PeftModel.from_pretrained(base_model, "RhodWeo/GIS-Coder-7B") - Notebooks
- Google Colab
- Kaggle
File size: 3,157 Bytes
b0627d1 50b1aaa b0627d1 50b1aaa b0627d1 50b1aaa b0627d1 ad4257a b0627d1 ad4257a b0627d1 ad4257a 50b1aaa ad4257a 50b1aaa ad4257a 50b1aaa ad4257a 50b1aaa ad4257a b0627d1 ad4257a 50b1aaa ad4257a 50b1aaa ad4257a b0627d1 ad4257a b0627d1 ad4257a b0627d1 ad4257a b0627d1 ad4257a b0627d1 ad4257a b0627d1 ad4257a b0627d1 ad4257a b0627d1 ad4257a b0627d1 50b1aaa b0627d1 ad4257a | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 | ---
license: apache-2.0
base_model: Qwen/Qwen2.5-Coder-0.5B-Instruct
tags:
- code
- gis
- geospatial
- geopandas
- shapely
- rasterio
- osmnx
- folium
- peft
- lora
- trl
- sft
language:
- en
pipeline_tag: text-generation
library_name: peft
---
# GIS-Coder β A Code Model for Geographic Information Systems
A LoRA-adapted code model specialized for GIS and geospatial Python programming. Includes a **ready-to-run training package** for scaling up to 7B on your own GPU cluster.
## π¦ This Repo Contains
| File | Description |
|------|-------------|
| `adapter_model.safetensors` | Trained LoRA adapter (0.5B base, proof of concept) |
| `train_7b.py` | **Production 7B QLoRA training script** with CLI args |
| `evaluate.py` | Evaluation suite (12 GIS benchmarks with scoring) |
| `requirements.txt` | All dependencies |
| `TRAINING_README.md` | **Detailed training guide** β hardware, hyperparameters, ablations |
## π Train the 7B Model on Your GPUs
```bash
# 1. Clone this repo
git clone https://huggingface.co/RhodWeo/GIS-Coder-7B
cd GIS-Coder-7B
# 2. Install deps
pip install -r requirements.txt
# 3. Login
huggingface-cli login
# 4. Train! (A100 80GB recommended)
python train_7b.py
# For A10G/RTX 4090 (24GB):
python train_7b.py --batch_size 1 --grad_accum 16 --max_length 2048
# For H100:
python train_7b.py --batch_size 4 --grad_accum 4 --max_length 8192
# 5. Evaluate
python evaluate.py --adapter_id ./gis-coder-7b-output/final --compare_base
```
See **[TRAINING_README.md](TRAINING_README.md)** for the full guide with hardware-specific settings, ablation ideas, and expected results.
## πΊοΈ GIS Libraries Covered (13)
| Priority | Libraries | Coverage |
|----------|-----------|----------|
| **Tier 1** (0% baseline) | OSMnx, MovingPandas, Rasterio, GDAL, PyProj | Heavy β these are where models fail |
| **Tier 2** | GeoPandas, Shapely, H3 | Core GIS operations |
| **Tier 3** | Folium, xarray, PyQGIS, Fiona, PySAL | Real-world workflows |
## π Proof-of-Concept Results (0.5B)
Trained on CPU with the smaller base model to validate the approach:
| Metric | Start β End |
|--------|------------|
| **Loss** | 1.52 β 0.88 (β42%) |
| **Token Accuracy** | 69.3% β **79.3%** (+10pp) |
| **Eval Quality** | **85%** (code + library + CoT + function) |
## π¬ Training Recipe
Based on published research:
| Principle | Source | Applied |
|-----------|--------|---------|
| QLoRA SFT beats 72B models | [CFD paper](https://arxiv.org/abs/2504.09602) | r=32, all-linear, lr=2e-4 |
| Qwen2.5-Coder best backbone | [MapCoder-Lite](https://arxiv.org/abs/2509.17489) | Base model selection |
| Models score 0% on GIS | [GIS Benchmark](https://arxiv.org/abs/2410.04617) | Heavy OSMnx/MovingPandas coverage |
| CoT boosts +20.9% pass@1 | CFD paper ablation | All examples include CoT |
| Target all linear layers | [LoRA Without Regret](https://arxiv.org/abs/2410.13732) | `target_modules="all-linear"` |
## π Dataset
**[RhodWeo/gis-code-instructions](https://huggingface.co/datasets/RhodWeo/gis-code-instructions)** β 70 expert-curated examples with Chain-of-Thought annotations.
## License
Apache 2.0
|