Text Generation
PEFT
Safetensors
English
code
gis
geospatial
geopandas
shapely
rasterio
osmnx
folium
lora
trl
sft
conversational
Instructions to use RhodWeo/GIS-Coder-7B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use RhodWeo/GIS-Coder-7B with PEFT:
from peft import PeftModel from transformers import AutoModelForCausalLM base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-0.5B-Instruct") model = PeftModel.from_pretrained(base_model, "RhodWeo/GIS-Coder-7B") - Notebooks
- Google Colab
- Kaggle
| license: apache-2.0 | |
| base_model: Qwen/Qwen2.5-Coder-0.5B-Instruct | |
| tags: | |
| - code | |
| - gis | |
| - geospatial | |
| - geopandas | |
| - shapely | |
| - rasterio | |
| - osmnx | |
| - folium | |
| - peft | |
| - lora | |
| - trl | |
| - sft | |
| language: | |
| - en | |
| pipeline_tag: text-generation | |
| library_name: peft | |
| # GIS-Coder β A Code Model for Geographic Information Systems | |
| A LoRA-adapted code model specialized for GIS and geospatial Python programming. Includes a **ready-to-run training package** for scaling up to 7B on your own GPU cluster. | |
| ## π¦ This Repo Contains | |
| | File | Description | | |
| |------|-------------| | |
| | `adapter_model.safetensors` | Trained LoRA adapter (0.5B base, proof of concept) | | |
| | `train_7b.py` | **Production 7B QLoRA training script** with CLI args | | |
| | `evaluate.py` | Evaluation suite (12 GIS benchmarks with scoring) | | |
| | `requirements.txt` | All dependencies | | |
| | `TRAINING_README.md` | **Detailed training guide** β hardware, hyperparameters, ablations | | |
| ## π Train the 7B Model on Your GPUs | |
| ```bash | |
| # 1. Clone this repo | |
| git clone https://huggingface.co/RhodWeo/GIS-Coder-7B | |
| cd GIS-Coder-7B | |
| # 2. Install deps | |
| pip install -r requirements.txt | |
| # 3. Login | |
| huggingface-cli login | |
| # 4. Train! (A100 80GB recommended) | |
| python train_7b.py | |
| # For A10G/RTX 4090 (24GB): | |
| python train_7b.py --batch_size 1 --grad_accum 16 --max_length 2048 | |
| # For H100: | |
| python train_7b.py --batch_size 4 --grad_accum 4 --max_length 8192 | |
| # 5. Evaluate | |
| python evaluate.py --adapter_id ./gis-coder-7b-output/final --compare_base | |
| ``` | |
| See **[TRAINING_README.md](TRAINING_README.md)** for the full guide with hardware-specific settings, ablation ideas, and expected results. | |
| ## πΊοΈ GIS Libraries Covered (13) | |
| | Priority | Libraries | Coverage | | |
| |----------|-----------|----------| | |
| | **Tier 1** (0% baseline) | OSMnx, MovingPandas, Rasterio, GDAL, PyProj | Heavy β these are where models fail | | |
| | **Tier 2** | GeoPandas, Shapely, H3 | Core GIS operations | | |
| | **Tier 3** | Folium, xarray, PyQGIS, Fiona, PySAL | Real-world workflows | | |
| ## π Proof-of-Concept Results (0.5B) | |
| Trained on CPU with the smaller base model to validate the approach: | |
| | Metric | Start β End | | |
| |--------|------------| | |
| | **Loss** | 1.52 β 0.88 (β42%) | | |
| | **Token Accuracy** | 69.3% β **79.3%** (+10pp) | | |
| | **Eval Quality** | **85%** (code + library + CoT + function) | | |
| ## π¬ Training Recipe | |
| Based on published research: | |
| | Principle | Source | Applied | | |
| |-----------|--------|---------| | |
| | QLoRA SFT beats 72B models | [CFD paper](https://arxiv.org/abs/2504.09602) | r=32, all-linear, lr=2e-4 | | |
| | Qwen2.5-Coder best backbone | [MapCoder-Lite](https://arxiv.org/abs/2509.17489) | Base model selection | | |
| | Models score 0% on GIS | [GIS Benchmark](https://arxiv.org/abs/2410.04617) | Heavy OSMnx/MovingPandas coverage | | |
| | CoT boosts +20.9% pass@1 | CFD paper ablation | All examples include CoT | | |
| | Target all linear layers | [LoRA Without Regret](https://arxiv.org/abs/2410.13732) | `target_modules="all-linear"` | | |
| ## π Dataset | |
| **[RhodWeo/gis-code-instructions](https://huggingface.co/datasets/RhodWeo/gis-code-instructions)** β 70 expert-curated examples with Chain-of-Thought annotations. | |
| ## License | |
| Apache 2.0 | |