---
language:
- en
- bn
- es
- id
license: apache-2.0
base_model: google/gemma-4-e2b-it
tags:
- unsloth
- gemma4
- trl
- spatial
- humanitarian
- tool-calling
- lora
- offline
- edge-ai
- raspberry-pi
pipeline_tag: text-generation
---

# gemma-4-e2b-spatial-lora

> **Gemma 4 E2B fine-tuned for humanitarian spatial tool calling**
> Part of [GemmaTerrain](https://huggingface.co/tasfuuu19): Multimodal GeoAI for the Unconnected

[![Base Model](https://img.shields.io/badge/Base-Gemma%204%20E2B-blue)](https://huggingface.co/google/gemma-4-e2b-it)
[![License](https://img.shields.io/badge/License-Apache%202.0-yellow)](https://opensource.org/licenses/Apache-2.0)
[![Trained with Unsloth](https://img.shields.io/badge/Trained%20with-Unsloth-purple)](https://github.com/unslothai/unsloth)

---

## Overview

This is a **LoRA adapter** for `google/gemma-4-e2b-it`, fine-tuned to output structured JSON tool calls for humanitarian spatial queries. It runs offline on low-power edge hardware (the ARM-based Raspberry Pi 5, the Steam Deck) as part of the GemmaTerrain system, with no internet, cloud, or GPU required.

Given a natural language query in English, Bangla, Spanish, or Indonesian, the model selects exactly one spatial tool and outputs its arguments as JSON.
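Because the whole reply is expected to be a single JSON object, a caller would normally parse and validate it before dispatching to a spatial tool. The sketch below shows one minimal way to do that. `parse_tool_call` and `VALID_TOOLS` are hypothetical names introduced only for illustration; the tool names are copied from the Spatial Tools table in this card.

```python
import json

# Tool names as listed in this card's Spatial Tools table.
VALID_TOOLS = {
    "find_nearest_poi_with_route",
    "list_pois",
    "calculate_route",
    "generate_isochrone",
    "find_along_route",
    "geocode_place",
}


def parse_tool_call(raw: str) -> tuple[str, dict]:
    """Parse one model reply into (tool_name, arguments) or raise ValueError.

    Hypothetical helper, not part of the GemmaTerrain codebase.
    """
    try:
        call = json.loads(raw.strip())
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(call, dict):
        raise ValueError("reply must be a JSON object")
    name = call.get("name")
    args = call.get("arguments")
    if not isinstance(name, str) or name not in VALID_TOOLS:
        raise ValueError(f"unknown tool: {name!r}")
    if not isinstance(args, dict):
        raise ValueError("'arguments' must be a JSON object")
    return name, args


name, args = parse_tool_call(
    '{"name": "generate_isochrone", '
    '"arguments": {"lat": 18.46, "lon": -66.07, "max_minutes": 15}}'
)
print(name, args["max_minutes"])
```

Rejecting unknown tool names up front keeps a malformed reply from reaching the spatial backend.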
**Example:**

| Input | Output |
|---|---|
| `"Find the nearest hospital to Camp 6"` | `{"name": "find_nearest_poi_with_route", "arguments": {"poi_type": "hospital", "lat": 21.2, "lon": 92.16}}` |
| `"15 minute walking radius from Condado"` | `{"name": "generate_isochrone", "arguments": {"lat": 18.46, "lon": -66.07, "max_minutes": 15}}` |
| `"Camp 6 er kache hospital kothay?"` | `{"name": "find_nearest_poi_with_route", "arguments": {"poi_type": "hospital", "lat": 21.2, "lon": 92.16}}` |
| `"Farmacias dentro de 1km de Ocean Park"` | `{"name": "list_pois", "arguments": {"poi_type": "pharmacy", "lat": 18.46, "lon": -66.05, "radius_m": 1000}}` |

---

## Spatial Tools

The model is trained to call exactly one of these 6 tools:

| Tool | Use case |
|---|---|
| `find_nearest_poi_with_route` | "Nearest hospital to X": returns the closest POI plus a walking route |
| `list_pois` | "Clinics within 2km of X": returns all POIs in the radius |
| `calculate_route` | "Walk from A to B": returns distance and time |
| `generate_isochrone` | "15 min walking area from X": returns the reachable boundary |
| `find_along_route` | "Pharmacies along the way from A to B" |
| `geocode_place` | "Where is X?": resolves a place name to coordinates |

**Supported POI types:** `hospital`, `clinic`, `doctors`, `pharmacy`, `police`, `fire_station`, `shelter`, `school`, `university`, `bank`, `atm`, `supermarket`, `marketplace`, `drinking_water`, `water_point`, `fuel`, `bus_station`, `place_of_worship`

---

## Training Details

| Parameter | Value |
|---|---|
| Base model | `google/gemma-4-e2b-it` |
| Method | QLoRA (4-bit quantization) |
| LoRA rank | 16 |
| LoRA alpha | 32 |
| Target modules | `q_proj`, `k_proj`, `v_proj`, `o_proj` |
| Max sequence length | 512 tokens |
| Training examples | 28 unique, augmented to 5,000 |
| Epochs | 3 |
| Batch size | 2 (effective: 8 with gradient accumulation) |
| Learning rate | 2e-4 |
| Precision | fp16 |
| Framework | Unsloth + TRL SFTTrainer |
| Hardware | Kaggle T4 (15 GB) |
| Training time | ~45 minutes |

---

## Dataset

28 hand-crafted humanitarian spatial query → tool call pairs across **3 real-world locations** and **4 languages**:

| Location | Context |
|---|---|
| **Cox's Bazar, Bangladesh** | Rohingya refugee camps (Camp 3, 6, 8W, 9, 12, 15) |
| **San Juan, Puerto Rico** | Post-hurricane disaster response (Condado, Santurce, Miramar) |
| **Jakarta, Indonesia** | Urban humanitarian operations (Menteng, Gelora, Gambir, Kemang) |

**Languages:** English · Bangla (transliterated) · Spanish · Indonesian

**Query types:** nearest POI, radius search, route calculation, isochrone, along-route search

---

## System Prompt

```
You are GemmaTerrain, a humanitarian spatial assistant running offline on edge hardware. Select exactly ONE tool. Output only the tool call JSON — no explanation.
Valid poi_type values: hospital, clinic, doctors, pharmacy, police, fire_station, shelter, school, university, bank, atm, supermarket, marketplace, drinking_water, water_point, fuel, bus_station, place_of_worship
```

---

## Usage

### With Unsloth (recommended)

```python
from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    model_name="tasfuuu19/gemma-4-e2b-spatial-lora",
    max_seq_length=512,
    load_in_4bit=True,
)
```

### With Transformers

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Load the base model in 4-bit, then attach the LoRA adapter on top of it.
base = AutoModelForCausalLM.from_pretrained(
    "google/gemma-4-e2b-it",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)
model = PeftModel.from_pretrained(base, "tasfuuu19/gemma-4-e2b-spatial-lora")
tokenizer = AutoTokenizer.from_pretrained("tasfuuu19/gemma-4-e2b-spatial-lora")
```

### Inference example

```python
messages = [
    # Abbreviated here; use the full text from the System Prompt section.
    {"role": "system", "content": "You are GemmaTerrain, a humanitarian spatial assistant..."},
    {"role": "user", "content": "Find the nearest hospital to Camp 6"},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# Move inputs to whatever device the model is on (CPU or GPU).
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, temperature=0.1)
# Decode only the newly generated tokens so the prompt is not echoed back.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
# {"name": "find_nearest_poi_with_route", "arguments": {"poi_type": "hospital", "lat": 21.2, "lon": 92.16}}
```

---

## GemmaTerrain System

This adapter is the **E2B component** of the GemmaTerrain dual-model routing system:

```
Query → Geocode Layer → Cactus Router → Gemma 4 E2B (this model) → Spatial Tool → Result
                                      ↘ Gemma 4 E4B (complex/multimodal queries)
```

The router sends queries to E2B for simple spatial lookups (nearest, list, route, isochrone) and to E4B for multimodal or complex multi-hop reasoning. Battery level is also a routing signal: below 20%, all queries go to E2B.
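The routing rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: `route_query`, the `query_kind` labels, and the `is_multimodal` flag are hypothetical and do not reflect the actual Cactus Router interface. Only the 20% battery threshold and the simple-versus-complex split come from the description above.

```python
# Hypothetical labels for the simple lookups that this card says go to E2B.
SIMPLE_KINDS = {"nearest", "list", "route", "isochrone"}
LOW_BATTERY_PCT = 20  # below this level, every query goes to E2B


def route_query(query_kind: str, is_multimodal: bool, battery_pct: float) -> str:
    """Return "E2B" or "E4B" for a query (illustrative only)."""
    # Low battery overrides everything else: always use the smaller model.
    if battery_pct < LOW_BATTERY_PCT:
        return "E2B"
    # Multimodal input or anything outside the simple lookups goes to E4B.
    if is_multimodal or query_kind not in SIMPLE_KINDS:
        return "E4B"
    return "E2B"


print(route_query("nearest", False, 80))    # E2B
print(route_query("multi_hop", False, 80))  # E4B
print(route_query("multi_hop", True, 15))   # E2B (low battery)
```

Checking the battery first means the override applies even to queries that would otherwise need the larger model.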

The spatial backend uses **NetworKit** (Dijkstra routing) + **DuckDB** (spatial queries) on OpenStreetMap data, running fully offline.

---

## Limitations

- Training set is small (28 unique examples), so generalization relies heavily on the base model
- Coordinates in training data are fixed to 3 locations; novel locations require geocoding pre-processing
- Walking speed hardcoded at 5 km/h (83.33 m/min)
- Not suitable for driving or cycling routing

---

## License

Apache 2.0, the same license as the base Gemma 4 model.