---
library_name: transformers
tags:
- lora
- peft
- vision
- safety
- drone
- pixtral
- unsloth
base_model: unsloth/pixtral-12b-2409-bnb-4bit
license: apache-2.0
pipeline_tag: image-text-to-text
---

# Helpstral: LoRA Fine-tuned Pixtral 12B for Drone Safety Assessment

LoRA adapter for real-time pedestrian safety classification from drone camera images, built for the [Louise AI Safety Drone Escort](https://github.com/benbarrett735-png/Mistral-Worldwide-Hackathon) system.

## What it does

Given a drone camera frame during an escort mission, the model outputs a structured threat assessment:

- **threat_level** (1–10): evidence-based risk score
- **status**: SAFE, CAUTION, or DISTRESS
- **people_count**: number of people visible in frame
- **user_moving**: whether the escorted person appears to be walking
- **proximity_alert**: whether another person is within ~3m of the user
- **observations**: what the model sees (lighting, obstacles, people)
- **pattern**: temporal reasoning from multi-frame context
- **reasoning**: explanation connecting image + location data
- **action**: CONTINUE_MONITORING, INCREASE_SCAN_RATE, ALERT_USER, EMERGENCY_HOVER, etc.

This powers operator-in-the-loop alerts: when the user stops moving for 10+ seconds or another person is in close proximity, mission control receives a review request.

## Training

| Parameter | Value |
|-----------|-------|
| Base model | Pixtral 12B (Unsloth 4-bit) |
| Method | LoRA (PEFT), trained with Unsloth |
| LoRA rank (r) | 64 |
| LoRA alpha | 128 |
| Target modules | language model attention (q_proj, v_proj, etc.) |
| Task type | CAUSAL_LM |
| PEFT version | 0.18.1 |

## Usage

**Inference server (Colab):** See [`helpstral/serve_colab.ipynb`](https://github.com/benbarrett735-png/Mistral-Worldwide-Hackathon/blob/main/helpstral/serve_colab.ipynb) in the Louise repo. Run it on a T4 GPU, then set `HELPSTRAL_ENDPOINT=` in `.env`.

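
**Parsing the assessment (sketch):** However the model is served, its reply has to be turned into the structured fields listed under "What it does". The block below is a minimal sketch of one way to do that, not the Louise implementation. It assumes the reply is free text containing one JSON object with those nine fields. The names `extract_assessment`, `needs_operator_review`, and `seconds_stationary` are hypothetical; the field names, the three status values, the 1–10 range, and the 10-second threshold come from this card. How long the user has been stationary is assumed to be tracked by the caller across frames.

```python
import json
from typing import Any, Dict

# Field names as listed in this model card.
REQUIRED_FIELDS = (
    "threat_level", "status", "people_count", "user_moving",
    "proximity_alert", "observations", "pattern", "reasoning", "action",
)
VALID_STATUSES = {"SAFE", "CAUTION", "DISTRESS"}


def _validate(obj: Dict[str, Any]) -> Dict[str, Any]:
    """Check the two fields whose allowed values the card states explicitly."""
    level = obj["threat_level"]
    if isinstance(level, bool) or not isinstance(level, int) or not 1 <= level <= 10:
        raise ValueError(f"threat_level must be an integer in 1..10, got {level!r}")
    if obj["status"] not in VALID_STATUSES:
        raise ValueError(f"unexpected status {obj['status']!r}")
    return obj


def extract_assessment(text: str) -> Dict[str, Any]:
    """Return the first JSON object in `text` that carries every required field.

    Scans from each '{' so that surrounding prose (or an echoed prompt)
    does not break parsing.
    """
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            obj, _ = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict) and all(key in obj for key in REQUIRED_FIELDS):
            return _validate(obj)
        start = text.find("{", start + 1)
    raise ValueError("no complete assessment object found in model output")


def needs_operator_review(
    assessment: Dict[str, Any],
    seconds_stationary: float,      # hypothetical: tracked by the caller
    stationary_limit: float = 10.0,  # "10+ seconds" from the card
) -> bool:
    """Mirror the review rule described above: proximity alert or a long stop."""
    return bool(assessment["proximity_alert"]) or seconds_stationary >= stationary_limit
```

With the local-loading example, `extract_assessment(result)` would take the place of the manual JSON parsing step. A `ValueError` signals a reply that should be retried or escalated rather than trusted.
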
**Load locally:**

```python
import torch
from transformers import AutoProcessor, LlavaForConditionalGeneration, BitsAndBytesConfig
from peft import PeftModel
from PIL import Image

# Load the base model in 4-bit to reduce GPU memory use.
processor = AutoProcessor.from_pretrained("mistral-community/pixtral-12b")
model = LlavaForConditionalGeneration.from_pretrained(
    "mistral-community/pixtral-12b",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16),
    device_map="auto",
)

# Attach the Helpstral adapter, then fold it into the base weights.
model = PeftModel.from_pretrained(model, "BenBarr/helpstral")
model = model.merge_and_unload().eval()

img = Image.open("drone_frame.jpg").convert("RGB")
chat = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "Analyze this drone camera frame. Output JSON: threat_level, status, people_count, user_moving, proximity_alert, observations, pattern, reasoning, action."},
]}]
prompt = processor.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
inputs = processor(text=prompt, images=[img], return_tensors="pt").to(model.device)

with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=400, do_sample=False)

# generate() returns prompt + completion; decode only the newly generated tokens.
new_tokens = out[:, inputs["input_ids"].shape[1]:]
result = processor.batch_decode(new_tokens, skip_special_tokens=True)[0]
# Parse JSON from result...
```

## Architecture

Helpstral sits in the Louise multi-agent drone escort system:

- **Helpstral** (this model): safety/threat assessment from camera images
- **Flystral**: flight control from camera images ([BenBarr/flystral](https://huggingface.co/BenBarr/flystral))
- **Louise**: conversational safety companion (Ministral 3B)

When the fine-tuned endpoint is available, Helpstral uses this adapter. When that endpoint is offline, it falls back to Pixtral 12B via the Mistral API with function calling (queries real OpenStreetMap data for streetlight density, etc.).

## Developed by

Ben Barrett, Mistral Worldwide Hackathon 2026