---
language: en
license: apache-2.0
tags:
- vision-language-model
- image-text-to-text
- pytorch
- disaster-detection
datasets:
- TurkishCodeMan/Sentinel2-Turkey-VLM
base_model: LiquidAI/LFM2.5-VL-450M
---
# LFM2.5-VL-450M Pre-Disaster Detection

This model is `LiquidAI/LFM2.5-VL-450M` fine-tuned on the `TurkishCodeMan/Sentinel2-Turkey-VLM` dataset to extract pre-disaster risk features from Sentinel-2 imagery.

## Model Details
- **Base Model:** [LiquidAI/LFM2.5-VL-450M](https://huggingface.co/LiquidAI/LFM2.5-VL-450M)
- **Dataset:** [TurkishCodeMan/Sentinel2-Turkey-VLM](https://huggingface.co/datasets/TurkishCodeMan/Sentinel2-Turkey-VLM)
- **Task:** Vision-language pre-disaster feature extraction from Sentinel-2 RGB and CIR (color-infrared) imagery.

## Training Strategy
- **Fine-Tuning Method:** Full Fine-Tuning (Full FT)
- **Epochs:** 5
- **Learning Rate:** 2e-5 (Cosine Scheduler)
- **Batch Size:** 2 per step with 4 gradient accumulation steps (effective batch size 8)
- **Loss Masking:** Loss is computed on assistant tokens only; prompt and padding tokens are labeled -100 so they are ignored.
- **Hardware:** 1x GPU (bfloat16)
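
The loss masking listed above can be sketched in plain Python. This is a minimal illustration, not the training script used for this model: `build_labels`, `prompt_len`, and the toy token IDs are hypothetical, and it assumes the usual PyTorch convention that cross-entropy skips positions labeled -100.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy skips by default


def build_labels(input_ids, attention_mask, prompt_len):
    """Copy input_ids into labels, masking prompt and padding positions.

    Hypothetical helper: prompt_len is the number of leading tokens that
    belong to the system/user turns (everything before the assistant reply).
    """
    labels = []
    for pos, (token_id, keep) in enumerate(zip(input_ids, attention_mask)):
        if pos < prompt_len or keep == 0:
            labels.append(IGNORE_INDEX)  # prompt or padding: no loss
        else:
            labels.append(token_id)  # assistant token: contributes to the loss
    return labels


# Toy sequence: 3 prompt tokens, 2 assistant tokens, 2 right-padding tokens
input_ids = [11, 12, 13, 21, 22, 0, 0]
attention_mask = [1, 1, 1, 1, 1, 0, 0]
labels = build_labels(input_ids, attention_mask, prompt_len=3)
print(labels)  # [-100, -100, -100, 21, 22, -100, -100]
```

Only the two assistant positions keep their token IDs, so gradients come solely from the JSON answer the model is expected to produce.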

The model was trained to output a strictly formatted JSON object covering five core disaster risk features (Risk Level, Dry Vegetation, Steep Terrain, Water Body, Image Quality). On the test set it achieved 84% overall exact-match accuracy, and 100% of its outputs were valid JSON.
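
One plausible way to compute these two numbers is sketched below. The evaluation script is not part of this card, so the definition is an assumption: a prediction counts as an exact match only when it parses as JSON and every field equals the reference. `score`, `predictions`, and `references` are hypothetical names, and the data is a toy example.

```python
import json


def score(predictions, references):
    """Return (valid_json_rate, exact_match_rate) for raw model outputs."""
    valid = 0
    exact = 0
    for raw, ref in zip(predictions, references):
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            continue  # invalid JSON counts against both metrics
        valid += 1
        if parsed == ref:  # every field must equal the reference
            exact += 1
    total = len(references)
    return valid / total, exact / total


reference = {
    "risk_level": "High",
    "dry_vegetation_present": True,
    "steep_terrain": False,
    "water_body_present": False,
    "image_quality_limited": False,
}
predictions = [
    json.dumps(reference),                              # exact match
    json.dumps({**reference, "risk_level": "Medium"}),  # valid JSON, one wrong field
    "risk_level: High",                                 # not JSON
    json.dumps(reference),                              # exact match
]
references = [reference] * 4
valid_rate, exact_rate = score(predictions, references)
print(valid_rate, exact_rate)  # 0.75 0.5
```

If the reported accuracy was instead averaged per field, the comparison `parsed == ref` would need to be replaced by a field-by-field count.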

## Usage

```python
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForImageTextToText

model_id = "TurkishCodeMan/LFM2.5-VL-450M-Pre-Disaster-Detection"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto", dtype="bfloat16")

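# merge_images is not shipped with the model; this is a minimal sketch that
# places the RGB image on the left and the CIR image on the right, as the prompt expects.
def merge_images(rgb, cir):
    cir = cir.resize(rgb.size)  # assumption: both tiles should share one size
    merged = Image.new("RGB", (rgb.width * 2, rgb.height))
    merged.paste(rgb, (0, 0))
    merged.paste(cir, (rgb.width, 0))
    return merged

# Placeholder paths: point these at your own Sentinel-2 RGB and CIR tiles
rgb = Image.open("rgb.png").convert("RGB")
cir = Image.open("cir.png").convert("RGB")
merged_image = merge_images(rgb, cir)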

SYSTEM_PROMPT = """Extract the following from the image:

risk_level: The fire risk level of the forest based on the right side CIR image, select from Low, Medium, or High
dry_vegetation_present: Is dry vegetation present (which appears as greyish/brownish/non-red in the right side CIR image)?, select from true, or false
steep_terrain: Is the terrain steep based on the images?, select from true, or false
water_body_present: Is there a water body (lake, river) present in the left RGB image?, select from true, or false
image_quality_limited: Is the image quality limited or very cloudy?, select from true, or false

Respond with only a JSON object. Do not include any text outside the JSON."""

conversation = [
    {"role": "system", "content": [{"type": "text", "text": SYSTEM_PROMPT}]},
    {"role": "user", "content": [{"type": "image", "image": merged_image}]}
]

inputs = processor.apply_chat_template(conversation, return_tensors="pt", return_dict=True, tokenize=True, add_generation_prompt=True)
inputs = {k: v.to(model.device) for k, v in inputs.items()}

with torch.no_grad():
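    # do_sample=True makes the temperature setting take effect; use do_sample=False for greedy decoding
    output_ids = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.1)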

generated_ids = output_ids[0][inputs["input_ids"].shape[1]:]
generated_text = processor.decode(generated_ids, skip_special_tokens=True).strip()
print(generated_text)
```
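
Because the model is prompted to return only a JSON object, the decoded text can be parsed and checked before it is used downstream. The sketch below assumes the JSON keys match the field names in the prompt and that the yes/no fields are emitted as JSON `true`/`false`; `parse_prediction` and the sample string are hypothetical.

```python
import json

RISK_LEVELS = ("Low", "Medium", "High")
BOOLEAN_FIELDS = (
    "dry_vegetation_present",
    "steep_terrain",
    "water_body_present",
    "image_quality_limited",
)


def parse_prediction(text):
    """Parse model output; raise ValueError if a field is missing or invalid."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        raise ValueError(f"not valid JSON: {err}") from err
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if data.get("risk_level") not in RISK_LEVELS:
        raise ValueError(f"unexpected risk_level: {data.get('risk_level')!r}")
    for field in BOOLEAN_FIELDS:
        if not isinstance(data.get(field), bool):
            raise ValueError(f"{field} must be true or false")
    return data


# Hypothetical output, standing in for generated_text from the snippet above
sample = (
    '{"risk_level": "High", "dry_vegetation_present": true, '
    '"steep_terrain": false, "water_body_present": false, '
    '"image_quality_limited": false}'
)
result = parse_prediction(sample)
print(result["risk_level"])  # High
```

Raising on the first invalid field keeps malformed predictions out of any later risk aggregation; a more lenient pipeline might log and skip them instead.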