| --- |
| language: en |
| license: apache-2.0 |
| tags: |
| - vision-language-model |
| - image-text-to-text |
| - pytorch |
| - disaster-detection |
| datasets: |
| - TurkishCodeMan/Sentinel2-Turkey-VLM |
| base_model: LiquidAI/LFM2.5-VL-450M |
| --- |
| # LFM2.5-VL-450M Pre-Disaster Detection |
|
|
| This is a fine-tuned version of `LiquidAI/LFM2.5-VL-450M` on the `TurkishCodeMan/Sentinel2-Turkey-VLM` dataset. |
|
|
| ## Model Details |
| - **Base Model:** [LiquidAI/LFM2.5-VL-450M](https://huggingface.co/LiquidAI/LFM2.5-VL-450M) |
| - **Dataset:** [TurkishCodeMan/Sentinel2-Turkey-VLM](https://huggingface.co/datasets/TurkishCodeMan/Sentinel2-Turkey-VLM) |
| - **Task:** Vision-Language Pre-Disaster Feature Extraction from Sentinel-2 RGB and CIR imagery. |
|
|
| ## Training Strategy |
| - **Fine-Tuning Method:** Full Fine-Tuning (Full FT) |
| - **Epochs:** 5 |
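| - **Learning Rate:** 2e-5 (cosine schedule) |
| - **Batch Size:** 2 per step with 4 gradient accumulation steps (effective batch size 8 on the single GPU) |
| - **Loss Masking:** Loss is computed on assistant tokens only; prompt and padding positions are set to -100 in the labels so the loss ignores them. |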
| - **Hardware:** 1x GPU (bfloat16) |
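The loss masking listed above can be illustrated with a minimal sketch. This is plain Python for clarity, not the training code used for this model; `build_labels`, `prompt_len`, and the token IDs are hypothetical. The value -100 is the default `ignore_index` of PyTorch's cross-entropy loss, so positions labelled -100 do not contribute to the loss.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def build_labels(input_ids, attention_mask, prompt_len):
    """Copy input_ids into labels, masking prompt and padding positions."""
    labels = []
    for position, token_id in enumerate(input_ids):
        is_prompt = position < prompt_len
        is_padding = attention_mask[position] == 0
        labels.append(IGNORE_INDEX if is_prompt or is_padding else token_id)
    return labels


# Hypothetical sequence: 3 prompt tokens, 3 assistant tokens, 2 padding tokens.
input_ids = [11, 12, 13, 21, 22, 23, 0, 0]
attention_mask = [1, 1, 1, 1, 1, 1, 0, 0]
labels = build_labels(input_ids, attention_mask, prompt_len=3)
print(labels)  # [-100, -100, -100, 21, 22, 23, -100, -100]
```

Only the three assistant tokens keep their IDs, so only they are scored during training.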
|
|
| The model was trained to output a JSON object covering 5 core disaster risk features (Risk Level, Dry Vegetation, Steep Terrain, Water Body, Image Quality). On the test set it reached 84% overall exact-match accuracy, and 100% of its outputs were valid JSON. |
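The evaluation script is not given here, so the following is only one plausible way to compute numbers of this kind. It assumes a prediction counts as valid when it parses as a JSON object, and as an exact match only when the parsed object equals the reference on every field. The function `score` and the sample data are hypothetical.

```python
import json


def score(predictions, references):
    """Return (valid_json_rate, exact_match_rate) for raw model outputs."""
    total = len(references)
    if total == 0:
        return 0.0, 0.0
    valid = 0
    exact = 0
    for raw, reference in zip(predictions, references):
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            continue  # unparseable output is neither valid nor a match
        if not isinstance(parsed, dict):
            continue
        valid += 1
        if parsed == reference:
            exact += 1
    return valid / total, exact / total


# Hypothetical reference and two predictions, the second with a wrong risk level.
reference = {
    "risk_level": "High",
    "dry_vegetation_present": True,
    "steep_terrain": False,
    "water_body_present": False,
    "image_quality_limited": False,
}
predictions = [json.dumps(reference), json.dumps({**reference, "risk_level": "Low"})]
valid_rate, exact_rate = score(predictions, [reference, reference])
print(valid_rate, exact_rate)  # 1.0 0.5
```

Under a different definition, for example per-field accuracy averaged over the five fields, the same outputs would give a different figure.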
|
|
| ## Usage |
|
|
| ```python |
| import torch |
| from PIL import Image |
| from transformers import AutoProcessor, AutoModelForImageTextToText |
| |
| model_id = "TurkishCodeMan/LFM2.5-VL-450M-Pre-Disaster-Detection" |
| processor = AutoProcessor.from_pretrained(model_id) |
| model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto", dtype="bfloat16") |
| |
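| # Load your RGB and CIR images, then merge them horizontally (RGB on the left, |
| # CIR on the right, as the prompt below expects). The file names are placeholders, |
| # and this plain side-by-side paste is an assumed layout: match the merging used in training. |
| rgb = Image.open("rgb.png").convert("RGB") |
| cir = Image.open("cir.png").convert("RGB") |
| merged_image = Image.new("RGB", (rgb.width + cir.width, max(rgb.height, cir.height))) |
| merged_image.paste(rgb, (0, 0)) |
| merged_image.paste(cir, (rgb.width, 0)) |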
| |
| SYSTEM_PROMPT = """Extract the following from the image: |
| |
| risk_level: The fire risk level of the forest based on the right side CIR image, select from Low, Medium, or High |
| dry_vegetation_present: Is dry vegetation present (which appears as greyish/brownish/non-red in the right side CIR image)?, select from true, or false |
| steep_terrain: Is the terrain steep based on the images?, select from true, or false |
| water_body_present: Is there a water body (lake, river) present in the left RGB image?, select from true, or false |
| image_quality_limited: Is the image quality limited or very cloudy?, select from true, or false |
| |
| Respond with only a JSON object. Do not include any text outside the JSON.""" |
| |
| conversation = [ |
| {"role": "system", "content": [{"type": "text", "text": SYSTEM_PROMPT}]}, |
| {"role": "user", "content": [{"type": "image", "image": merged_image}]} |
| ] |
| |
| inputs = processor.apply_chat_template(conversation, return_tensors="pt", return_dict=True, tokenize=True, add_generation_prompt=True) |
| inputs = {k: v.to(model.device) for k, v in inputs.items()} |
| |
| with torch.no_grad(): |
|     output_ids = model.generate(**inputs, max_new_tokens=150, temperature=0.1) |
| |
| generated_ids = output_ids[0][inputs["input_ids"].shape[1]:] |
| generated_text = processor.decode(generated_ids, skip_special_tokens=True).strip() |
| print(generated_text) |
| ``` |
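Because the model is prompted to return only a JSON object, the generated text can be parsed and checked before it is used downstream. The sketch below assumes the output keys match the field names in the prompt above; whether booleans arrive as JSON `true`/`false` or as strings is not stated, so both are accepted. `parse_prediction`, `to_bool`, and the sample string are hypothetical.

```python
import json

RISK_LEVELS = ("Low", "Medium", "High")
BOOLEAN_FIELDS = (
    "dry_vegetation_present",
    "steep_terrain",
    "water_body_present",
    "image_quality_limited",
)


def to_bool(value):
    """Accept JSON booleans or the strings 'true' / 'false'."""
    if isinstance(value, bool):
        return value
    if isinstance(value, str) and value.lower() in ("true", "false"):
        return value.lower() == "true"
    raise ValueError(f"expected a boolean, got {value!r}")


def parse_prediction(generated_text):
    """Parse the model output and check the five expected fields."""
    data = json.loads(generated_text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    risk_level = data.get("risk_level")
    if risk_level not in RISK_LEVELS:
        raise ValueError(f"unexpected risk_level: {risk_level!r}")
    result = {"risk_level": risk_level}
    for field in BOOLEAN_FIELDS:
        if field not in data:
            raise ValueError(f"missing field: {field}")
        result[field] = to_bool(data[field])
    return result


# Hypothetical output string, standing in for `generated_text`.
sample = (
    '{"risk_level": "High", "dry_vegetation_present": true, '
    '"steep_terrain": false, "water_body_present": false, '
    '"image_quality_limited": false}'
)
parsed = parse_prediction(sample)
print(parsed["risk_level"], parsed["dry_vegetation_present"])  # High True
```

`json.loads` raises `json.JSONDecodeError` (a subclass of `ValueError`) on malformed text, so a single `except ValueError` around `parse_prediction` covers both parsing and validation failures.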
|
|