---
language: en
license: apache-2.0
tags:
- vision-language-model
- image-text-to-text
- pytorch
- disaster-detection
datasets:
- TurkishCodeMan/Sentinel2-Turkey-VLM
base_model: LiquidAI/LFM2.5-VL-450M
---

# LFM2.5-VL-450M Pre-Disaster Detection

This is a fine-tuned version of `LiquidAI/LFM2.5-VL-450M` on the `TurkishCodeMan/Sentinel2-Turkey-VLM` dataset.

## Model Details

- **Base Model:** [LiquidAI/LFM2.5-VL-450M](https://huggingface.co/LiquidAI/LFM2.5-VL-450M)
- **Dataset:** [TurkishCodeMan/Sentinel2-Turkey-VLM](https://huggingface.co/datasets/TurkishCodeMan/Sentinel2-Turkey-VLM)
- **Task:** Vision-Language Pre-Disaster Feature Extraction from Sentinel-2 RGB and CIR imagery.

## Training Strategy

- **Fine-Tuning Method:** Full Fine-Tuning (Full FT)
- **Epochs:** 5
- **Learning Rate:** 2e-5 (Cosine Scheduler)
- **Batch Size:** 2 (Gradient Accumulation: 4)
- **Loss Masking:** Assistant loss masking enabled (prompt and padding tokens masked with -100).
- **Hardware:** 1x GPU (bfloat16)

The model was trained to output precisely formatted JSON matching 5 core disaster risk features (Risk Level, Dry Vegetation, Steep Terrain, Water Body, Image Quality). It achieved 84% exact-match overall accuracy on the test set with 100% valid JSON formatting.

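The assistant loss masking listed under Training Strategy can be pictured with a small sketch. This is not the training code used for this model. It is a minimal illustration that assumes the prompt tokens come first in the sequence, padding is marked by zeros in the attention mask, and labels are a copy of the input IDs. The helper `build_labels` and the toy token IDs are hypothetical. The value -100 is the default `ignore_index` of PyTorch's cross-entropy loss, so masked positions contribute nothing to the loss.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def build_labels(input_ids, attention_mask, prompt_length):
    """Copy input_ids into labels, masking prompt and padding positions.

    Hypothetical helper: assumes the prompt occupies the first
    `prompt_length` positions and padding has attention_mask == 0.
    """
    labels = []
    for position, (token_id, attended) in enumerate(zip(input_ids, attention_mask)):
        is_prompt = position < prompt_length
        is_padding = attended == 0
        labels.append(IGNORE_INDEX if is_prompt or is_padding else token_id)
    return labels


# Toy example: 3 prompt tokens, 2 assistant tokens, 2 padding tokens.
input_ids = [11, 12, 13, 21, 22, 0, 0]
attention_mask = [1, 1, 1, 1, 1, 0, 0]
labels = build_labels(input_ids, attention_mask, prompt_length=3)
print(labels)  # [-100, -100, -100, 21, 22, -100, -100]
```

Only the two assistant tokens keep their IDs, so the loss is computed on the JSON answer alone.
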
## Usage

```python
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForImageTextToText

model_id = "TurkishCodeMan/LFM2.5-VL-450M-Pre-Disaster-Detection"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto", dtype="bfloat16")


def merge_images(rgb, cir):
    """Paste the RGB image on the left and the CIR image on the right.

    Assumed layout, inferred from the prompt below; match it to your own preprocessing.
    """
    canvas = Image.new("RGB", (rgb.width + cir.width, max(rgb.height, cir.height)))
    canvas.paste(rgb, (0, 0))
    canvas.paste(cir, (rgb.width, 0))
    return canvas


# Load your RGB and CIR images, then merge them horizontally.
# The file names are placeholders.
rgb = Image.open("rgb.png").convert("RGB")
cir = Image.open("cir.png").convert("RGB")
merged_image = merge_images(rgb, cir)

SYSTEM_PROMPT = """Extract the following from the image:
risk_level: The fire risk level of the forest based on the right side CIR image, select from Low, Medium, or High
dry_vegetation_present: Is dry vegetation present (which appears as greyish/brownish/non-red in the right side CIR image)?, select from true, or false
steep_terrain: Is the terrain steep based on the images?, select from true, or false
water_body_present: Is there a water body (lake, river) present in the left RGB image?, select from true, or false
image_quality_limited: Is the image quality limited or very cloudy?, select from true, or false
Respond with only a JSON object. Do not include any text outside the JSON."""

conversation = [
    {"role": "system", "content": [{"type": "text", "text": SYSTEM_PROMPT}]},
    {"role": "user", "content": [{"type": "image", "image": merged_image}]}
]

inputs = processor.apply_chat_template(
    conversation,
    return_tensors="pt",
    return_dict=True,
    tokenize=True,
    add_generation_prompt=True,
)
inputs = {k: v.to(model.device) for k, v in inputs.items()}

with torch.no_grad():
    # temperature only takes effect when sampling is enabled
    output_ids = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.1)

# Drop the prompt tokens so that only the generated answer is decoded
generated_ids = output_ids[0][inputs["input_ids"].shape[1]:]
generated_text = processor.decode(generated_ids, skip_special_tokens=True).strip()
print(generated_text)
```
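
Because the model is prompted to reply with a JSON object only, the decoded text can be parsed and checked before it is used downstream. The sketch below is one possible way to do that, not part of the released code. The helper `parse_prediction` is hypothetical, the field names are taken from the prompt above, and accepting both JSON booleans and the strings "true"/"false" is an assumption about the output format.

```python
import json

RISK_LEVELS = ("Low", "Medium", "High")
BOOLEAN_FIELDS = (
    "dry_vegetation_present",
    "steep_terrain",
    "water_body_present",
    "image_quality_limited",
)


def parse_prediction(text):
    """Parse the model's reply into a validated dict; raise ValueError otherwise.

    Hypothetical helper: the accepted value formats are assumptions.
    """
    # Tolerate stray text around the object: slice from the first "{" to the last "}".
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    try:
        data = json.loads(text[start:end + 1])
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON: {exc}") from exc

    risk = data.get("risk_level")
    if risk not in RISK_LEVELS:
        raise ValueError(f"unexpected risk_level: {risk!r}")

    result = {"risk_level": risk}
    for field in BOOLEAN_FIELDS:
        value = data.get(field)
        # Assumption: booleans may arrive as JSON true/false or as strings.
        if isinstance(value, str) and value.lower() in ("true", "false"):
            value = value.lower() == "true"
        if not isinstance(value, bool):
            raise ValueError(f"unexpected value for {field}: {value!r}")
        result[field] = value
    return result
```

With the usage snippet above, `parse_prediction(generated_text)` would return a plain dict with the five fields, or raise `ValueError` if the reply is malformed.
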