	---
	language: en
	license: apache-2.0
	tags:
	- vision-language-model
	- image-text-to-text
	- pytorch
	- disaster-detection
	datasets:
	- TurkishCodeMan/Sentinel2-Turkey-VLM
	base_model: LiquidAI/LFM2.5-VL-450M
	---
	# LFM2.5-VL-450M Pre-Disaster Detection

	This model is a fine-tuned version of `LiquidAI/LFM2.5-VL-450M`, trained on the `TurkishCodeMan/Sentinel2-Turkey-VLM` dataset.

	## Model Details
	- Base Model: [LiquidAI/LFM2.5-VL-450M](https://huggingface.co/LiquidAI/LFM2.5-VL-450M)
	- Dataset: [TurkishCodeMan/Sentinel2-Turkey-VLM](https://huggingface.co/datasets/TurkishCodeMan/Sentinel2-Turkey-VLM)
	- Task: Vision-language extraction of pre-disaster features from paired Sentinel-2 RGB and color-infrared (CIR) imagery.

	## Training Strategy
	- Fine-Tuning Method: Full Fine-Tuning (Full FT)
	- Epochs: 5
	- Learning Rate: 2e-5 (Cosine Scheduler)
	- Batch Size: 2 (Gradient Accumulation: 4, effective batch size 8)
	- Loss Masking: Assistant loss masking enabled (prompt and padding tokens masked with -100).
	- Hardware: 1x GPU (bfloat16)
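
The loss masking listed above can be illustrated with a minimal sketch. It assumes a per-token `assistant_mask` (1 for assistant tokens) and an `attention_mask` (0 for padding) are available; the function name and the toy token IDs are hypothetical and are not taken from the training code.

```python
# Label value that PyTorch's CrossEntropyLoss ignores by default.
IGNORE_INDEX = -100

def build_labels(input_ids, assistant_mask, attention_mask):
    """Copy input_ids into labels, keeping only assistant tokens as targets."""
    labels = []
    for token_id, is_assistant, is_real in zip(input_ids, assistant_mask, attention_mask):
        if is_assistant and is_real:
            labels.append(token_id)      # contributes to the loss
        else:
            labels.append(IGNORE_INDEX)  # prompt or padding: masked out
    return labels

# Toy sequence (made-up IDs): 3 prompt tokens, 2 assistant tokens, 1 padding token.
input_ids = [11, 12, 13, 21, 22, 0]
assistant_mask = [0, 0, 0, 1, 1, 0]
attention_mask = [1, 1, 1, 1, 1, 0]

labels = build_labels(input_ids, assistant_mask, attention_mask)
print(labels)  # [-100, -100, -100, 21, 22, -100]
```

With labels built this way, only the JSON answer tokens drive the gradient, so the model is not trained to reproduce the prompt.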

	The model was trained to output a JSON object with five disaster risk features: risk level, dry vegetation, steep terrain, water body, and image quality. On the test set it reached 84% overall exact-match accuracy, and 100% of its outputs were valid JSON.
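
As a rough illustration of how these two numbers could be computed, the sketch below scores raw model outputs against reference objects. It assumes "exact match" means all five fields are equal and that the boolean fields are emitted as JSON `true`/`false`; the evaluation script is not part of this card, so `score_outputs` and the sample data are hypothetical.

```python
import json

def score_outputs(outputs, references):
    """Return (valid_json_rate, exact_match_accuracy) for raw model outputs."""
    valid = 0
    exact = 0
    for text, reference in zip(outputs, references):
        try:
            prediction = json.loads(text)
        except json.JSONDecodeError:
            continue  # malformed output is neither valid nor a match
        valid += 1
        if prediction == reference:  # every field must agree
            exact += 1
    total = len(references)
    return valid / total, exact / total

# Made-up examples using the field names from the prompt in the Usage section.
references = [
    {"risk_level": "High", "dry_vegetation_present": True, "steep_terrain": False,
     "water_body_present": False, "image_quality_limited": False},
    {"risk_level": "Low", "dry_vegetation_present": False, "steep_terrain": False,
     "water_body_present": True, "image_quality_limited": False},
]
outputs = [
    '{"risk_level": "High", "dry_vegetation_present": true, "steep_terrain": false, '
    '"water_body_present": false, "image_quality_limited": false}',
    '{"risk_level": "Medium", "dry_vegetation_present": false, "steep_terrain": false, '
    '"water_body_present": true, "image_quality_limited": false}',
]

valid_rate, accuracy = score_outputs(outputs, references)
print(valid_rate, accuracy)  # 1.0 0.5
```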

	## Usage

	```python
	import torch
	from PIL import Image
	from transformers import AutoProcessor, AutoModelForImageTextToText

	model_id = "TurkishCodeMan/LFM2.5-VL-450M-Pre-Disaster-Detection"
	processor = AutoProcessor.from_pretrained(model_id)
	model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto", dtype="bfloat16")

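	# Load the RGB and CIR images and merge them side by side (RGB left, CIR right),
	# the layout the prompt below refers to. This helper is an assumed implementation
	# and the file paths are placeholders.
	def merge_images(rgb, cir):
	    merged = Image.new("RGB", (rgb.width + cir.width, max(rgb.height, cir.height)))
	    merged.paste(rgb, (0, 0))
	    merged.paste(cir, (rgb.width, 0))
	    return merged

	rgb = Image.open("rgb.png").convert("RGB")  # placeholder path
	cir = Image.open("cir.png").convert("RGB")  # placeholder path
	merged_image = merge_images(rgb, cir)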

	SYSTEM_PROMPT = """Extract the following from the image:

	risk_level: The fire risk level of the forest based on the right side CIR image, select from Low, Medium, or High
	dry_vegetation_present: Is dry vegetation present (which appears as greyish/brownish/non-red in the right side CIR image)?, select from true, or false
	steep_terrain: Is the terrain steep based on the images?, select from true, or false
	water_body_present: Is there a water body (lake, river) present in the left RGB image?, select from true, or false
	image_quality_limited: Is the image quality limited or very cloudy?, select from true, or false

	Respond with only a JSON object. Do not include any text outside the JSON."""

	conversation = [
	    {"role": "system", "content": [{"type": "text", "text": SYSTEM_PROMPT}]},
	    {"role": "user", "content": [{"type": "image", "image": merged_image}]},
	]

	inputs = processor.apply_chat_template(conversation, return_tensors="pt", return_dict=True, tokenize=True, add_generation_prompt=True)
	inputs = {k: v.to(model.device) for k, v in inputs.items()}

	with torch.no_grad():
	    # Note: temperature only takes effect when sampling is enabled (do_sample=True).
	    output_ids = model.generate(**inputs, max_new_tokens=150, temperature=0.1)

	generated_ids = output_ids[0][inputs["input_ids"].shape[1]:]
	generated_text = processor.decode(generated_ids, skip_special_tokens=True).strip()
	print(generated_text)
	```
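
Downstream code will usually want a checked dictionary rather than the raw string. The following is a minimal sketch of such a check, assuming the schema implied by the prompt above (one of `Low`/`Medium`/`High` plus four JSON booleans); `parse_risk_json` is a hypothetical helper, not something shipped with the model.

```python
import json

ALLOWED_RISK_LEVELS = {"Low", "Medium", "High"}
BOOLEAN_FIELDS = (
    "dry_vegetation_present",
    "steep_terrain",
    "water_body_present",
    "image_quality_limited",
)

def parse_risk_json(text):
    """Parse model output and check it against the schema implied by the prompt."""
    data = json.loads(text)  # raises json.JSONDecodeError on malformed output
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    risk = data.get("risk_level")
    if not isinstance(risk, str) or risk not in ALLOWED_RISK_LEVELS:
        raise ValueError(f"unexpected risk_level: {risk!r}")
    for field in BOOLEAN_FIELDS:
        if not isinstance(data.get(field), bool):
            raise ValueError(f"{field} should be true or false")
    return data

# Made-up output string in the assumed format.
sample = (
    '{"risk_level": "Medium", "dry_vegetation_present": true, "steep_terrain": false, '
    '"water_body_present": false, "image_quality_limited": false}'
)
result = parse_risk_json(sample)
print(result["risk_level"])  # Medium
```

If the model returns the booleans as strings instead, the type check would need to be relaxed accordingly.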