---
license: apache-2.0
datasets:
- Dauka-transformers/Compact_VLM_filter_data
language:
- en
base_model:
- Qwen/Qwen2-VL-2B-Instruct
---
# Qwen2VL Fine-Tuned for Filtration Tasks

This model is a fine-tuned version of [Qwen/Qwen2-VL-2B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct), trained on the [Dauka-transformers/Compact_VLM_filter_data](https://huggingface.co/datasets/Dauka-transformers/Compact_VLM_filter_data) dataset to perform filtration-oriented image-text evaluation.

## πŸ” Intended Use

The model is designed to:

- Evaluate the alignment between an image and its caption
- Produce scores with justifications for filtering noisy web-scale data
- Support local deployment for cost-efficient data filtering

## πŸ‹οΈ Training Details

- Base model: `Qwen/Qwen2-VL-2B-Instruct`
- Fine-tuning objective: in-context scoring + justification
- Dataset: ~4.8K samples with score, justification, text, and image

## πŸ“ Files

- `model.safetensors` – fine-tuned weights
- `processor` – image and text processor
- `README.md` – this card

## 🀝 Acknowledgements

Thanks to the [Qwen team](https://huggingface.co/Qwen) for open-sourcing their VLM models, which serve as the foundation for our filtration-oriented model.


## πŸ“œ License

Licensed under the Apache License 2.0.