lmms-lab/LLaVA-OneVision-Data
| Model Name | Base Model | Description |
|---|---|---|
| llava-alternating-attn-within-modality-qwen2-0.5b-ov | lmms-lab/llava-onevision-qwen2-0.5b-ov | Alternating attention architecture: in alternating transformer layers, attention is restricted to tokens of the same modality. |
| llava-alternating-attn-cross-modality-qwen2-0.5b-ov | lmms-lab/llava-onevision-qwen2-0.5b-ov | Alternating attention architecture: in alternating transformer layers, each token's attention is restricted to itself and to tokens of the other modalities. |
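
The two masking schemes described in the table above can be illustrated with a small NumPy sketch. This is not the released models' implementation: the function names, the 0/1 modality encoding, the retained causal constraint, and the rule that odd-numbered layers are the restricted ones are all assumptions made for illustration.

```python
import numpy as np

def build_masks(modality_ids):
    """Return (within, cross) boolean attention masks; True = may attend.

    modality_ids: 1-D sequence of integer modality labels, one per token
    (hypothetical encoding, e.g. 0 = image token, 1 = text token).
    """
    ids = np.asarray(modality_ids)
    n = ids.shape[0]
    causal = np.tril(np.ones((n, n), dtype=bool))  # assumed: causal masking is kept
    same = ids[:, None] == ids[None, :]            # True where query and key share a modality
    eye = np.eye(n, dtype=bool)                    # each token attending to itself
    within = causal & same                         # same-modality tokens only
    cross = causal & (~same | eye)                 # self plus other-modality tokens
    return within, cross

def mask_for_layer(layer_idx, modality_ids, variant="within"):
    """Pick the mask for one layer.

    Assumption: odd layers use the restricted mask and even layers use
    ordinary causal attention. The source only says "alternating layers".
    """
    within, cross = build_masks(modality_ids)
    if layer_idx % 2 == 0:
        n = len(modality_ids)
        return np.tril(np.ones((n, n), dtype=bool))
    return within if variant == "within" else cross

# Two image tokens followed by two text tokens.
within, cross = build_masks([0, 0, 1, 1])
print(within.astype(int))
print(cross.astype(int))
```

In the within-modality mask the second text token can see the first text token but no image tokens; in the cross-modality mask it can see both image tokens and itself, but not the first text token.
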

| Model Name | Base Model | Description |
|---|---|---|
| llava-alternating-attn-within-modality-qwen2-0.5b-ov-instructiontuned-visualcorres | llava-alternating-attn-within-modality-qwen2-0.5b-ov | Instruction-tuned variant of the within-modality alternating attention model, fine-tuned specifically for visual correspondence tasks (matching corresponding regions across images). |
| llava-alternating-attn-cross-modality-qwen2-0.5b-ov-instructiontuned-visualcorres | llava-alternating-attn-cross-modality-qwen2-0.5b-ov | Instruction-tuned variant of the cross-modality alternating attention model, fine-tuned specifically for visual correspondence tasks (matching corresponding regions across images). |
| llava-onevision-qwen2-0.5b-ov-instructiontuned-visualcorres | lmms-lab/llava-onevision-qwen2-0.5b-ov | Instruction-tuned baseline model (standard LLaVA-OneVision architecture) fine-tuned for visual correspondence tasks, provided for comparison with alternating attention variants. |

Base model: lmms-lab/llava-onevision-qwen2-0.5b-ov