---
license: cc-by-nc-4.0
task_categories:
- visual-question-answering
- image-to-text
language:
- en
tags:
- vision-language
- multimodal
- visual-question-answering
- measurement-grounding
- raw-image
- camera-raw
- meas-xyz
- low-light
- hdr
- benchmark
- prsimvl
pretty_name: kepeng/PRSIMVL-LoRA-V1
---

# Released PRSIMVL Weights

This folder stores the released PRSIMVL LoRA adapter layout used by inference and evaluation.

## Expected Checkpoints

| Size | Base Model | Local LoRA Checkpoint |
|---|---|---|
| 2B | `Qwen/Qwen3-VL-2B-Instruct` | `BANALCED_150K_META_VIT_PROXY/output-Qwen3-VL-2B-Instruct/v8-20260421-133546/checkpoint-95000` |
| 4B | `Qwen/Qwen3-VL-4B-Instruct` | `BANALCED_150K_META_VIT_PROXY/output-Qwen3-VL-4B-Instruct/v12-20260425-113029/checkpoint-85000` |
| 8B | `Qwen/Qwen3-VL-8B-Instruct` | `BANALCED_150K_META_VIT_PROXY/output-Qwen3-VL-8B-Instruct/v2-20260423-205317/checkpoint-95000` |

```
@misc{xu2026allegory,
  title         = {Allegory of the Cave: Measurement-Grounded Vision-Language Learning},
  author        = {Xu, Kepeng and Xu, Li and He, Gang and Yu, Wenxin},
  year          = {2026},
  eprint        = {2605.11727},
  archivePrefix = {arXiv},
  url           = {https://arxiv.org/abs/2605.11727}
}
```
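
As a convenience, the checkpoint table in "Expected Checkpoints" can be mirrored in a small lookup so that scripts resolve the base model and adapter directory from a size label. This is only a minimal sketch: the names `CHECKPOINTS` and `resolve_checkpoint` and the `root` parameter are hypothetical and not part of the released code, and it assumes the checkpoint paths are relative to the folder that holds this README.

```python
from pathlib import Path

# Size label -> (base model ID, relative LoRA checkpoint path).
# Values are copied verbatim from the "Expected Checkpoints" table.
CHECKPOINTS = {
    "2B": (
        "Qwen/Qwen3-VL-2B-Instruct",
        "BANALCED_150K_META_VIT_PROXY/output-Qwen3-VL-2B-Instruct/"
        "v8-20260421-133546/checkpoint-95000",
    ),
    "4B": (
        "Qwen/Qwen3-VL-4B-Instruct",
        "BANALCED_150K_META_VIT_PROXY/output-Qwen3-VL-4B-Instruct/"
        "v12-20260425-113029/checkpoint-85000",
    ),
    "8B": (
        "Qwen/Qwen3-VL-8B-Instruct",
        "BANALCED_150K_META_VIT_PROXY/output-Qwen3-VL-8B-Instruct/"
        "v2-20260423-205317/checkpoint-95000",
    ),
}


def resolve_checkpoint(size, root="."):
    """Return (base_model_id, checkpoint_path) for a size label such as "4B".

    `root` is an assumed location of this weights folder; adjust as needed.
    """
    try:
        base_model, rel_path = CHECKPOINTS[size.upper()]
    except KeyError:
        known = ", ".join(sorted(CHECKPOINTS))
        raise ValueError(f"Unknown size {size!r}; expected one of: {known}") from None
    return base_model, Path(root) / rel_path


if __name__ == "__main__":
    # Report which of the expected checkpoint directories exist locally.
    for size in CHECKPOINTS:
        base, path = resolve_checkpoint(size)
        status = "found" if path.is_dir() else "missing"
        print(f"{size}: {base} -> {path} [{status}]")
```

The returned path could then be handed to whichever adapter loader the inference code uses, for example `peft.PeftModel.from_pretrained(base_model, path)` if the checkpoints are stored in PEFT format, which is an assumption here.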