---
license: cc-by-nc-4.0
task_categories:
- visual-question-answering
- image-to-text
language:
- en
tags:
- vision-language
- multimodal
- visual-question-answering
- measurement-grounding
- raw-image
- camera-raw
- meas-xyz
- low-light
- hdr
- benchmark
- prsimvl
pretty_name: kepeng/PRSIMVL-LoRA-V1
---

# Released PRSIMVL Weights

This folder contains the released PRSIMVL LoRA adapters, arranged in the directory layout that inference and evaluation expect.

## Expected Checkpoints

| Size | Base Model | Local LoRA Checkpoint |
|---|---|---|
| 2B | `Qwen/Qwen3-VL-2B-Instruct` | `BANALCED_150K_META_VIT_PROXY/output-Qwen3-VL-2B-Instruct/v8-20260421-133546/checkpoint-95000` |
| 4B | `Qwen/Qwen3-VL-4B-Instruct` | `BANALCED_150K_META_VIT_PROXY/output-Qwen3-VL-4B-Instruct/v12-20260425-113029/checkpoint-85000` |
| 8B | `Qwen/Qwen3-VL-8B-Instruct` | `BANALCED_150K_META_VIT_PROXY/output-Qwen3-VL-8B-Instruct/v2-20260423-205317/checkpoint-95000` |
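
The table above can be expressed as a small lookup that pairs each size with its base model and adapter directory. The following is a minimal sketch, not the project's own loader: the names `CHECKPOINTS`, `resolve_checkpoint`, `load_adapter`, and the `weights_root` parameter are hypothetical, and it assumes the checkpoint directories are in PEFT format (an `adapter_config.json` next to the adapter weights) and that the installed `transformers` release supports Qwen3-VL.

```python
from pathlib import Path

# Base model IDs and relative checkpoint paths copied from the table above.
CHECKPOINTS = {
    "2B": (
        "Qwen/Qwen3-VL-2B-Instruct",
        "BANALCED_150K_META_VIT_PROXY/output-Qwen3-VL-2B-Instruct/v8-20260421-133546/checkpoint-95000",
    ),
    "4B": (
        "Qwen/Qwen3-VL-4B-Instruct",
        "BANALCED_150K_META_VIT_PROXY/output-Qwen3-VL-4B-Instruct/v12-20260425-113029/checkpoint-85000",
    ),
    "8B": (
        "Qwen/Qwen3-VL-8B-Instruct",
        "BANALCED_150K_META_VIT_PROXY/output-Qwen3-VL-8B-Instruct/v2-20260423-205317/checkpoint-95000",
    ),
}


def resolve_checkpoint(size, weights_root="."):
    """Return (base_model_id, adapter_dir) for a size listed in the table."""
    try:
        base_model, rel_path = CHECKPOINTS[size]
    except KeyError:
        raise ValueError(
            f"unknown size {size!r}; expected one of {sorted(CHECKPOINTS)}"
        ) from None
    # weights_root is a hypothetical parameter: the folder holding this layout.
    return base_model, Path(weights_root) / rel_path


def load_adapter(size, weights_root="."):
    """Attach the LoRA adapter to its base model.

    Assumption: the checkpoint directory is a PEFT-format adapter.
    """
    # Imported lazily so path resolution works without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForImageTextToText, AutoProcessor

    base_model, adapter_dir = resolve_checkpoint(size, weights_root)
    if not adapter_dir.is_dir():
        raise FileNotFoundError(f"missing LoRA checkpoint: {adapter_dir}")

    processor = AutoProcessor.from_pretrained(base_model)
    model = AutoModelForImageTextToText.from_pretrained(base_model)
    model = PeftModel.from_pretrained(model, str(adapter_dir))
    return model.eval(), processor
```

Calling `resolve_checkpoint("4B", "weights")` only builds the path, so it can be used to check that the expected directories exist before any model is downloaded; `load_adapter("4B", "weights")` then fetches the base model and applies the adapter.
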

## Citation

```bibtex
@misc{xu2026allegory,
  title         = {Allegory of the Cave: Measurement-Grounded Vision-Language Learning},
  author        = {Xu, Kepeng and Xu, Li and He, Gang and Yu, Wenxin},
  year          = {2026},
  eprint        = {2605.11727},
  archivePrefix = {arXiv},
  url           = {https://arxiv.org/abs/2605.11727}
}
```