---
license: apache-2.0
base_model:
- Qwen/Qwen3-VL-8B-Thinking
tags:
- vlm
- chart-understanding
library_name: transformers
---

# BiPS — Bi-directional Perceptual Shaping for Multimodal Reasoning

This model card describes **BiPS (Bi-directional Perceptual Shaping)**, a **training-time** framework proposed in *“See Less, See Right: Bi-directional Perceptual Shaping For Multimodal Reasoning”* **[CVPR 2026]**.

- Paper: https://arxiv.org/abs/2512.22120
- Code: https://github.com/zss02/BiPS

## What is BiPS?

Many VLMs fail on multimodal reasoning because they **look at the wrong visual evidence** (especially for charts, thin lines, intersections, and small regions). BiPS improves **question-conditioned visual grounding** by turning “where-to-look” supervision into training signals, **without requiring extra tools at inference time**.

## Key idea

BiPS trains a VLM with two complementary view transformations:

- **Evidence-Preserving View**: keep only the visual evidence needed to answer, reducing distractions. → enforce **consistency** between predictions from the original image and the preserved view.
- **Evidence-Ablated View**: remove the key evidence so the image no longer supports the answer. → enforce **separation** so the model cannot rely on shortcuts.

These constraints are typically implemented with **KL-based objectives** and can be integrated into **GRPO** training.

## Why it matters

- Better **fine-grained evidence alignment**
- Less “guessing” from language priors
- **No additional inference overhead** (views are used only during training)

## How to use

BiPS is mainly a **training recipe**. To apply it:

1. Follow the official repo to set up dependencies and scripts.
2. Train your base VLM with BiPS-generated **preserve/ablate** views.
3. Use the resulting checkpoint as a standard VLM at inference time (no extra steps).
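The consistency and separation constraints above can be sketched as two KL terms over the model's answer distributions. The snippet below is a minimal illustration only, not the paper's implementation: the function name `bips_shaping_loss`, the hinge form of the separation term, and the weights `lambda_c`, `lambda_s`, `margin` are all assumptions for exposition.

```python
import torch
import torch.nn.functional as F


def bips_shaping_loss(logits_orig, logits_preserve, logits_ablate,
                      lambda_c=1.0, lambda_s=0.1, margin=1.0):
    """Illustrative sketch (not the official objective).

    Consistency: KL(p_orig || p_preserve) should be small, so the
    evidence-preserving view supports the same prediction.
    Separation:  KL(p_orig || p_ablate) should be large, so the
    evidence-ablated view no longer supports the answer.
    """
    p_orig = F.softmax(logits_orig, dim=-1)
    log_p_pres = F.log_softmax(logits_preserve, dim=-1)
    log_p_abl = F.log_softmax(logits_ablate, dim=-1)

    # Consistency term: F.kl_div(input, target) computes KL(target || input),
    # so this is KL(p_orig || p_preserve), minimized toward 0.
    kl_pres = F.kl_div(log_p_pres, p_orig, reduction="batchmean")

    # Separation term: a hinge (assumed form) that penalizes the model
    # whenever the KL to the ablated view falls below a margin.
    kl_abl = F.kl_div(log_p_abl, p_orig, reduction="batchmean")
    separation = F.relu(margin - kl_abl)

    return lambda_c * kl_pres + lambda_s * separation
```

In practice this shaping term would be added to the policy objective (e.g. alongside the GRPO loss) during training only; at inference the extra views and loss disappear, which is why BiPS adds no inference overhead.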
## Citation

```bibtex
@article{zhang2025bips,
  title={See Less, See Right: Bi-directional Perceptual Shaping For Multimodal Reasoning},
  author={Zhang, Shuoshuo and Zhang, Yizhen and Fu, Jingjing and Song, Lei and Bian, Jiang and Yang, Yujiu and Wang, Rui},
  journal={arXiv preprint arXiv:2512.22120},
  year={2025}
}
```