---
library_name: transformers
license: cc
datasets:
- array/SAT
- multimodal-reasoning-lab/Zebra-CoT
- Video-R1/Video-R1-data
base_model:
- Qwen/Qwen2.5-VL-7B-Instruct
---

- **Repository:** [https://github.com/arijitray1993/mull-tokens](https://github.com/arijitray1993/mull-tokens)
- **Paper:** [https://arxiv.org/abs/2512.10941](https://arxiv.org/abs/2512.10941)

## How to Get Started with the Model

WORK IN PROGRESS: more details to be added soon!

It is highly recommended to install the version of transformers shipped with https://github.com/arijitray1993/Mirage:

```
git clone https://github.com/arijitray1993/Mirage
cd Mirage
pip install -e ./transformers/.
```

Next, clone this repo: https://github.com/arijitray1993/mull-tokens.

We use a custom Qwen2.5-VL model. There is no change to the architecture; we only add some new tokens. You will also need the Qwen vision utilities:

```
pip install "qwen-vl-utils[decord]==0.0.8"
```

Then, from the root of the cloned mull-tokens repo, load the model and processor:

```
import importlib

from transformers import AutoProcessor

# Load the custom Qwen2.5-VL class from the mull-tokens repo
# (models/mmlatentdiscrete_qwen_vl.py must be on the import path,
# i.e. run this from the repo root).
Qwen2_5_VLForConditionalGeneration = importlib.import_module(
    "models.mmlatentdiscrete_qwen_vl"
).Qwen2_5_VLForConditionalGeneration

model = Qwen2_5_VLForConditionalGeneration.from_pretrained("array/Qwen2.5-VL-MullGRPO")
processor = AutoProcessor.from_pretrained(
    "array/Qwen2.5-VL-MullGRPO", trust_remote_code=True
)
```

Two short, hedged usage sketches (inspecting the added tokens, and a minimal inference example) follow the citation below.

## Citation

```
@misc{ray2025mulltokensmodalityagnosticlatentthinking,
      title={Mull-Tokens: Modality-Agnostic Latent Thinking},
      author={Arijit Ray and Ahmed Abdelkader and Chengzhi Mao and Bryan A. Plummer and Kate Saenko and Ranjay Krishna and Leonidas Guibas and Wen-Sheng Chu},
      year={2025},
      eprint={2512.10941},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2512.10941},
}
```
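
Since this checkpoint keeps the stock Qwen2.5-VL architecture and only adds new tokens, one quick sanity check is to compare its tokenizer against the base model's. This is a minimal sketch, assuming the extra tokens were registered as special tokens; the token names are not documented on this card, so nothing is asserted about what gets printed:

```
# Minimal sketch: diff this checkpoint's tokenizer against the base model's.
# Assumption: the mull tokens were added as special tokens; if they were added
# as plain tokens, compare vocab sizes instead of the special-token lists.
from transformers import AutoProcessor

base = AutoProcessor.from_pretrained("Qwen/Qwen2.5-VL-7B-Instruct").tokenizer
mull = processor.tokenizer  # `processor` loaded as shown above

print("Base vocab size:", len(base))
print("Mull vocab size:", len(mull))
print("Extra special tokens:",
      set(mull.additional_special_tokens) - set(base.additional_special_tokens))
```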
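
For completeness, here is a minimal inference sketch. It simply follows the standard Qwen2.5-VL chat-template flow from the upstream model card; the image URL, prompt, and generation settings are placeholders, not part of the mull-tokens documentation, and may need adjusting for this checkpoint:

```
# Minimal inference sketch using the standard Qwen2.5-VL chat-template flow.
# Assumes `model` and `processor` were loaded as shown above.
from qwen_vl_utils import process_vision_info

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg"},
            {"type": "text", "text": "Describe this image."},
        ],
    }
]

# Render the chat template and collect the vision inputs.
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(
    text=[text],
    images=image_inputs,
    videos=video_inputs,
    padding=True,
    return_tensors="pt",
).to(model.device)

# Generate and strip the prompt tokens before decoding.
generated_ids = model.generate(**inputs, max_new_tokens=256)
trimmed = [out[len(inp):] for inp, out in zip(inputs.input_ids, generated_ids)]
print(processor.batch_decode(trimmed, skip_special_tokens=True)[0])
```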