metadata
language:
  - en
license: cc-by-nc-4.0
pipeline_tag: image-text-to-text
tags:
  - multimodal
base_model:
  - Qwen/Qwen2-VL-7B-Instruct

S1-M-7B-Beta

🏠 Homepage | 👍 Our Official Code Repo | 🤗 S1-M Dataset (Beta)

S1-M-7B-Beta is the model used to develop the algorithm "Simple Test-time Scaling in Multimodal Reasoning". It was obtained by fine-tuning the base model Qwen/Qwen2-VL-7B-Instruct on data containing the thinking tags <think> and </think>. As a result, the model follows a "think first, then respond" paradigm, which makes it suitable for experiments on test-time scaling.
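Because the model is trained to emit its reasoning between <think> and </think> before the final response, downstream code will often want to separate the two parts. The following is a minimal sketch, not part of the official code repo. It assumes the generated text has the shape `<think>reasoning</think>response`; the helper name `split_thinking` and the sample string are hypothetical and used only for illustration.

```python
import re
from typing import Tuple

# Assumption: the reasoning is wrapped in a single <think>...</think> pair
# and the final response follows the closing tag.
THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(generated: str) -> Tuple[str, str]:
    """Split generated text into (thinking, response).

    If no complete <think>...</think> pair is found, the thinking part is
    returned as an empty string and the whole text is treated as the response.
    """
    match = THINK_PATTERN.search(generated)
    if match is None:
        return "", generated.strip()
    thinking = match.group(1).strip()
    response = generated[match.end():].strip()
    return thinking, response


# Hypothetical model output, for illustration only.
sample = "<think>3 apples plus 2 pears is 5 items.</think>There are 5 pieces of fruit."
thinking, response = split_thinking(sample)
print("thinking:", thinking)
print("response:", response)
```

Separating the reasoning from the response in this way makes it straightforward to measure or control the length of the thinking segment, which is the quantity that test-time scaling experiments typically vary.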

Note: The current model is a development version, not the final official version.