---
language:
- en
license: cc-by-nc-4.0
pipeline_tag: image-text-to-text
tags:
- multimodal
base_model:
- Qwen/Qwen2-VL-7B-Instruct
---
# S1-M-7B-Beta
[🏠 Homepage](https://github.com/PKU-Alignment/s1-m) | [👍 Our Official Code Repo](https://github.com/PKU-Alignment/s1-m) | [🤗 S1-M Dataset (Beta)](https://huggingface.co/datasets/PKU-Alignment/s1-m_beta)
S1-M-7B-Beta is the model used for developing the algorithm "Simple Test-time Scaling in Multimodal Reasoning". It was obtained by fine-tuning the base model `Qwen/Qwen2-VL-7B-Instruct` on data that contains the thinking tags `<think>` and `</think>`. Through this fine-tuning, the model acquired the "think first, then respond" paradigm, which enables experiments on test-time scaling.
**Note: The current model is a development version, not the final official version.**
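Because the model is trained to emit its reasoning between `<think>` and `</think>` before the final answer, downstream code usually needs to separate the two parts. The snippet below is a minimal sketch of one way to do that. The helper name `split_thinking`, the `ParsedOutput` type, and the example string are hypothetical and are not part of the official code repo; the sketch also assumes at most one thinking block at the start of the output.

```python
import re
from typing import NamedTuple, Optional


class ParsedOutput(NamedTuple):
    """Hypothetical container for the two parts of a generation."""
    thinking: Optional[str]  # text inside <think>...</think>, or None if absent
    response: str            # text after the closing </think> tag


# Non-greedy match so only the first <think>...</think> block is captured.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(text: str) -> ParsedOutput:
    """Split generated text into its thinking part and its final response."""
    match = _THINK_RE.search(text)
    if match is None:
        # No thinking tags found: treat the whole text as the response.
        return ParsedOutput(None, text.strip())
    thinking = match.group(1).strip()
    response = text[match.end():].strip()
    return ParsedOutput(thinking, response)


# Made-up example output, for illustration only.
example = (
    "<think>\nThe image shows 3 apples and 2 pears.\n</think>\n"
    "There are 5 pieces of fruit."
)
parsed = split_thinking(example)
print(parsed.thinking)
print(parsed.response)
```

Keeping the thinking text separate makes it easier to measure how much reasoning the model produced, which is the quantity that test-time scaling experiments typically vary.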