metadata
library_name: vLLM
tags:
- rewriter
- RL
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen2.5-3B-Instruct
The base of this model is Qwen2.5-3B-Instruct, using TopiOCQA as the training data, and the training method is ConvSearch-R1.