ConvSearch-R1: Enhancing Query Reformulation for Conversational Search with Reasoning via Reinforcement Learning
Paper
• 2505.15776 • Published
• 11
The base of this model is Qwen2.5-3B-Instruct, using TopiOCQA as the training data, and the training method is ConvSearch-R1.