PeterJinGo's Collections

Search-R1-v0.3

updated Aug 12, 2025

Models trained with reinforcement learning (RL) using an outcome reward plus a format reward. Paper: https://arxiv.org/abs/2505.15117

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-em-ppo-v0.3

    3B • Updated May 21, 2025 • 956

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-it-em-ppo-v0.3

    3B • Updated May 21, 2025 • 14

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-em-grpo-v0.3

    3B • Updated May 21, 2025 • 321 • 1

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-it-em-grpo-v0.3

    3B • Updated May 21, 2025 • 85

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-7b-em-ppo-v0.3

    8B • Updated May 21, 2025 • 6.43k • 1

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-7b-em-grpo-v0.3

    8B • Updated May 21, 2025 • 1.2k

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-7b-it-em-grpo-v0.3

    8B • Updated May 21, 2025 • 26 • 1

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-14b-em-ppo-v0.3

    15B • Updated May 2, 2025 • 41

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-14b-em-grpo-v0.3

    15B • Updated May 2, 2025 • 3

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-14b-it-em-grpo-v0.3

    15B • Updated May 2, 2025 • 182

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-32b-em-grpo-v0.3

    33B • Updated May 10, 2025 • 7

  • PeterJinGo/LICENCE

    Viewer • Updated Aug 12, 2025 • 202 • 9
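
The model repository names above follow a regular pattern, so a listing like this one can be split into its parts programmatically. The sketch below is only an illustration: the helper `parse_repo_id`, the `SearchR1Name` type, and the field names are hypothetical, and reading `it` as "instruction-tuned base model" and `em` as "exact-match reward" is an assumption inferred from the names, not something this page states.

```python
import re
from typing import NamedTuple


class SearchR1Name(NamedTuple):
    """Fields recovered from a repository id (field names are illustrative)."""
    owner: str
    train_data: str
    base_model: str
    size: str
    instruct: bool   # assumption: "-it" marks an instruction-tuned base model
    reward: str      # assumption: "em" refers to an exact-match outcome reward
    algorithm: str   # "ppo" or "grpo", as they appear in the names
    version: str


# Pattern fitted to the names listed in this collection only.
_PATTERN = re.compile(
    r"^(?P<owner>[^/]+)/SearchR1-(?P<train_data>[a-z0-9_]+)-"
    r"(?P<base_model>qwen2\.5)-(?P<size>\d+b)"
    r"(?P<instruct>-it)?-(?P<reward>em)-(?P<algorithm>ppo|grpo)-"
    r"(?P<version>v\d+\.\d+)$"
)


def parse_repo_id(repo_id: str) -> SearchR1Name:
    """Split a repository id into its components, or raise ValueError."""
    match = _PATTERN.match(repo_id)
    if match is None:
        raise ValueError(f"unrecognized repository id: {repo_id!r}")
    return SearchR1Name(
        owner=match["owner"],
        train_data=match["train_data"],
        base_model=match["base_model"],
        size=match["size"],
        instruct=match["instruct"] is not None,
        reward=match["reward"],
        algorithm=match["algorithm"],
        version=match["version"],
    )


if __name__ == "__main__":
    name = parse_repo_id(
        "PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-7b-it-em-grpo-v0.3"
    )
    print(name.size, name.instruct, name.algorithm)
```

Entries that do not follow the pattern, such as `PeterJinGo/LICENCE`, raise `ValueError`, so a caller can filter them out when grouping the models by size or algorithm.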