RoadMa

RoadQAQ
AI & ML interests

None yet

Recent Activity

liked a model about 7 hours ago
stepfun-ai/Step-3.5-Flash
liked a dataset about 7 hours ago
stepfun-ai/CF-Div2-Stepfun
updated a dataset about 1 month ago
RoadQAQ/sft_for_rl

Organizations

OpenDCAI

RoadQAQ's collections (1)

ReLIFT
ReLIFT is a training method that interleaves reinforcement learning (RL) with online fine-tuning (FT), achieving superior performance and efficiency compared to using RL or supervised fine-tuning (SFT) alone.
  • RoadQAQ/ReLIFT-Qwen2.5-7B-Zero

    Question Answering • 8B • Updated Jun 18, 2025 • 1 • 2
  • RoadQAQ/ReLIFT-Qwen2.5-Math-1.5B-Zero

    Question Answering • 2B • Updated Jun 12, 2025 • 43
  • RoadQAQ/ReLIFT-Qwen2.5-Math-7B-Zero

    Question Answering • 8B • Updated Aug 27, 2025 • 50
  • Elliott/Openr1-Math-46k-8192

    Viewer • Updated Apr 23, 2025 • 45.8k • 338 • 9