
马路

RoadQAQ

AI & ML interests

None yet

Organizations

OpenDCAI

RoadQAQ's collections (1)

ReLIFT
ReLIFT is a training method that interleaves reinforcement learning (RL) with online fine-tuning (FT), achieving better performance and efficiency than RL or supervised fine-tuning (SFT) alone.
  • RoadQAQ/ReLIFT-Qwen2.5-7B-Zero

    Question Answering • 8B • Updated Jun 18 • 14 • 2
  • RoadQAQ/ReLIFT-Qwen2.5-Math-1.5B-Zero

    Question Answering • 2B • Updated Jun 12 • 149
  • RoadQAQ/ReLIFT-Qwen2.5-Math-7B-Zero

    Question Answering • 8B • Updated Aug 27 • 421
  • Elliott/Openr1-Math-46k-8192

    Viewer • Updated Apr 23 • 45.8k • 736 • 8