Tags: Text Generation, Safetensors, English, llama

Add model card

#1 opened by nielsr (HF Staff)

This PR adds a model card for the model presented in the paper "OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling". The card includes metadata (library name, pipeline tag, license) and links to the paper and the project page.

Cannot merge
This branch has merge conflicts in the following files:
  • README.md
