Update README.md
README.md CHANGED
@@ -7,4 +7,5 @@ sdk: static
 pinned: false
 ---
 
-
+🐙 OctoThinker, led by [GAIR](https://huggingface.co/GAIR), is an initiative to explore earlier training interventions that make base models more amenable to reinforcement learning (RL) scaling.
+🎯 Our Goal: To reshape the pre-training trajectory so models scale better under RL.