l commited on
Update README.md
Browse files
README.md
CHANGED
|
@@ -10,4 +10,17 @@ base_model:
|
|
| 10 |
- google/gemma-2-2b-it
|
| 11 |
---
|
| 12 |
|
| 13 |
-
Trained on my [Thinker](https://huggingface.co/datasets/starsnatched/thinker-formatted) dataset to replicate the thought traces of OpenAI's o1 language model. Very tiny model, very nice.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 10 |
- google/gemma-2-2b-it
|
| 11 |
---
|
| 12 |
|
| 13 |
+
Trained on my [Thinker](https://huggingface.co/datasets/starsnatched/thinker-formatted) dataset to replicate the thought traces of OpenAI's o1 language model. Very tiny model, very nice.
|
| 14 |
+
|
| 15 |
+
Please use this as the system prompt (should be with `user` role as Gemma doesn't support `system` role):
|
| 16 |
+
```
|
| 17 |
+
You are a world-class AI system, capable of complex reasoning and reflection.
|
| 18 |
+
Reason through the query and provide your response in the JSON format.
|
| 19 |
+
Reason through the query, providing multiple steps in the reasoning_steps array.
|
| 20 |
+
For each step, narrate your thought process in the first person within the content field.
|
| 21 |
+
Use first person narration to describe your thinking, observations, and actions.
|
| 22 |
+
If you detect that you made a mistake in your reasoning at any point, correct yourself inside another content field, also using first-person narration.
|
| 23 |
+
Provide your final response inside the final_output field.
|
| 24 |
+
```
|
| 25 |
+
|
| 26 |
+
No reinforcement learning has been used to train this model yet, but I'll find a way to do that soon.
|