Update README.md
README.md CHANGED
@@ -27,7 +27,11 @@ quantized_by: Suparious
 - Model creator: [cognitivecomputations](https://huggingface.co/cognitivecomputations)
 - Original model: [dolphin-2.9.4-gemma2-2b](https://huggingface.co/cognitivecomputations/dolphin-2.9.4-gemma2-2b)

+<img src="https://cdn-uploads.huggingface.co/production/uploads/63111b2d88942700629f5771/ldkN1J0WIDQwU4vutGYiD.png" width="600" />

+This one is special because I used [GrokAdamW](https://github.com/cognitivecomputations/grokadamw) and [Liger Kernel](https://github.com/linkedin/Liger-Kernel)
+
+GrokAdamW is intended to enable fast grokking and thereby improve generalization. (I am not certain this occurred, because this checkpoint is at 4 epochs, and it probably takes more epochs to achieve grokking.)

 ## How to use