Update README.md
README.md CHANGED
@@ -23,21 +23,22 @@ This special edition went thru UNA at MLP layers just like [miniclaus-1.5B](http
 
 Here we use our novel approach called `MGS`. Its up to you to figure out what it means. On top of that we used `UNA: Uniform Neural Alignment`
 
-Cybertron V4 went thru SFT with MGS & UNA over `Magpie-Align/Magpie-Qwen2.5-Pro-1M-v0.1`
+Cybertron V4 went thru SFT with `MGS & UNA` over `Magpie-Align/Magpie-Qwen2.5-Pro-1M-v0.1` dataset.
 
 ## Quantz
 Soon..
 
-## MGS & UNA
-Being fair:
+## MGS & UNA & Details
 
-
+* MGS, among other things.. a strategy of tackling corpora forgetful. `1+1 = 2 and not 3`
+* UNA, among other things.. orthogonal approach for neural uniformit. `1+1 = 2 obviously`
 
-
-UNA, among other things.. orthogonal approach for neural uniformit. `1+1 = 2 obviously`
+We also followed https://arxiv.org/pdf/2410.21228 insights.
 
 ## Training procedure
+
 1 Epoch as usual.
+
 [<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
 ```
 datasets: