Update README.md
README.md
CHANGED
|
@@ -107,6 +107,7 @@ MiniAxion1 is not intended to compete with large-scale models. Instead, it is bu
|
|
| 107 |
* Susceptible to incorrect intermediate reasoning steps
|
| 108 |
* Limited generalization beyond trained patterns
|
| 109 |
* Not suitable for production use in critical systems
|
|
|
|
| 110 |
|
| 111 |
---
|
| 112 |
|
|
@@ -139,12 +140,14 @@ This model provides early evidence that:
|
|
| 139 |
* Scaling parameters (1M → 10M range)
|
| 140 |
* Better coupling between reasoning and answers
|
| 141 |
* Task-specific specialization (e.g., math-only variants)
|
|
|
|
| 142 |
|
| 143 |
---
|
| 144 |
|
| 145 |
## 🤝 Acknowledgments
|
| 146 |
|
| 147 |
This model was developed as part of ongoing experimentation in nano-scale reasoning systems.
|
|
|
|
| 148 |
|
| 149 |
---
|
| 150 |
|
|
|
|
| 107 |
* Susceptible to incorrect intermediate reasoning steps
|
| 108 |
* Limited generalization beyond trained patterns
|
| 109 |
* Not suitable for production use in critical systems
|
| 110 |
+
* With only 920k parameters, low evaluation results are expected
|
| 111 |
|
| 112 |
---
|
| 113 |
|
|
|
|
| 140 |
* Scaling parameters (1M → 10M range)
|
| 141 |
* Better coupling between reasoning and answers
|
| 142 |
* Task-specific specialization (e.g., math-only variants)
|
| 143 |
+
* Knowledge distillation from larger models
|
| 144 |
|
| 145 |
---
|
| 146 |
|
| 147 |
## 🤝 Acknowledgments
|
| 148 |
|
| 149 |
This model was developed as part of ongoing experimentation in nano-scale reasoning systems.
|
| 150 |
+
The main question was: "How small can a model be and still think (or mimic it)?"
|
| 151 |
|
| 152 |
---
|
| 153 |
|