Erik committed on
Update README.md

README.md CHANGED
@@ -22,9 +22,9 @@ metrics:
 - accuracy
 ---
 
-# 💀
+# 💀 SkullLLM-125M
 
-**
+**SkullLLM-125M** is a lightweight, experimental multilingual language model fine-tuned from GPT-2. This project, part of the **SkullLLM** series, demonstrates that AI training is possible on highly constrained consumer hardware (3GB VRAM) using advanced optimization techniques.
 
 ### 🚀 Model Details
 - **Developed by:** Erik22TY
@@ -60,7 +60,7 @@ Nebulos was trained on a high-quality multilingual stream:
 - **Final Loss:** 4.0898
 
 ### ⚠️ Limitations & Behavior
-As a 125M parameter model trained for 500 steps,
+As a 125M parameter model trained for 500 steps, SkullLLM-125M is a **Proof of Concept**.
 - **Repetitions:** May occasionally loop phrases (e.g., "metic"). Use `repetition_penalty=1.5`.
 - **Language Blending:** Due to its size, it may mix Romance languages (Spanish/French/Portuguese) in complex responses.
 - **Coherence:** Best used for short-form explanations or creative experiments.
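The `repetition_penalty=1.5` recommendation above can be made concrete: the standard penalty (as used by `transformers` generation) rescales the logit of every token already present in the output, dividing positive logits by the penalty and multiplying negative ones, so seen tokens always become less likely when the penalty exceeds 1. A minimal sketch (the function name `apply_repetition_penalty` is illustrative, not part of the model's API):

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.5):
    """Down-weight tokens that already appear in the generated sequence.

    Positive logits are divided by `penalty`; negative logits are
    multiplied by it. For penalty > 1 both moves push the token's
    score down, discouraging loops like the "metic" repetition.
    """
    out = list(logits)
    for tok in set(generated_ids):
        out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out

# Tokens 0 and 2 were already generated, so their logits shrink;
# token 1 is untouched.
print(apply_repetition_penalty([2.0, 1.0, -1.0], [0, 2]))
```

With `penalty=1.5`, the logit `2.0` becomes `2.0 / 1.5 ≈ 1.33` and `-1.0` becomes `-1.5`, which is usually enough to break short loops without distorting fluent text too much.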