Dragon-Translater-v1: The 2000it Sibling
Welcome to the Dragon-Translater family! This model is a 263M-parameter Transformer-based architecture trained on an Intel Core i5-10210U in the "Dragon Lab."
Model Status: 2000 Iterations
This checkpoint is the first official stable release of the Dragon-Translater family. After more than 83 hours of training, the loss has dropped substantially and begun to stabilize.
- Iteration Count: 2,000 / 46,875
- Current Loss: ~14.45
- Training Time: ~83 Hours
- Sibling Version: v1.0 (The First-Born)
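For a rough sense of how far the run still has to go, the figures above can be extrapolated. This is only a back-of-the-envelope sketch that assumes the pace of the first 2,000 iterations holds for the whole run:

```python
# Back-of-the-envelope projection from the stats above; assumes the first
# 2,000 iterations are representative of the entire run.
hours_so_far = 83
iterations_done = 2_000
total_iterations = 46_875

hours_per_iteration = hours_so_far / iterations_done            # ~0.0415 h/it
projected_total_hours = hours_per_iteration * total_iterations  # ~1,945 h
print(f"Projected total: ~{projected_total_hours:,.0f} hours "
      f"(~{projected_total_hours / 24:.0f} days) on the same hardware")
```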
The "Dragon Lab" Hardware
This model was trained on a budget home-engineering setup:
- Training PC: ThinkPad with an Intel Core i5-10210U (running at ~70°C)
- OS: Windows (Python 3.13)
- Next-Gen Server: Dell OptiPlex 3020 SFF ($50 Build Project)
Training Progress
The "Dragon" has been learning by adjusting 263 million mathematical "knobs" (parameters). We saw the Loss drop from a chaotic 172.3 down to a stable 14.4.
| Step | Loss | Learning Rate | Grad Norm |
|---|---|---|---|
| 1 | 172.39 | 9.9e-06 | 377.27 |
| 1000 | ~45.20 | 3.5e-05 | ~50.40 |
| 2000 | 14.45 | 4.8e-05 | 3.37 |
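The actual training script is not published here; as a hedged illustration only, the following is a generic sketch of the kind of loop that produces Step / Loss / Learning Rate / Grad Norm rows like the table above (the tiny model, random data, and hyperparameters are placeholder assumptions, not the real Dragon-Translater setup):

```python
import torch

# Placeholder model, optimizer, and warm-up schedule for illustration only.
model = torch.nn.Linear(16, 16)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
scheduler = torch.optim.lr_scheduler.LinearLR(optimizer, start_factor=0.1, total_iters=2000)

for step in range(1, 2001):
    optimizer.zero_grad()
    x = torch.randn(8, 16)
    loss = torch.nn.functional.mse_loss(model(x), torch.randn(8, 16))
    loss.backward()
    # Clip gradients and keep the pre-clip norm (the "Grad Norm" column).
    grad_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    scheduler.step()
    if step == 1 or step % 1000 == 0:
        print(f"step={step} loss={loss.item():.2f} "
              f"lr={scheduler.get_last_lr()[0]:.1e} grad_norm={grad_norm:.2f}")
```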
Warning
The "Dragon" has only been trained on 2000 iterations so the results could be quite bad.
How to Use
You can load this model directly using the transformers library:
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load the 2000-iteration sibling
model = AutoModelForSeq2SeqLM.from_pretrained("MightyDragon-Dev/dragon-translater-v1")
tokenizer = AutoTokenizer.from_pretrained("MightyDragon-Dev/dragon-translater-v1")
```
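Once loaded, inference follows the standard `transformers` seq2seq pattern. A minimal sketch; the example sentence and generation settings are illustrative assumptions, and at 2,000 iterations the output may be rough:

```python
# Illustrative input only; the model's language pair and output quality at
# this early checkpoint are not guaranteed.
text = "Hello, how are you?"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```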