Dragon-Translater-v1: The 2000it Sibling
Welcome to the Dragon-Translater family! This model is a 263M-parameter Transformer-based architecture trained on an Intel Core i5-10210U in the "Dragon Lab."
Model Status: 2000 Iterations
This checkpoint is the first official stable release of the Dragon-Translater family. After more than 83 hours of training, the loss has dropped substantially and begun to stabilize.
- Iteration Count: 2,000 / 46,875
- Current Loss: ~14.45
- Training Time: ~83 Hours
- Sibling Version: v1.0 (The First-Born)
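For a rough sense of how far the run still has to go, the figures above can be extrapolated. This is only a back-of-the-envelope sketch that assumes the pace of the first 2,000 iterations holds for the whole run:

```python
# Back-of-the-envelope projection from the stats above; assumes the first
# 2,000 iterations are representative of the entire run.
hours_so_far = 83
iterations_done = 2_000
total_iterations = 46_875

hours_per_iteration = hours_so_far / iterations_done            # ~0.0415 h/it
projected_total_hours = hours_per_iteration * total_iterations  # ~1,945 h
print(f"Projected total: ~{projected_total_hours:,.0f} hours "
      f"(~{projected_total_hours / 24:.0f} days) on the same hardware")
```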
The "Dragon Lab" Hardware
This model was trained on a budget home-engineering setup:
- Training PC: ThinkPad with an Intel Core i5-10210U (running at ~70°C)
- OS: Windows (Python 3.13)
- Next-Gen Server: Dell OptiPlex 3020 SFF ($50 Build Project)
Training Progress
The "Dragon" has been learning by adjusting 263 million mathematical "knobs" (parameters). We saw the Loss drop from a chaotic 172.3 down to a stable 14.4.
| Step | Loss | Learning Rate | Grad Norm |
|---|---|---|---|
| 1 | 172.39 | 9.9e-06 | 377.27 |
| 1000 | ~45.20 | 3.5e-05 | ~50.40 |
| 2000 | 14.45 | 4.8e-05 | 3.37 |
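The actual training script is not published here; as a hedged illustration only, the following is a generic sketch of the kind of loop that produces Step / Loss / Learning Rate / Grad Norm rows like the table above (the tiny model, random data, and hyperparameters are placeholder assumptions, not the real Dragon-Translater setup):

```python
import torch

# Placeholder model, optimizer, and warm-up schedule for illustration only.
model = torch.nn.Linear(16, 16)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
scheduler = torch.optim.lr_scheduler.LinearLR(optimizer, start_factor=0.1, total_iters=2000)

for step in range(1, 2001):
    optimizer.zero_grad()
    x = torch.randn(8, 16)
    loss = torch.nn.functional.mse_loss(model(x), torch.randn(8, 16))
    loss.backward()
    # Clip gradients and keep the pre-clip norm (the "Grad Norm" column).
    grad_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    scheduler.step()
    if step == 1 or step % 1000 == 0:
        print(f"step={step} loss={loss.item():.2f} "
              f"lr={scheduler.get_last_lr()[0]:.1e} grad_norm={grad_norm:.2f}")
```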
Warning
The "Dragon" has only been trained on 2000 iterations so the results could be quite bad.
How to Use
You can load this model directly using the transformers library:
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load the 2000-iteration sibling
model = AutoModelForSeq2SeqLM.from_pretrained("MightyDragon-Dev/dragon-translater-v1")
tokenizer = AutoTokenizer.from_pretrained("MightyDragon-Dev/dragon-translater-v1")
```
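Once loaded, inference follows the standard `transformers` seq2seq pattern. A minimal sketch; the example sentence and generation settings are illustrative assumptions, and at 2,000 iterations the output may be rough:

```python
# Illustrative input only; the model's language pair and output quality at
# this early checkpoint are not guaranteed.
text = "Hello, how are you?"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```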