Kaşgarlı Testi - Benchmark Results
Hypothesis
H1: Byte-level models learn agglutinative languages (Turkish) more efficiently than analytic languages (English).
Experimental Setup
- Model: AGIFORMER (identical architecture, 50M parameters)
- Hyperparameters: Same for both (d_model=512, n_layers=6, thinking_steps=3)
- Training: 5000 steps, batch_size=4, lr=3e-4
- English Dataset: enwik8 (the first 100 MB of an English Wikipedia dump)
- Turkish Dataset: trwiki (Turkish Wikipedia)
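The setup above can be written down as a shared configuration in which only the dataset changes between runs. This is a minimal sketch: `RunConfig` and its field names are hypothetical and are not taken from the AGIFORMER code; only the values come from the list above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical container for the settings listed above."""
    dataset: str
    d_model: int = 512
    n_layers: int = 6
    thinking_steps: int = 3
    steps: int = 5000
    batch_size: int = 4
    lr: float = 3e-4


# Both runs share every hyperparameter; only the dataset differs.
english = RunConfig(dataset="enwik8")
turkish = replace(english, dataset="trwiki")

# Sanity check: the configs are identical once the dataset is aligned.
assert replace(turkish, dataset=english.dataset) == english
```

Deriving the second config from the first with `replace` makes it hard to introduce an accidental hyperparameter difference between the two languages.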
Results
Final BPC (Lower is Better)
| Language | Validation BPC |
|---|---|
| English | 2.2578 |
| Turkish | 2.1226 |
Difference: 0.1352 BPC in favor of Turkish (about 6.0% lower than English)
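The arithmetic behind these numbers can be sketched as follows. The conversion assumes the training script reports a mean cross-entropy in nats per byte, which is not stated in this report; `nats_to_bpc` and `bpc_gap` are illustrative helper names, not functions from the project.

```python
import math


def nats_to_bpc(loss_nats: float) -> float:
    """Convert a mean per-byte cross-entropy in nats to bits per byte."""
    return loss_nats / math.log(2)


def bpc_gap(bpc_a: float, bpc_b: float) -> tuple[float, float]:
    """Return (absolute gap, gap relative to bpc_a) for bpc_a - bpc_b."""
    diff = bpc_a - bpc_b
    return diff, diff / bpc_a


# Final validation BPC values from the table above.
english_bpc = 2.2578
turkish_bpc = 2.1226

abs_gap, rel_gap = bpc_gap(english_bpc, turkish_bpc)
print(f"{abs_gap:.4f} BPC ({rel_gap:.1%} relative)")  # 0.1352 BPC (6.0% relative)
```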
Convergence Speed
Steps to reach BPC < 2.5:
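- English: Not reached (as logged; the final validation BPC of 2.2578 is nonetheless below 2.5, so this entry may refer to a different metric or evaluation point)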
- Turkish: 1550
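A convergence metric of this kind can be computed from a log of `(step, BPC)` pairs. The sketch below is an assumption about how such a metric might be derived, not the benchmark's actual code; `first_step_below` is a hypothetical helper, and the `history` values are made-up placeholders rather than measurements from this experiment.

```python
from typing import Iterable, Optional, Tuple


def first_step_below(
    history: Iterable[Tuple[int, float]], threshold: float = 2.5
) -> Optional[int]:
    """Return the first logged step whose BPC is strictly below threshold.

    Returns None if the threshold is never crossed.
    """
    for step, bpc in history:
        if bpc < threshold:
            return step
    return None


# Placeholder log, NOT data from the benchmark.
history = [(50, 3.10), (100, 2.80), (150, 2.49), (200, 2.55)]

print(first_step_below(history))          # 150
print(first_step_below([(50, 3.10)]))     # None
```

Note that the result depends on which curve is logged (training or validation) and on the logging interval, so both should be reported alongside the step count.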
Conclusion
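The Turkish model reached a lower final validation BPC than the English model (2.1226 vs. 2.2578) under identical architecture, hyperparameters, and training budget, which supports H1 in this setup.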
Visualization
Generated: 2025-11-22
Experimenter: inkbytefo
