huiting tang
# Results: final_c6_18l448_factorized_aggressive
Automatically generated after pretraining.
## Summary
- Model: `18L / 7H / 448d`
- Total parameters: `39600320`
- Last logged train step: `92680`
- Best validation loss: `3.4662`
- Best validation perplexity: `32.01`
- Last validation step: `92500`
- Learning rate: `0.00056`
- Effective tokens/update: `65536`
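The summary figures can be cross-checked against each other. The sketch below uses only the values listed above and makes two assumptions that the report does not state: that the validation loss is a mean cross-entropy in nats (so perplexity is `exp(loss)`), and that each logged train step corresponds to exactly one optimizer update. The variable names are illustrative.

```python
import math

# Values copied from the summary above.
best_val_loss = 3.4662
d_model, n_heads = 448, 7
total_params = 39600320
last_train_step = 92680
tokens_per_update = 65536

# Assumption: loss is mean cross-entropy in nats, so perplexity = exp(loss).
perplexity = math.exp(best_val_loss)

# Per-head dimension implied by the model width and head count.
head_dim = d_model // n_heads

# Assumption: one optimizer update per logged train step.
tokens_seen = last_train_step * tokens_per_update
tokens_per_param = tokens_seen / total_params

print(f"perplexity        ~ {perplexity:.2f}")
print(f"head dim          = {head_dim}")
print(f"tokens seen       = {tokens_seen:,}")
print(f"tokens per param  ~ {tokens_per_param:.1f}")
```

Under these assumptions the computed perplexity agrees with the reported `32.01`, and the width divides evenly across the 7 heads.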
## Files
- [Config snapshot](config_snapshot.json)
- [Train metrics](train_metrics.jsonl)
- [Eval metrics](eval_metrics.jsonl)
- [Events](events.jsonl)
- [Metrics plot](metrics.png)
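As a rough illustration of how the best validation loss might be recovered from `eval_metrics.jsonl`, the sketch below scans JSON Lines records for the minimum loss. The schema of that file is not documented here, so the field names `step` and `val_loss` are assumptions, the helper `best_eval` is hypothetical, and the sample records are placeholders rather than values from this run.

```python
import json
import math

def best_eval(lines, loss_key="val_loss", step_key="step"):
    """Return (step, loss, perplexity) for the lowest-loss record.

    `loss_key` and `step_key` are assumed field names; adjust them to
    match the actual schema of the metrics file.
    """
    best = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        if loss_key not in record:
            continue  # skip records without a loss value
        if best is None or record[loss_key] < best[loss_key]:
            best = record
    if best is None:
        raise ValueError(f"no records containing {loss_key!r}")
    loss = best[loss_key]
    return best.get(step_key), loss, math.exp(loss)

# Placeholder records for illustration only (not from this run).
sample = [
    '{"step": 100, "val_loss": 2.0}',
    '{"step": 200, "val_loss": 1.5}',
    '{"step": 300, "val_loss": 1.75}',
]
step, loss, ppl = best_eval(sample)
print(step, loss, round(ppl, 3))
```

To apply it to the real file, pass an open file handle instead of the sample list, for example `best_eval(open("eval_metrics.jsonl"))`, after confirming the field names.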
## Metrics Plot
![Metrics plot](metrics.png)