# Results: final_c6_18l448_factorized_aggressive

Automatically generated after pretraining.

## Summary

- Model: `18L / 7H / 448d`
- Total parameters: `39600320`
- Last logged train step: `92680`
- Best validation loss: `3.4662`
- Best validation perplexity: `32.01`
- Last validation step: `92500`
- Learning rate: `0.00056`
- Effective tokens/update: `65536`

## Files

- [Config snapshot](config_snapshot.json)
- [Train metrics](train_metrics.jsonl)
- [Eval metrics](eval_metrics.jsonl)
- [Events](events.jsonl)
- [Metrics plot](metrics.png)

## Metrics Plot

![Metrics plot](metrics.png)
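
As a cross-check on the Summary figures, the sketch below recomputes perplexity from the best validation loss and estimates the number of tokens processed. It rests on two assumptions not stated in this report: that the loss is a mean cross-entropy in nats per token, and that each logged train step corresponds to one optimizer update. The function names `perplexity` and `tokens_seen` are illustrative and do not come from the training code.

```python
import math

# Values copied from the Summary section of this report.
best_val_loss = 3.4662
last_train_step = 92680
tokens_per_update = 65536


def perplexity(loss: float) -> float:
    """Perplexity for a mean cross-entropy loss, assumed to be in nats per token."""
    return math.exp(loss)


def tokens_seen(steps: int, tokens_per_step: int) -> int:
    """Tokens processed, assuming one optimizer update per logged step."""
    return steps * tokens_per_step


print(f"perplexity:  {perplexity(best_val_loss):.2f}")   # 32.01
print(f"tokens seen: {tokens_seen(last_train_step, tokens_per_update):,}")  # 6,073,876,480
```

Under those assumptions the recomputed perplexity agrees with the reported `32.01`, and the run would have processed roughly 6.07 billion tokens by the last logged step.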