Vjeong Claude Opus 4.6 committed on
Commit
a02e949
·
1 Parent(s): 83fc1b9

Tighten expected loss ranges for FineWeb-Edu dataset

FineWeb-Edu is higher quality filtered data, so the model can achieve
lower loss at the same token count compared to generic web corpora.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Files changed (1)
  1. llm_lab/training/debugger.py +4 -4
llm_lab/training/debugger.py CHANGED
@@ -29,10 +29,10 @@ from llm_lab.config import TrainConfig
 # Constants
 # ═══════════════════════════════════════════════════════════════════
 
-# Normal convergence ranges for a 1B model trained on ~10B tokens
-_EXPECTED_TRAIN_LOSS = (2.8, 3.5)
-_EXPECTED_VAL_LOSS = (3.0, 3.8)
-_EXPECTED_VAL_PPL = (20, 45)
+# Normal convergence ranges for a 1B model trained on ~10B tokens (FineWeb-Edu)
+_EXPECTED_TRAIN_LOSS = (2.5, 3.3)
+_EXPECTED_VAL_LOSS = (2.7, 3.6)
+_EXPECTED_VAL_PPL = (15, 37)
 
 # Status labels
 STATUS_NORMAL = "NORMAL"