Upload 4 files


Files changed (4)

report/chat-evaluation-rl.md +23 -0
report/chat-rl.md +22 -0
report/header.md +36 -0
report/report.md +96 -0

report/chat-evaluation-rl.md ADDED

	@@ -0,0 +1,23 @@

+## Chat evaluation rl
+timestamp: 2025-12-08 20:36:20
+- source: rl
+- task_name: None
+- dtype: bfloat16
+- temperature: 0.0000
+- max_new_tokens: 512
+- num_samples: 1
+- top_k: 50
+- batch_size: 8
+- model_tag: None
+- step: None
+- max_problems: None
+- device_type:
+- ARC-Easy: 0.7130
+- ARC-Challenge: 0.5375
+- MMLU: 0.4256
+- GSM8K: 0.2305
+- HumanEval: 0.0671
+- SpellingBee: 0.9922
+- ChatCORE metric: 0.4208
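
The ChatCORE value can be checked against the six task scores above. The sketch below assumes ChatCORE is the mean of baseline-centered accuracies, with a random-guess baseline of 0.25 for the multiple-choice tasks (ARC-Easy, ARC-Challenge, MMLU) and 0 for the open-ended ones. Neither the formula nor the baselines are stated in this report, so both are assumptions, and the names `centered` and `chatcore` are illustrative.

```python
# Task accuracies copied from the chat-evaluation-rl report.
scores = {
    "ARC-Easy": 0.7130,
    "ARC-Challenge": 0.5375,
    "MMLU": 0.4256,
    "GSM8K": 0.2305,
    "HumanEval": 0.0671,
    "SpellingBee": 0.9922,
}

# ASSUMPTION: random-guess baselines, not given in the report.
# 0.25 for 4-way multiple choice, 0.0 for open-ended generation.
baselines = {
    "ARC-Easy": 0.25,
    "ARC-Challenge": 0.25,
    "MMLU": 0.25,
    "GSM8K": 0.0,
    "HumanEval": 0.0,
    "SpellingBee": 0.0,
}


def centered(acc: float, baseline: float) -> float:
    """Rescale accuracy so that the baseline maps to 0 and a perfect score to 1."""
    return (acc - baseline) / (1.0 - baseline)


def chatcore(scores: dict, baselines: dict) -> float:
    """Mean of the centered accuracies over all tasks (assumed definition)."""
    return sum(centered(scores[t], baselines[t]) for t in scores) / len(scores)


print(f"ChatCORE (reconstructed): {chatcore(scores, baselines):.4f}")
```

Under these assumptions the reconstructed value rounds to the reported 0.4208, which suggests the six scores and the aggregate are mutually consistent.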

report/chat-rl.md ADDED

	@@ -0,0 +1,22 @@

+## Chat RL
+timestamp: 2025-12-08 20:02:09
+- run: d34_rl
+- source: sft
+- dtype: bfloat16
+- device_batch_size: 4
+- examples_per_step: 16
+- num_samples: 16
+- max_new_tokens: 256
+- temperature: 1.0000
+- top_k: 50
+- unembedding_lr: 0.0040
+- embedding_lr: 0.2000
+- matrix_lr: 0.0200
+- weight_decay: 0.0000
+- init_lr_frac: 0.0500
+- num_epochs: 1
+- save_every: 60
+- eval_every: 60
+- eval_examples: 400
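
To make the batch-related settings above concrete, here is a small arithmetic sketch. It assumes one training process per GPU (the report's hardware section lists 8 GPUs), that prompts are split evenly across processes, and that each prompt's samples are generated in chunks of `device_batch_size`. The report does not state how the training script actually partitions the work, so `world_size` and the derived names are hypothetical.

```python
# Values copied from the Chat RL config.
examples_per_step = 16   # prompts per optimization step
num_samples = 16         # completions sampled per prompt
device_batch_size = 4    # sequences processed per forward pass on one device
max_new_tokens = 256     # generation cap per completion

# ASSUMPTION: one process per GPU, 8 GPUs as listed in the hardware section.
world_size = 8

# Total completions sampled per optimization step.
rollouts_per_step = examples_per_step * num_samples

# ASSUMPTION: prompts are divided evenly across processes.
examples_per_rank = examples_per_step // world_size

# ASSUMPTION: each prompt's samples are generated in device_batch_size chunks.
sampling_passes = num_samples // device_batch_size

# Upper bound on generated tokens per step (every completion hits the cap).
max_generated_tokens = rollouts_per_step * max_new_tokens

print(rollouts_per_step, examples_per_rank, sampling_passes, max_generated_tokens)
```

If the split is different in the real script, only `examples_per_rank` and `sampling_passes` change; the 256 rollouts per step follow directly from the two config values.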

report/header.md ADDED

	@@ -0,0 +1,36 @@

+# nanochat training report
+Generated: 2025-12-08 13:02:11
+## Environment
+### Git Information
+- Branch: pankaj_dev
+- Commit: 3289b19 (dirty)
+- Message: Adjust device_batch_size in run_d34_finetune.sh from 4 to 6 for mid-training and
+### Hardware
+- Platform: Linux
+- CPUs: 240 cores (240 logical)
+- Memory: 1771.7 GB
+- GPUs: 8x NVIDIA A100-SXM4-80GB
+- GPU Memory: 634.0 GB total
+- CUDA Version: 12.8
+- Hourly Rate: $14.32/hour
+### Software
+- Python: 3.10.12
+- PyTorch: 2.8.0+cu128
+### Bloat
+- Characters: 446,068
+- Lines: 10,895
+- Files: 53
+- Tokens (approx): 111,517
+- Dependencies (uv.lock lines): 2,218
+Run started: 2025-12-08 13:02:14
+---

report/report.md ADDED

	@@ -0,0 +1,96 @@

+# nanochat training report
+Generated: 2025-12-08 13:02:11
+## Environment
+### Git Information
+- Branch: pankaj_dev
+- Commit: 3289b19 (dirty)
+- Message: Adjust device_batch_size in run_d34_finetune.sh from 4 to 6 for mid-training and
+### Hardware
+- Platform: Linux
+- CPUs: 240 cores (240 logical)
+- Memory: 1771.7 GB
+- GPUs: 8x NVIDIA A100-SXM4-80GB
+- GPU Memory: 634.0 GB total
+- CUDA Version: 12.8
+- Hourly Rate: $14.32/hour
+### Software
+- Python: 3.10.12
+- PyTorch: 2.8.0+cu128
+### Bloat
+- Characters: 446,068
+- Lines: 10,895
+- Files: 53
+- Tokens (approx): 111,517
+- Dependencies (uv.lock lines): 2,218
+Run started: 2025-12-08 13:02:14
+---
+## Chat RL
+timestamp: 2025-12-08 20:02:09
+- run: d34_rl
+- source: sft
+- dtype: bfloat16
+- device_batch_size: 4
+- examples_per_step: 16
+- num_samples: 16
+- max_new_tokens: 256
+- temperature: 1.0000
+- top_k: 50
+- unembedding_lr: 0.0040
+- embedding_lr: 0.2000
+- matrix_lr: 0.0200
+- weight_decay: 0.0000
+- init_lr_frac: 0.0500
+- num_epochs: 1
+- save_every: 60
+- eval_every: 60
+- eval_examples: 400
+## Chat evaluation rl
+timestamp: 2025-12-08 20:36:20
+- source: rl
+- task_name: None
+- dtype: bfloat16
+- temperature: 0.0000
+- max_new_tokens: 512
+- num_samples: 1
+- top_k: 50
+- batch_size: 8
+- model_tag: None
+- step: None
+- max_problems: None
+- device_type:
+- ARC-Easy: 0.7130
+- ARC-Challenge: 0.5375
+- MMLU: 0.4256
+- GSM8K: 0.2305
+- HumanEval: 0.0671
+- SpellingBee: 0.9922
+- ChatCORE metric: 0.4208
+## Summary
+- Characters: 446,068
+- Lines: 10,895
+- Files: 53
+- Tokens (approx): 111,517
+- Dependencies (uv.lock lines): 2,218
+| Metric          | BASE     | MID      | SFT      | RL       |
+|-----------------|----------|----------|----------|----------|
+| GSM8K           | -        | -        | -        | 0.2305   |
+Total wall clock time: unknown
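
Since the report lists the total wall clock time as unknown, a rough bound can be derived from the timestamps it does contain. The sketch below takes the span from "Run started" to the last section timestamp (the chat evaluation) and multiplies it by the listed hourly rate. This is an assumption-laden estimate: the timestamps mark when report sections were written, the span may include idle time or stages that are not in this report, and it is not a measured training time.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Timestamps and rate copied from the report.
run_started = datetime.strptime("2025-12-08 13:02:14", FMT)
last_section = datetime.strptime("2025-12-08 20:36:20", FMT)  # chat evaluation rl
hourly_rate_usd = 14.32

# ASSUMPTION: the node was billed for the whole span between the two timestamps.
elapsed_hours = (last_section - run_started).total_seconds() / 3600
estimated_cost_usd = elapsed_hours * hourly_rate_usd

print(f"elapsed: {elapsed_hours:.2f} h, estimated cost: ${estimated_cost_usd:.2f}")
```

The RL stage alone cannot be timed this way, because the report records a single timestamp for it rather than a start and an end.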