pankajmathur committed on
Commit 52e26ac · verified · 1 Parent(s): 7e2bd7f

Upload 4 files

report/chat-evaluation-rl.md ADDED
@@ -0,0 +1,23 @@
+ ## Chat evaluation rl
+ timestamp: 2025-12-08 20:36:20
+
+ - source: rl
+ - task_name: None
+ - dtype: bfloat16
+ - temperature: 0.0000
+ - max_new_tokens: 512
+ - num_samples: 1
+ - top_k: 50
+ - batch_size: 8
+ - model_tag: None
+ - step: None
+ - max_problems: None
+ - device_type:
+ - ARC-Easy: 0.7130
+ - ARC-Challenge: 0.5375
+ - MMLU: 0.4256
+ - GSM8K: 0.2305
+ - HumanEval: 0.0671
+ - SpellingBee: 0.9922
+ - ChatCORE metric: 0.4208
+
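
The ChatCORE value in this file is consistent with a mean of baseline-centered accuracies over the six tasks. The sketch below is a hedged reconstruction, not code from the commit: the random-guess baselines (0.25 for the multiple-choice tasks ARC-Easy, ARC-Challenge and MMLU, 0.0 for the generative tasks) and the helper name `chatcore` are assumptions chosen because they reproduce the reported number.

```python
# Accuracies copied from report/chat-evaluation-rl.md.
accuracies = {
    "ARC-Easy": 0.7130,
    "ARC-Challenge": 0.5375,
    "MMLU": 0.4256,
    "GSM8K": 0.2305,
    "HumanEval": 0.0671,
    "SpellingBee": 0.9922,
}

# ASSUMPTION: 4-way multiple choice has a 0.25 random-guess baseline;
# open-ended generative tasks have a baseline of 0.0.
baselines = {
    "ARC-Easy": 0.25,
    "ARC-Challenge": 0.25,
    "MMLU": 0.25,
    "GSM8K": 0.0,
    "HumanEval": 0.0,
    "SpellingBee": 0.0,
}


def chatcore(acc, base):
    """Mean of accuracies rescaled so that baseline -> 0 and perfect -> 1."""
    centered = [(acc[task] - base[task]) / (1.0 - base[task]) for task in acc]
    return sum(centered) / len(centered)


score = chatcore(accuracies, baselines)
print(f"{score:.4f}")  # 0.4208, matching the reported ChatCORE metric
```

Under these assumptions SpellingBee contributes 0.9922 of the 2.5246 total before averaging, so the aggregate is dominated by that task more than by GSM8K or HumanEval.
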
report/chat-rl.md ADDED
@@ -0,0 +1,22 @@
+ ## Chat RL
+ timestamp: 2025-12-08 20:02:09
+
+ - run: d34_rl
+ - source: sft
+ - dtype: bfloat16
+ - device_batch_size: 4
+ - examples_per_step: 16
+ - num_samples: 16
+ - max_new_tokens: 256
+ - temperature: 1.0000
+ - top_k: 50
+ - unembedding_lr: 0.0040
+ - embedding_lr: 0.2000
+ - matrix_lr: 0.0200
+ - weight_decay: 0.0000
+ - init_lr_frac: 0.0500
+ - num_epochs: 1
+ - save_every: 60
+ - eval_every: 60
+ - eval_examples: 400
+
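
To make the RL hyperparameters in this file easier to read, the sketch below derives a few quantities from them. It is illustrative only: the interpretation that `num_samples` completions per prompt are generated in chunks of `device_batch_size`, and that `init_lr_frac` multiplies each base learning rate at the start of training, are assumptions about the training script rather than facts stated in the report.

```python
# Values copied from report/chat-rl.md.
config = {
    "device_batch_size": 4,
    "examples_per_step": 16,
    "num_samples": 16,
    "init_lr_frac": 0.05,
    "unembedding_lr": 0.004,
    "embedding_lr": 0.2,
    "matrix_lr": 0.02,
}

# Total sampled completions contributing to one optimizer step.
rollouts_per_step = config["examples_per_step"] * config["num_samples"]

# ASSUMPTION: the samples for one prompt are generated in chunks of
# device_batch_size, so this is the number of generation passes per prompt.
passes_per_example = config["num_samples"] // config["device_batch_size"]

# ASSUMPTION: init_lr_frac scales every base learning rate at step 0.
initial_lrs = {
    name: config[name] * config["init_lr_frac"]
    for name in ("unembedding_lr", "embedding_lr", "matrix_lr")
}

print(rollouts_per_step)   # 256
print(passes_per_example)  # 4
for name, lr in initial_lrs.items():
    print(f"{name}: {lr:.4f}")
```
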
report/header.md ADDED
@@ -0,0 +1,36 @@
+ # nanochat training report
+
+ Generated: 2025-12-08 13:02:11
+
+ ## Environment
+
+ ### Git Information
+ - Branch: pankaj_dev
+ - Commit: 3289b19 (dirty)
+ - Message: Adjust device_batch_size in run_d34_finetune.sh from 4 to 6 for mid-training and
+
+ ### Hardware
+ - Platform: Linux
+ - CPUs: 240 cores (240 logical)
+ - Memory: 1771.7 GB
+ - GPUs: 8x NVIDIA A100-SXM4-80GB
+ - GPU Memory: 634.0 GB total
+ - CUDA Version: 12.8
+ - Hourly Rate: $14.32/hour
+
+ ### Software
+ - Python: 3.10.12
+ - PyTorch: 2.8.0+cu128
+
+
+ ### Bloat
+ - Characters: 446,068
+ - Lines: 10,895
+ - Files: 53
+ - Tokens (approx): 111,517
+ - Dependencies (uv.lock lines): 2,218
+
+ Run started: 2025-12-08 13:02:14
+
+ ---
+
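
The "Tokens (approx)" figure in the Bloat section matches a simple characters-divided-by-four estimate, and the total GPU memory can be broken down per device. The sketch below only checks that arithmetic; the 4-characters-per-token heuristic is an assumption that happens to reproduce the reported value, not something the report states.

```python
# Values copied from report/header.md.
characters = 446_068
reported_tokens = 111_517
gpu_count = 8
gpu_memory_total_gb = 634.0

# ASSUMPTION: the approximate token count uses ~4 characters per token.
CHARS_PER_TOKEN = 4
estimated_tokens = characters // CHARS_PER_TOKEN

# Usable memory per GPU implied by the reported total.
memory_per_gpu_gb = gpu_memory_total_gb / gpu_count

print(estimated_tokens)                     # 111517
print(estimated_tokens == reported_tokens)  # True
print(memory_per_gpu_gb)                    # 79.25
```
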
report/report.md ADDED
@@ -0,0 +1,96 @@
+ # nanochat training report
+
+ Generated: 2025-12-08 13:02:11
+
+ ## Environment
+
+ ### Git Information
+ - Branch: pankaj_dev
+ - Commit: 3289b19 (dirty)
+ - Message: Adjust device_batch_size in run_d34_finetune.sh from 4 to 6 for mid-training and
+
+ ### Hardware
+ - Platform: Linux
+ - CPUs: 240 cores (240 logical)
+ - Memory: 1771.7 GB
+ - GPUs: 8x NVIDIA A100-SXM4-80GB
+ - GPU Memory: 634.0 GB total
+ - CUDA Version: 12.8
+ - Hourly Rate: $14.32/hour
+
+ ### Software
+ - Python: 3.10.12
+ - PyTorch: 2.8.0+cu128
+
+
+ ### Bloat
+ - Characters: 446,068
+ - Lines: 10,895
+ - Files: 53
+ - Tokens (approx): 111,517
+ - Dependencies (uv.lock lines): 2,218
+
+ Run started: 2025-12-08 13:02:14
+
+ ---
+
+ ## Chat RL
+ timestamp: 2025-12-08 20:02:09
+
+ - run: d34_rl
+ - source: sft
+ - dtype: bfloat16
+ - device_batch_size: 4
+ - examples_per_step: 16
+ - num_samples: 16
+ - max_new_tokens: 256
+ - temperature: 1.0000
+ - top_k: 50
+ - unembedding_lr: 0.0040
+ - embedding_lr: 0.2000
+ - matrix_lr: 0.0200
+ - weight_decay: 0.0000
+ - init_lr_frac: 0.0500
+ - num_epochs: 1
+ - save_every: 60
+ - eval_every: 60
+ - eval_examples: 400
+
+
+ ## Chat evaluation rl
+ timestamp: 2025-12-08 20:36:20
+
+ - source: rl
+ - task_name: None
+ - dtype: bfloat16
+ - temperature: 0.0000
+ - max_new_tokens: 512
+ - num_samples: 1
+ - top_k: 50
+ - batch_size: 8
+ - model_tag: None
+ - step: None
+ - max_problems: None
+ - device_type:
+ - ARC-Easy: 0.7130
+ - ARC-Challenge: 0.5375
+ - MMLU: 0.4256
+ - GSM8K: 0.2305
+ - HumanEval: 0.0671
+ - SpellingBee: 0.9922
+ - ChatCORE metric: 0.4208
+
+
+ ## Summary
+
+ - Characters: 446,068
+ - Lines: 10,895
+ - Files: 53
+ - Tokens (approx): 111,517
+ - Dependencies (uv.lock lines): 2,218
+
+ | Metric          | BASE     | MID      | SFT      | RL       |
+ |-----------------|----------|----------|----------|----------|
+ | GSM8K           | -        | -        | -        | 0.2305   |
+
+ Total wall clock time: unknown
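
The report leaves the total wall clock time as unknown, but the timestamps it does log allow a rough estimate for the span it covers. The sketch below is a back-of-the-envelope calculation under stated assumptions: it treats the interval from "Run started" to the last logged timestamp (the chat evaluation) as one continuous billed period at the listed hourly rate. It says nothing about earlier stages such as base training or SFT, which are not timed in this report.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Timestamps and rate copied from report/report.md.
run_started = datetime.strptime("2025-12-08 13:02:14", FMT)
last_logged = datetime.strptime("2025-12-08 20:36:20", FMT)  # chat evaluation
hourly_rate_usd = 14.32

# ASSUMPTION: the node ran (and was billed) continuously between the two
# timestamps; the report itself does not confirm this.
elapsed = last_logged - run_started
elapsed_hours = elapsed.total_seconds() / 3600.0
estimated_cost_usd = elapsed_hours * hourly_rate_usd

print(elapsed)                      # 7:34:06
print(f"{estimated_cost_usd:.2f}")  # 108.38
```
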