dchen0 committed · Commit 8a7d52f · verified · 1 Parent(s): 4c8ee21

Upload logs/resnet_20260331_032012.log with huggingface_hub

Files changed (1)
  1. logs/resnet_20260331_032012.log +1843 -0
logs/resnet_20260331_032012.log ADDED
@@ -0,0 +1,1843 @@
  0%| | 0.00/97.8M [00:00<?, ?B/s]
  9%|▊ | 8.50M/97.8M [00:00<00:01, 89.1MB/s]
  19%|█▉ | 18.4M/97.8M [00:00<00:00, 97.1MB/s]
  29%|██▉ | 28.8M/97.8M [00:00<00:00, 102MB/s]
  40%|███▉ | 38.8M/97.8M [00:00<00:00, 102MB/s]
  51%|█████▏ | 50.2M/97.8M [00:00<00:00, 104MB/s]
  63%|██████▎ | 61.8M/97.8M [00:00<00:00, 107MB/s]
  74%|███████▍ | 72.2M/97.8M [00:00<00:00, 105MB/s]
  86%|████████▌ | 83.9M/97.8M [00:00<00:00, 104MB/s]
  98%|█████████▊| 95.4M/97.8M [00:00<00:00, 109MB/s]
  0%| | 0/1 [00:00<?, ?it/s]
  ...50/checkpoint-1/scaler.pt: 100%|██████████| 988B / 988B
  ...heckpoint-1/rng_state.pth: 100%|██████████| 14.2kB / 14.2kB
  ...point-1/model.safetensors: 100%|██████████| 94.3MB / 94.3MB
  ...t_model/model.safetensors: 100%|██████████| 94.3MB / 94.3MB
  ...checkpoint-1/scheduler.pt: 100%|██████████| 1.06kB / 1.06kB
  ...checkpoint-1/optimizer.pt: 100%|██████████| 1.58kB / 1.58kB
  ...point-1/training_args.bin: 100%|██████████| 4.86kB / 4.86kB
  ...t_model/training_args.bin: 100%|██████████| 4.86kB / 4.86kB
  ...927196.dadb3a5bb633.549.0: 100%|██████████| 4.63kB / 4.63kB
+ ==> Checking internet connectivity...
+ ==> Internet + pip OK
+ ==> Installing dependencies
+ ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
+ torchaudio 2.5.1+cu124 requires torch==2.5.1, but you have torch 2.6.0+cu124 which is incompatible.
+ WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable.It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv. Use the --root-user-action option if you know what you are doing and want to suppress this warning.
+ WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable.It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv. Use the --root-user-action option if you know what you are doing and want to suppress this warning.
+ ==> CUDA OK (torch 2.6.0+cu124, CUDA 12.4, NVIDIA GeForce RTX 3090)
+ ==> System info:
+ GPU: NVIDIA GeForce RTX 3090, 24576 MiB, 565.57.01
+ RAM: 125Gi
+ CPU: 32 cores
+ Disk: 50G total, 46G free
+ ==> Cloning font-model repo
+ Cloning into 'font-model'...
+ ==> Downloading dataset from HuggingFace: dchen0/font_crops_test
+
+ ==> Extracting data/train.tar...
+ ==> Extracting data/test.tar...
+ ==> Dataset ready: 3 train variants, 3 test variants
+ overlay 50G 4.1G 46G 9% /
+
+ ============================================
+ Training: resnet50 (GPUs: 1)
+ ============================================
+ 2026-03-31 03:19:53 - INFO - Loading dataset from data - train_model.py:163
+ 2026-03-31 03:19:53 - INFO - Found 3 labels - train_model.py:167
+ 2026-03-31 03:19:53 - INFO - Setting up image processor and augmentations - train_model.py:177
+ 2026-03-31 03:19:53 - INFO - HTTP Request: HEAD https://huggingface.co/facebook/dinov2-base-imagenet1k-1-layer/resolve/main/processor_config.json "HTTP/1.1 404 Not Found" - _client.py:1025
+ 2026-03-31 03:19:53 - INFO - HTTP Request: HEAD https://huggingface.co/facebook/dinov2-base-imagenet1k-1-layer/resolve/main/preprocessor_config.json "HTTP/1.1 307 Temporary Redirect" - _client.py:1025
+ 2026-03-31 03:19:53 - INFO - HTTP Request: HEAD https://huggingface.co/api/resolve-cache/models/facebook/dinov2-base-imagenet1k-1-layer/f9305d2c8048bd65783f64fabfa25429d13cbdbb/preprocessor_config.json "HTTP/1.1 200 OK" - _client.py:1025
+ 2026-03-31 03:19:53 - INFO - HTTP Request: GET https://huggingface.co/api/resolve-cache/models/facebook/dinov2-base-imagenet1k-1-layer/f9305d2c8048bd65783f64fabfa25429d13cbdbb/preprocessor_config.json "HTTP/1.1 200 OK" - _client.py:1025
+ 2026-03-31 03:19:53 - INFO - HTTP Request: HEAD https://huggingface.co/facebook/dinov2-base-imagenet1k-1-layer/resolve/main/processor_config.json "HTTP/1.1 404 Not Found" - _client.py:1025
+ 2026-03-31 03:19:53 - INFO - HTTP Request: HEAD https://huggingface.co/facebook/dinov2-base-imagenet1k-1-layer/resolve/main/preprocessor_config.json "HTTP/1.1 307 Temporary Redirect" - _client.py:1025
+ 2026-03-31 03:19:53 - INFO - HTTP Request: HEAD https://huggingface.co/api/resolve-cache/models/facebook/dinov2-base-imagenet1k-1-layer/f9305d2c8048bd65783f64fabfa25429d13cbdbb/preprocessor_config.json "HTTP/1.1 200 OK" - _client.py:1025
+ 2026-03-31 03:19:54 - INFO - HTTP Request: HEAD https://s3.amazonaws.com/datasets.huggingface.co/datasets/datasets/imagefolder/imagefolder.py "HTTP/1.1 404 Not Found" - _client.py:1025
+
+
+
+ 2026-03-31 03:19:54 - INFO - Train size: 30, Validation size: 9 - train_model.py:187
+ 2026-03-31 03:19:54 - INFO - Applying data transformations - train_model.py:191
+
+
+ 2026-03-31 03:19:54 - INFO - Data preprocessing complete - train_model.py:207
+ 2026-03-31 03:19:54 - INFO - Loading ResNet-50 (ImageNet-pretrained) as CNN baseline - train_model.py:211
+ Downloading: "https://download.pytorch.org/models/resnet50-11ad3fa6.pth" to /root/.cache/torch/hub/checkpoints/resnet50-11ad3fa6.pth
+
0%| | 0.00/97.8M [00:00<?, ?B/s]
9%|▊ | 8.50M/97.8M [00:00<00:01, 89.1MB/s]
19%|█▉ | 18.4M/97.8M [00:00<00:00, 97.1MB/s]
29%|██▉ | 28.8M/97.8M [00:00<00:00, 102MB/s]
40%|███▉ | 38.8M/97.8M [00:00<00:00, 102MB/s]
51%|█████▏ | 50.2M/97.8M [00:00<00:00, 104MB/s]
63%|██████▎ | 61.8M/97.8M [00:00<00:00, 107MB/s]
74%|███████▍ | 72.2M/97.8M [00:00<00:00, 105MB/s]
86%|████████▌ | 83.9M/97.8M [00:00<00:00, 104MB/s]
98%|█████████▊| 95.4M/97.8M [00:00<00:00, 109MB/s]
+ 2026-03-31 03:19:56 - INFO - trainable params: 23,514,179 || all params: 23,514,179 || trainable%: 100.0000 - train_model.py:237
+ 2026-03-31 03:19:56 - INFO - Setting up training arguments - train_model.py:295
+ 2026-03-31 03:19:56 - INFO - Using device: cuda - train_model.py:298
+ `logging_dir` is deprecated and will be removed in v5.2. Please set `TENSORBOARD_LOGGING_DIR` instead.
+ 2026-03-31 03:19:56 - INFO - Starting training - train_model.py:337
+
0%| | 0/1 [00:00<?, ?it/s]
+
0%| | 0/1 [00:00<?, ?it/s]
+ 2026-03-31 03:19:59 - INFO - Training complete - train_model.py:343
+ 2026-03-31 03:19:59 - INFO - Saving result model to the output directory - train_model.py:350
+ {'eval_loss': '1.079', 'eval_accuracy': '0.3333', 'eval_runtime': '0.5487', 'eval_samples_per_second': '16.4', 'eval_steps_per_second': '1.823', 'epoch': '1'}
+ {'train_runtime': '2.249', 'train_samples_per_second': '13.34', 'train_steps_per_second': '0.445', 'train_loss': '1.062', 'epoch': '1'}
+ ==> Finished: resnet50
+
+ ============================================
+ ALL TRAINING COMPLETE
+ Results in: /workspace/output/
+ ============================================
+ ==> Uploading results to HuggingFace: dchen0/font-model-dry-run
85
+
86
+
87
+
88
+
89
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
90
+
91
+
92
+
93
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
94
+
95
+
96
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
97
+
98
+
99
+
100
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
101
+
102
+
103
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
104
+
105
+
106
+
107
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
108
+
109
+
110
+
111
+
112
  ...point-1/model.safetensors: 0%| | 92.1kB / 94.3MB 
113
+
114
+
115
+
116
+
117
+
118
  ...t_model/model.safetensors: 1%| | 556kB / 94.3MB 
119
+
120
+
121
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
122
+
123
+
124
+
125
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
126
+
127
+
128
+
129
+
130
  ...point-1/model.safetensors: 0%| | 92.1kB / 94.3MB 
131
+
132
+
133
+
134
+
135
+
136
  ...t_model/model.safetensors: 1%| | 556kB / 94.3MB 
137
+
138
+
139
+
140
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
141
+
142
+
143
+
144
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
145
+
146
+
147
+
148
+
149
  ...point-1/model.safetensors: 0%| | 92.1kB / 94.3MB 
150
+
151
+
152
+
153
+
154
+
155
  ...t_model/model.safetensors: 1%| | 556kB / 94.3MB 
156
+
157
+
158
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
159
+
160
+
161
+
162
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
163
+
164
+
165
+
166
+
167
  ...point-1/model.safetensors: 0%| | 184kB / 94.3MB 
168
+
169
+
170
+
171
+
172
+
173
  ...t_model/model.safetensors: 8%|β–Š | 7.36MB / 94.3MB 
174
+
175
+
176
+
177
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
178
+
179
+
180
+
181
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
182
+
183
+
184
+
185
+
186
  ...point-1/model.safetensors: 1%| | 552kB / 94.3MB 
187
+
188
+
189
+
190
+
191
+
192
  ...t_model/model.safetensors: 10%|β–ˆ | 9.58MB / 94.3MB 
193
+
194
+
195
+
196
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
197
+
198
+
199
+
200
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
201
+
202
+
203
+
204
+
205
  ...point-1/model.safetensors: 1%| | 829kB / 94.3MB 
206
+
207
+
208
+
209
+
210
+
211
  ...t_model/model.safetensors: 12%|β–ˆβ– | 11.2MB / 94.3MB 
212
+
213
+
214
+
215
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
216
+
217
+
218
+
219
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
220
+
221
+
222
+
223
+
224
  ...point-1/model.safetensors: 1%|▏ | 1.38MB / 94.3MB 
225
+
226
+
227
+
228
+
229
+
230
  ...t_model/model.safetensors: 15%|β–ˆβ–Œ | 14.6MB / 94.3MB 
231
+
232
+
233
+
234
+
235
+
236
+
237
+
238
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
239
+
240
+
241
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
242
+
243
+
244
+
245
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
246
+
247
+
248
+
249
+
250
  ...point-1/model.safetensors: 3%|β–Ž | 2.39MB / 94.3MB 
251
+
252
+
253
+
254
+
255
+
256
  ...t_model/model.safetensors: 22%|β–ˆβ–ˆβ– | 20.7MB / 94.3MB 
257
+
258
+
259
+
260
+
261
+
262
+
263
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
264
+
265
+
266
+
267
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
268
+
269
+
270
+
271
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
272
+
273
+
274
+
275
+
276
  ...point-1/model.safetensors: 3%|β–Ž | 3.04MB / 94.3MB 
277
+
278
+
279
+
280
+
281
+
282
  ...t_model/model.safetensors: 26%|β–ˆβ–ˆβ–Œ | 24.6MB / 94.3MB 
283
+
284
+
285
+
286
+
287
+
288
+
289
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
290
+
291
+
292
+
293
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
294
+
295
+
296
+
297
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
298
+
299
+
300
+
301
+
302
  ...point-1/model.safetensors: 4%|▍ | 4.23MB / 94.3MB 
303
+
304
+
305
+
306
+
307
+
308
  ...t_model/model.safetensors: 34%|β–ˆβ–ˆβ–ˆβ–Ž | 31.8MB / 94.3MB 
309
+
310
+
311
+
312
+
313
+
314
+
315
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
316
+
317
+
318
+
319
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
320
+
321
+
322
+
323
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
324
+
325
+
326
+
327
+
328
  ...point-1/model.safetensors: 5%|β–Œ | 4.79MB / 94.3MB 
329
+
330
+
331
+
332
+
333
+
334
  ...t_model/model.safetensors: 37%|β–ˆβ–ˆβ–ˆβ–‹ | 35.2MB / 94.3MB 
335
+
336
+
337
+
338
+
339
+
340
+
341
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
342
+
343
+
344
+
345
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
346
+
347
+
348
+
349
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
350
+
351
+
352
+
353
+
354
  ...point-1/model.safetensors: 6%|β–‹ | 5.98MB / 94.3MB 
355
+
356
+
357
+
358
+
359
+
360
  ...t_model/model.safetensors: 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 42.4MB / 94.3MB 
361
+
362
+
363
+
364
+
365
+
366
+
367
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
368
+
369
+
370
+
371
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
372
+
373
+
374
+
375
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
376
+
377
+
378
+
379
+
380
  ...point-1/model.safetensors: 7%|β–‹ | 6.63MB / 94.3MB 
381
+
382
+
383
+
384
+
385
+
386
  ...t_model/model.safetensors: 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 46.3MB / 94.3MB 
387
+
388
+
389
+
390
+
391
+
392
+
393
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
394
+
395
+
396
+
397
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
398
+
399
+
400
+
401
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
402
+
403
+
404
+
405
+
406
  ...point-1/model.safetensors: 8%|β–Š | 7.92MB / 94.3MB 
407
+
408
+
409
+
410
+
411
+
412
  ...t_model/model.safetensors: 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 54.1MB / 94.3MB 
413
+
414
+
415
+
416
+
417
+
418
+
419
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
420
+
421
+
422
+
423
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
424
+
425
+
426
+
427
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
428
+
429
+
430
+
431
+
432
  ...point-1/model.safetensors: 9%|β–‰ | 8.47MB / 94.3MB 
433
+
434
+
435
+
436
+
437
+
438
  ...t_model/model.safetensors: 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 57.4MB / 94.3MB 
439
+
440
+
441
+
442
+
443
+
444
+
445
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
446
+
447
+
448
+
449
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
450
+
451
+
452
+
453
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
454
+
455
+
456
+
457
+
458
  ...point-1/model.safetensors: 10%|β–ˆ | 9.57MB / 94.3MB 
459
+
460
+
461
+
462
+
463
+
464
  ...t_model/model.safetensors: 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 64.1MB / 94.3MB 
465
+
466
+
467
+
468
+
469
+
470
+
471
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
472
+
473
+
474
+
475
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
476
+
477
+
478
+
479
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
480
+
481
+
482
+
483
+
484
  ...point-1/model.safetensors: 11%|β–ˆ | 10.2MB / 94.3MB 
485
+
486
+
487
+
488
+
489
+
490
  ...t_model/model.safetensors: 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 68.0MB / 94.3MB 
491
+
492
+
493
+
494
+
495
+
496
+
497
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
498
+
499
+
500
+
501
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
502
+
503
+
504
+
505
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
506
+
507
+
508
+
509
+
510
  ...point-1/model.safetensors: 12%|β–ˆβ– | 11.0MB / 94.3MB 
511
+
512
+
513
+
514
+
515
+
516
  ...t_model/model.safetensors: 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 73.0MB / 94.3MB 
517
+
518
+
519
+
520
+
521
+
522
+
523
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
524
+
525
+
526
+
527
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
528
+
529
+
530
+
531
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆοΏ½οΏ½οΏ½β–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
532
+
533
+
534
+
535
+
536
  ...point-1/model.safetensors: 12%|β–ˆβ– | 11.0MB / 94.3MB 
537
+
538
+
539
+
540
+
541
+
542
  ...t_model/model.safetensors: 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 73.0MB / 94.3MB 
543
+
544
+
545
+
546
+
547
+
548
+
549
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
550
+
551
+
552
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
553
+
554
+
555
+
556
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
557
+
558
+
559
+
560
+
561
  ...point-1/model.safetensors: 12%|β–ˆβ– | 11.0MB / 94.3MB 
562
+
563
+
564
+
565
+
566
+
567
  ...t_model/model.safetensors: 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 73.0MB / 94.3MB 
568
+
569
+
570
+
571
+
572
+
573
+
574
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
575
+
576
+
577
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
578
+
579
+
580
+
581
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
582
+
583
+
584
+
585
+
586
  ...point-1/model.safetensors: 12%|β–ˆβ– | 11.0MB / 94.3MB 
587
+
588
+
589
+
590
+
591
+
592
  ...t_model/model.safetensors: 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 73.0MB / 94.3MB 
593
+
594
+
595
+
596
+
597
+
598
+
599
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
600
+
601
+
602
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
603
+
604
+
605
+
606
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
607
+
608
+
609
+
610
+
611
  ...point-1/model.safetensors: 12%|β–ˆβ– | 11.0MB / 94.3MB 
612
+
613
+
614
+
615
+
616
+
617
  ...t_model/model.safetensors: 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 73.0MB / 94.3MB 
618
+
619
+
620
+
621
+
622
+
623
+
624
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
625
+
626
+
627
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
628
+
629
+
630
+
631
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
632
+
633
+
634
+
635
+
636
  ...point-1/model.safetensors: 27%|β–ˆβ–ˆβ–‹ | 25.3MB / 94.3MB 
637
+
638
+
639
+
640
+
641
+
642
  ...t_model/model.safetensors: 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 73.3MB / 94.3MB 
643
+
644
+
645
+
646
+
647
+
648
+
649
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
650
+
651
+
652
+
653
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
654
+
655
+
656
+
657
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
658
+
659
+
660
+
661
+
662
  ...point-1/model.safetensors: 31%|β–ˆβ–ˆβ–ˆ | 29.2MB / 94.3MB 
663
+
664
+
665
+
666
+
667
+
668
  ...t_model/model.safetensors: 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 73.7MB / 94.3MB 
669
+
670
+
671
+
672
+
673
+
674
+
675
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
676
+
677
+
678
+
679
+
680
+
681
+
682
+
683
+
684
  ...checkpoint-1/optimizer.pt: 3%|β–Ž | 53.0B / 1.58kB 
685
+
686
+
687
+
688
+
689
+
690
+
691
+
692
+
693
  ...point-1/training_args.bin: 3%|β–Ž | 164B / 4.86kB 
694
+
695
+
696
+
697
+
698
+
699
+
700
+
701
+
702
+
703
  ...t_model/training_args.bin: 3%|β–Ž | 164B / 4.86kB 
704
+
705
+
706
+
707
+
708
+
709
+
710
+
711
+
712
+
713
+
714
  ...927196.dadb3a5bb633.549.0: 3%|β–Ž | 156B / 4.63kB 
715
+
716
+
717
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
718
+
719
+
720
+
721
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
722
+
723
+
724
+
725
+
726
  ...point-1/model.safetensors: 35%|β–ˆβ–ˆβ–ˆβ– | 32.6MB / 94.3MB 
727
+
728
+
729
+
730
+
731
+
732
  ...t_model/model.safetensors: 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 74.5MB / 94.3MB 
733
+
734
+
735
+
736
+
737
+
738
+
739
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
740
+
741
+
742
+
743
+
744
+
745
+
746
+
747
  ...checkpoint-1/optimizer.pt: 3%|β–Ž | 53.0B / 1.58kB 
748
+
749
+
750
+
751
+
752
+
753
+
754
+
755
+
756
  ...point-1/training_args.bin: 3%|β–Ž | 164B / 4.86kB 
757
+
758
+
759
+
760
+
761
+
762
+
763
+
764
+
765
+
766
  ...t_model/training_args.bin: 3%|β–Ž | 164B / 4.86kB 
767
+
768
+
769
+
770
+
771
+
772
+
773
+
774
+
775
+
776
+
777
  ...927196.dadb3a5bb633.549.0: 3%|β–Ž | 156B / 4.63kB 
778
+
779
+
780
+
781
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
782
+
783
+
784
+
785
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
786
+
787
+
788
+
789
+
790
  ...point-1/model.safetensors: 39%|β–ˆβ–ˆβ–ˆβ–Š | 36.5MB / 94.3MB 
791
+
792
+
793
+
794
+
795
+
796
  ...t_model/model.safetensors: 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 74.9MB / 94.3MB 
797
+
798
+
799
+
800
+
801
+
802
+
803
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
804
+
805
+
806
+
807
+
808
+
809
+
810
+
811
  ...checkpoint-1/optimizer.pt: 3%|β–Ž | 53.0B / 1.58kB 
812
+
813
+
814
+
815
+
816
+
817
+
818
+
819
+
820
  ...point-1/training_args.bin: 3%|β–Ž | 164B / 4.86kB 
821
+
822
+
823
+
824
+
825
+
826
+
827
+
828
+
829
+
830
  ...t_model/training_args.bin: 3%|β–Ž | 164B / 4.86kB 
831
+
832
+
833
+
834
+
835
+
836
+
837
+
838
+
839
+
840
+
841
  ...927196.dadb3a5bb633.549.0: 3%|β–Ž | 156B / 4.63kB 
842
+
843
+
844
+
845
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
846
+
847
+
848
+
849
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
850
+
851
+
852
+
853
+
854
  ...point-1/model.safetensors: 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 43.8MB / 94.3MB 
855
+
856
+
857
+
858
+
859
+
860
  ...t_model/model.safetensors: 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 76.5MB / 94.3MB 
861
+
862
+
863
+
864
+
865
+
866
+
867
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
868
+
869
+
870
+
871
+
872
+
873
+
874
+
875
  ...checkpoint-1/optimizer.pt: 10%|β–ˆ | 159B / 1.58kB 
876
+
877
+
878
+
879
+
880
+
881
+
882
+
883
+
884
  ...point-1/training_args.bin: 10%|β–ˆ | 492B / 4.86kB 
885
+
886
+
887
+
888
+
889
+
890
+
891
+
892
+
893
+
894
  ...t_model/training_args.bin: 10%|β–ˆ | 492B / 4.86kB 
895
+
896
+
897
+
898
+
899
+
900
+
901
+
902
+
903
+
904
+
905
  ...927196.dadb3a5bb633.549.0: 10%|β–ˆ | 469B / 4.63kB 
906
+
907
+
908
+
909
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
910
+
911
+
912
+
913
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
914
+
915
+
916
+
917
+
918
  ...point-1/model.safetensors: 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 48.1MB / 94.3MB 
919
+
920
+
921
+
922
+
923
+
924
  ...t_model/model.safetensors: 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 79.9MB / 94.3MB 
925
+
926
+
927
+
928
+
929
+
930
+
931
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
932
+
933
+
934
+
935
+
936
+
937
+
938
+
939
  ...checkpoint-1/optimizer.pt: 30%|β–ˆβ–ˆβ–ˆ | 479B / 1.58kB 
940
+
941
+
942
+
943
+
944
+
945
+
946
+
947
+
948
  ...point-1/training_args.bin: 30%|β–ˆβ–ˆβ–ˆ | 1.48kB / 4.86kB 
949
+
950
+
951
+
952
+
953
+
954
+
955
+
956
+
957
+
958
  ...t_model/training_args.bin: 30%|β–ˆβ–ˆβ–ˆ | 1.48kB / 4.86kB 
959
+
960
+
961
+
962
+
963
+
964
+
965
+
966
+
967
+
968
+
969
  ...927196.dadb3a5bb633.549.0: 30%|β–ˆβ–ˆβ–ˆ | 1.41kB / 4.63kB 
970
+
971
+
972
+
973
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
974
+
975
+
976
+
977
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
978
+
979
+
980
+
981
+
982
  ...point-1/model.safetensors: 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 55.5MB / 94.3MB 
983
+
984
+
985
+
986
+
987
+
988
  ...t_model/model.safetensors: 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 82.1MB / 94.3MB 
989
+
990
+
991
+
992
+
993
+
994
+
995
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
996
+
997
+
998
+
999
+
1000
+
1001
+
1002
+
1003
  ...checkpoint-1/optimizer.pt: 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 639B / 1.58kB 
1004
+
1005
+
1006
+
1007
+
1008
+
1009
+
1010
+
1011
+
1012
  ...point-1/training_args.bin: 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 1.97kB / 4.86kB 
1013
+
1014
+
1015
+
1016
+
1017
+
1018
+
1019
+
1020
+
1021
+
1022
  ...t_model/training_args.bin: 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 1.97kB / 4.86kB 
1023
+
1024
+
1025
+
1026
+
1027
+
1028
+
1029
+
1030
+
1031
+
1032
+
1033
  ...927196.dadb3a5bb633.549.0: 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 1.88kB / 4.63kB 
1034
+
1035
+
1036
+
1037
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
1038
+
1039
+
1040
+
1041
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
1042
+
1043
+
1044
+
1045
+
1046
  ...point-1/model.safetensors: 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 59.9MB / 94.3MB 
1047
+
1048
+
1049
+
1050
+
1051
+
1052
  ...t_model/model.safetensors: 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 85.9MB / 94.3MB 
1053
+
1054
+
1055
+
1056
+
1057
+
1058
+
1059
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
1060
+
1061
+
1062
+
1063
+
1064
+
1065
+
1066
+
1067
  ...checkpoint-1/optimizer.pt: 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1.01kB / 1.58kB 
1068
+
1069
+
1070
+
1071
+
1072
+
1073
+
1074
+
1075
+
1076
  ...point-1/training_args.bin: 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 3.12kB / 4.86kB 
1077
+
1078
+
1079
+
1080
+
1081
+
1082
+
1083
+
1084
+
1085
+
1086
  ...t_model/training_args.bin: 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 3.12kB / 4.86kB 
1087
+
1088
+
1089
+
1090
+
1091
+
1092
+
1093
+
1094
+
1095
+
1096
+
1097
  ...927196.dadb3a5bb633.549.0: 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 2.97kB / 4.63kB 
1098
+
1099
+
1100
+
1101
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
1102
+
1103
+
1104
+
1105
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
1106
+
1107
+
1108
+
1109
+
1110
  ...point-1/model.safetensors: 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 67.5MB / 94.3MB 
1111
+
1112
+
1113
+
1114
+
1115
+
1116
  ...t_model/model.safetensors: 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 89.6MB / 94.3MB 
1117
+
1118
+
1119
+
1120
+
1121
+
1122
+
1123
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
1124
+
1125
+
1126
+
1127
+
1128
+
1129
+
1130
+
1131
  ...checkpoint-1/optimizer.pt: 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1.33kB / 1.58kB 
1132
+
1133
+
1134
+
1135
+
1136
+
1137
+
1138
+
1139
+
1140
  ...point-1/training_args.bin: 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 4.11kB / 4.86kB 
1141
+
1142
+
1143
+
1144
+
1145
+
1146
+
1147
+
1148
+
1149
+
1150
  ...t_model/training_args.bin: 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 4.11kB / 4.86kB 
1151
+
1152
+
1153
+
1154
+
1155
+
1156
+
1157
+
1158
+
1159
+
1160
+
1161
  ...927196.dadb3a5bb633.549.0: 85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 3.91kB / 4.63kB 
1162
+
1163
+
1164
+
1165
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
1166
+
1167
+
1168
+
1169
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
1170
+
1171
+
1172
+
1173
+
1174
  ...point-1/model.safetensors: 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 71.7MB / 94.3MB 
1175
+
1176
+
1177
+
1178
+
1179
+
1180
  ...t_model/model.safetensors: 97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 91.9MB / 94.3MB 
1181
+
1182
+
1183
+
1184
+
1185
+
1186
+
1187
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
1188
+
1189
+
1190
+
1191
+
1192
+
1193
+
1194
+
1195
  ...checkpoint-1/optimizer.pt: 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 1.54kB / 1.58kB 
1196
+
1197
+
1198
+
1199
+
1200
+
1201
+
1202
+
1203
+
1204
  ...point-1/training_args.bin: 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 4.76kB / 4.86kB 
1205
+
1206
+
1207
+
1208
+
1209
+
1210
+
1211
+
1212
+
1213
+
1214
  ...t_model/training_args.bin: 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 4.76kB / 4.86kB 
1215
+
1216
+
1217
+
1218
+
1219
+
1220
+
1221
+
1222
+
1223
+
1224
+
1225
  ...927196.dadb3a5bb633.549.0: 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 4.54kB / 4.63kB 
1226
+
1227
+
1228
+
1229
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
1230
+
1231
+
1232
+
1233
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
1234
+
1235
+
1236
+
1237
+
1238
  ...point-1/model.safetensors: 84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 78.9MB / 94.3MB 
1239
+
1240
+
1241
+
1242
+
1243
+
1244
  ...t_model/model.safetensors: 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 92.6MB / 94.3MB 
1245
+
1246
+
1247
+
1248
+
1249
+
1250
+
1251
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
1252
+
1253
+
1254
+
1255
+
1256
+
1257
+
1258
+
1259
  ...checkpoint-1/optimizer.pt: 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 1.54kB / 1.58kB 
1260
+
1261
+
1262
+
1263
+
1264
+
1265
+
1266
+
1267
+
1268
  ...point-1/training_args.bin: 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 4.76kB / 4.86kB 
1269
+
1270
+
1271
+
1272
+
1273
+
1274
+
1275
+
1276
+
1277
+
1278
  ...t_model/training_args.bin: 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 4.76kB / 4.86kB 
1279
+
1280
+
1281
+
1282
+
1283
+
1284
+
1285
+
1286
+
1287
+
1288
+
1289
  ...927196.dadb3a5bb633.549.0: 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 4.54kB / 4.63kB 
1290
+
1291
+
1292
+
1293
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
1294
+
1295
+
1296
+
1297
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
1298
+
1299
+
1300
+
1301
+
1302
  ...point-1/model.safetensors: 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 82.8MB / 94.3MB 
1303
+
1304
+
1305
+
1306
+
1307
+
1308
  ...t_model/model.safetensors: 99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 93.0MB / 94.3MB 
1309
+
1310
+
1311
+
1312
+
1313
+
1314
+
1315
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
1316
+
1317
+
1318
+
1319
+
1320
+
1321
+
1322
+
1323
  ...checkpoint-1/optimizer.pt: 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 1.54kB / 1.58kB 
1324
+
1325
+
1326
+
1327
+
1328
+
1329
+
1330
+
1331
+
1332
  ...point-1/training_args.bin: 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 4.76kB / 4.86kB 
1333
+
1334
+
1335
+
1336
+
1337
+
1338
+
1339
+
1340
+
1341
+
1342
  ...t_model/training_args.bin: 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 4.76kB / 4.86kB 
1343
+
1344
+
1345
+
1346
+
1347
+
1348
+
1349
+
1350
+
1351
+
1352
+
1353
  ...927196.dadb3a5bb633.549.0: 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 4.54kB / 4.63kB 
1354
+
1355
+
1356
+
1357
  ...50/checkpoint-1/scaler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 988B / 988B 
1358
+
1359
+
1360
+
1361
  ...heckpoint-1/rng_state.pth: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.2kB / 14.2kB 
1362
+
1363
+
1364
+
1365
+
1366
  ...point-1/model.safetensors: 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 90.6MB / 94.3MB 
1367
+
1368
+
1369
+
1370
+
1371
+
1372
  ...t_model/model.safetensors: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 94.0MB / 94.3MB 
1373
+
1374
+
1375
+
1376
+
1377
+
1378
+
1379
  ...checkpoint-1/scheduler.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.06kB / 1.06kB 
1380
+
1381
+
1382
+
1383
+
1384
+
1385
+
1386
+
1387
  ...checkpoint-1/optimizer.pt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1.58kB / 1.58kB 
1388
+
1389
+
1390
+
1391
+
1392
+
1393
+
1394
+
1395
+
1396
  ...point-1/training_args.bin: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 4.86kB / 4.86kB 
1397
+
1398
+
1399
+
1400
+
1401
+
1402
+
1403
+
1404
+
1405
+
1406
  ...t_model/training_args.bin: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 4.86kB / 4.86kB 
1407
+
1408
+
1409
+
1410
+
1411
+
1412
+
1413
+
1414
+
1415
+
1416
+
1417
  ...927196.dadb3a5bb633.549.0: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 4.63kB / 4.63kB 
1418
+
1419
+
1420
+
1421
  ...50/checkpoint-1/scaler.pt: 100%|██████████| 988B / 988B
1422
+
1423
+
1424
+
1425
  ...heckpoint-1/rng_state.pth: 100%|██████████| 14.2kB / 14.2kB
1426
+
1427
+
1428
+
1429
+
1430
  ...point-1/model.safetensors: 100%|█████████▉| 93.9MB / 94.3MB
1431
+
1432
+
1433
+
1434
+
1435
+
1436
  ...t_model/model.safetensors: 100%|█████████▉| 94.3MB / 94.3MB
1437
+
1438
+
1439
+
1440
+
1441
+
1442
+
1443
  ...checkpoint-1/scheduler.pt: 100%|██████████| 1.06kB / 1.06kB
1444
+
1445
+
1446
+
1447
+
1448
+
1449
+
1450
+
1451
  ...checkpoint-1/optimizer.pt: 100%|██████████| 1.58kB / 1.58kB
1452
+
1453
+
1454
+
1455
+
1456
+
1457
+
1458
+
1459
+
1460
  ...point-1/training_args.bin: 100%|██████████| 4.86kB / 4.86kB
1461
+
1462
+
1463
+
1464
+
1465
+
1466
+
1467
+
1468
+
1469
+
1470
  ...t_model/training_args.bin: 100%|██████████| 4.86kB / 4.86kB
1471
+
1472
+
1473
+
1474
+
1475
+
1476
+
1477
+
1478
+
1479
+
1480
+
1481
  ...927196.dadb3a5bb633.549.0: 100%|██████████| 4.63kB / 4.63kB
1482
+
1483
+
1484
+
2179
  ...50/checkpoint-1/scaler.pt: 100%|██████████| 988B / 988B
2180
+
2181
  ...heckpoint-1/rng_state.pth: 100%|██████████| 14.2kB / 14.2kB
2182
+
2183
  ...point-1/model.safetensors: 100%|██████████| 94.3MB / 94.3MB
2184
+
2185
  ...t_model/model.safetensors: 100%|██████████| 94.3MB / 94.3MB
2186
+
2187
  ...checkpoint-1/scheduler.pt: 100%|██████████| 1.06kB / 1.06kB
2188
+
2189
  ...checkpoint-1/optimizer.pt: 100%|██████████| 1.58kB / 1.58kB
2190
+
2191
  ...point-1/training_args.bin: 100%|██████████| 4.86kB / 4.86kB
2192
+
2193
  ...t_model/training_args.bin: 100%|██████████| 4.86kB / 4.86kB
2194
+
2195
  ...927196.dadb3a5bb633.549.0: 100%|██████████| 4.63kB / 4.63kB
2196
+ Upload complete.
2197
+ ==> Uploading training log to HuggingFace...