assemsabry committed
Commit 7eaec99 · verified · 1 Parent(s): 10ef03d

Training details edit

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -145,7 +145,7 @@ The model was trained over approximately 26 hours using the STAM optimizer with
  **Training Throughput:**
 
  - Average step time: ~27 seconds
- - Peak GPU memory usage: ~14.8 GB per GPU
+ - Peak GPU memory usage: ~14.9 GB per GPU
  - Total tokens processed: ~898M (input + target)
 
  ### Dataset Details