alexaapo committed
Commit cdf101f · verified · 1 Parent(s): b5b7353

Update README.md

Files changed (1): README.md (+4 -1)
@@ -127,9 +127,12 @@ The key hyperparameters used were:
 
 ### Training Results
 
-The model achieved stable convergence with the following characteristics:
+The model achieved the following performance metrics:
 
+- **Final Training Loss**: 1.2823
+- **Final Evaluation Loss**: 1.1720
 - **Training Infrastructure**: 8x NVIDIA A100 40GB GPUs
+- **Training Duration**: 262:24:39 hours
 - **Total Training Steps**: 120,000
 - **Distributed Training**: NCCL backend with enhanced stability settings
 - **Memory Optimization**: BFloat16 precision with gradient accumulation
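
The figures added in this commit can be cross-checked with a little arithmetic. The sketch below assumes that the duration string `262:24:39` means hours:minutes:seconds of wall-clock time and that all 8 GPUs were busy for the whole run. Neither assumption is stated in the README, and the helper names (`parse_duration`, `seconds_per_step`, `gpu_hours`) are hypothetical, introduced only for illustration.

```python
def parse_duration(text: str) -> int:
    """Convert an 'H:MM:SS' string into a total number of seconds.

    Assumes the first field is hours (it may exceed 24).
    """
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def seconds_per_step(duration_s: int, steps: int) -> float:
    """Average wall-clock seconds spent per training step."""
    return duration_s / steps


def gpu_hours(duration_s: int, num_gpus: int) -> float:
    """Total GPU-hours, assuming every GPU ran for the full duration."""
    return num_gpus * duration_s / 3600


# Values taken from the README diff above.
total_seconds = parse_duration("262:24:39")
avg_step_time = seconds_per_step(total_seconds, 120_000)
total_gpu_hours = gpu_hours(total_seconds, 8)

print(f"total seconds:    {total_seconds}")
print(f"seconds per step: {avg_step_time:.2f}")
print(f"GPU-hours:        {total_gpu_hours:.0f}")
```

Under these assumptions the run averages a little under 8 seconds per step and about 2,100 GPU-hours in total. If the logged duration includes evaluation or checkpointing time, the per-step figure overstates pure training time.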