alexshah/armembed

@@ -125,14 +125,14 @@ print(scores.tolist())
 - **Warmup Steps**: 200
 - **Max Sequence Length**: 512 tokens
 - **FP16 Training**: Enabled
-- **Gradient Clipping**: Likely 1.0
 ### Optimization Configuration
 - **Framework**: DeepSpeed ZeRO Stage 3
 - **Optimizer**: AdamW with auto learning rate and weight decay
 - **Mixed Precision**: bfloat16 (`bf16`) enabled
 - **ZeRO Optimization**: Stage 3 with:
-  - No parameter offloading (`device: none`)
   - Overlap communication enabled
   - Contiguous gradients enabled
   - Auto-tuned reduce and prefetch bucket sizes
@@ -149,7 +149,7 @@ print(scores.tolist())
 ## Model Details
 - **Model Name**: ArmEmbed
-- **Model Type**: Text Embeddings for Armenian Language
 - **Version**: 1.0.0
 - **License**: Apache 2.0
 - **Last Updated**: May 2025

 - **Warmup Steps**: 200
 - **Max Sequence Length**: 512 tokens
 - **FP16 Training**: Enabled
+- **Gradient Clipping**: 1.0
 ### Optimization Configuration
 - **Framework**: DeepSpeed ZeRO Stage 3
 - **Optimizer**: AdamW with auto learning rate and weight decay
 - **Mixed Precision**: bfloat16 (`bf16`) enabled
 - **ZeRO Optimization**: Stage 3 with:
+  - No parameter offloading
   - Overlap communication enabled
   - Contiguous gradients enabled
   - Auto-tuned reduce and prefetch bucket sizes
 ## Model Details
 - **Model Name**: ArmEmbed
+- **Model Type**: Text Embeddings for the Armenian Language
 - **Version**: 1.0.0
 - **License**: Apache 2.0
 - **Last Updated**: May 2025
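
The Optimization Configuration bullets touched by this diff can be read as a DeepSpeed config. The following is a minimal sketch under stated assumptions, not the repository's actual file: it assumes the Hugging Face Trainer integration, where `"auto"` values are filled in from the training arguments at launch. It also assumes that "no parameter offloading" corresponds to `offload_param.device = "none"`, as the removed line suggested, and that the gradient clipping value of 1.0 is set in the same config. The variable name `ds_config` is hypothetical.

```python
import json

# Hypothetical reconstruction of the settings listed in the model card.
# Key names follow the DeepSpeed config schema; "auto" is only resolved
# when the config is passed through the Hugging Face Trainer integration.
ds_config = {
    "bf16": {"enabled": True},                 # Mixed Precision: bfloat16
    "optimizer": {
        "type": "AdamW",
        "params": {"lr": "auto", "weight_decay": "auto"},
    },
    "zero_optimization": {
        "stage": 3,                            # ZeRO Stage 3
        "offload_param": {"device": "none"},   # no parameter offloading
        "overlap_comm": True,                  # overlap communication
        "contiguous_gradients": True,          # contiguous gradients
        "reduce_bucket_size": "auto",          # auto-tuned bucket sizes
        "stage3_prefetch_bucket_size": "auto",
    },
    "gradient_clipping": 1.0,                  # value stated in the diff
}

# Serialize to the JSON form DeepSpeed expects on disk.
rendered = json.dumps(ds_config, indent=2)
print(rendered)
```

Note that the card lists both "FP16 Training: Enabled" and bfloat16 mixed precision; the sketch follows the Optimization Configuration section and enables only `bf16`.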