Update README.md
README.md
CHANGED
@@ -50,10 +50,10 @@ The model was fine-tuned using a private Bitext dataset designed for question an
 
 - **Optimizer**: AdamW with betas=(0.9, 0.999) and epsilon=1e-08
 - **Learning Rate**: 0.0002 with a cosine learning rate scheduler
-- **Epochs**:
-- **Batch Size**:
-- **Gradient Accumulation Steps**:
-- **Maximum Sequence Length**:
+- **Epochs**: 4
+- **Batch Size**: 10
+- **Gradient Accumulation Steps**: 8
+- **Maximum Sequence Length**: 8192 tokens
 
 ### Environment
 
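
For reference, the hyperparameters filled in by this change can be collected into a small configuration object. This is a minimal sketch, not the authors' training script: the `TrainingConfig` class, its field names, and the `cosine_lr` helper are hypothetical and exist only for illustration. It also assumes that the batch size of 10 is per device and that the cosine schedule decays to zero without warmup, neither of which the README states.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical container for the values listed in the README."""
    learning_rate: float = 2e-4          # "0.0002" in the README
    adam_betas: tuple = (0.9, 0.999)
    adam_epsilon: float = 1e-8
    epochs: int = 4
    batch_size: int = 10
    gradient_accumulation_steps: int = 8
    max_seq_length: int = 8192           # tokens

    def effective_batch_size(self, num_devices: int = 1) -> int:
        # Assumption: batch_size is per device; the README does not say.
        return self.batch_size * self.gradient_accumulation_steps * num_devices


def cosine_lr(step: int, total_steps: int, base_lr: float) -> float:
    # Assumption: plain cosine decay to zero with no warmup phase.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    cfg = TrainingConfig()
    print("sequences per optimizer step:", cfg.effective_batch_size())
    print("learning rate halfway through:", cosine_lr(50, 100, cfg.learning_rate))
```

Under the single-device assumption, gradient accumulation means each optimizer update sees 10 × 8 sequences, each truncated to at most 8192 tokens.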