# lab2_efficient

## Hyperparameters
- learning_rate: 2e-5
- per_device_train_batch_size: 128
- effective_batch_size: 128
- gradient_accumulation_steps: 1
- weight_decay: 0.1
- optimizer: adamw_torch
- fp16: True
- gradient_checkpointing: True
- lr_scheduler: cosine
- warmup_ratio: 0.1
- max_steps: 100
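The hyperparameters above map directly onto a 🤗 Transformers `TrainingArguments` object; a minimal sketch, assuming the standard `Trainer` API was used (`output_dir` is a placeholder, not from the original run):

```python
from transformers import TrainingArguments

# Reconstruction of the training configuration listed above.
# Single-device run assumed, so effective batch size =
# per_device_train_batch_size * gradient_accumulation_steps = 128 * 1 = 128.
training_args = TrainingArguments(
    output_dir="lab2_efficient",          # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=128,
    gradient_accumulation_steps=1,
    weight_decay=0.1,
    optim="adamw_torch",
    fp16=True,                            # mixed-precision training
    gradient_checkpointing=True,          # trade compute for activation memory
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,                     # 10% of steps spent warming up
    max_steps=100,                        # run length is capped by steps, not epochs
)
```

Note that `fp16` and `gradient_checkpointing` together are what make this run memory-efficient: half-precision halves activation storage, and checkpointing recomputes activations in the backward pass instead of caching them.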
## Results
| Metric | Value |
|---|---|
| BLEU | 44.113 |
| Eval Loss | 1.3546 |
| Train Steps | 100 |
| Epoch | 0.0615 |
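The batch-size and epoch figures are internally consistent, which can be checked with a few lines of arithmetic (the dataset-size estimate at the end is an inference from the reported epoch fraction, not a figure from the original run):

```python
# Sanity check on the figures in the tables above.
per_device_train_batch_size = 128
gradient_accumulation_steps = 1
max_steps = 100

# Effective batch size = per-device batch * accumulation steps (single device assumed).
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps
print(effective_batch_size)  # 128, matching the hyperparameter table

# Total training examples seen over the run.
examples_seen = max_steps * effective_batch_size
print(examples_seen)  # 12800

# Epoch 0.0615 implies a training split of roughly examples_seen / 0.0615 examples.
approx_dataset_size = round(examples_seen / 0.0615)
print(approx_dataset_size)  # ~208130
```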
## Evaluation results
- BLEU on kde4 (self-reported): 44.113
- Loss on kde4 (self-reported): 1.355