# Knowledge Continuity Regularized Network
## Trainer Hyperparameters

- learning rate: 5e-05
- per-device batch size: 8
- gradient accumulation steps: 2
- weight decay: 1e-09
- seed: 42
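The trainer hyperparameters above can be collected into a configuration like the following sketch. This is a plain-dict reconstruction, not the actual training script; the key names are assumptions chosen to mirror the Hugging Face `TrainingArguments` fields they resemble.

```python
# Hypothetical reconstruction of the trainer configuration as a plain dict;
# key names mirror the `TrainingArguments` fields they appear to correspond to.
trainer_config = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,
    "gradient_accumulation_steps": 2,
    "weight_decay": 1e-9,
    "seed": 42,
}

# With gradient accumulation, the effective batch size per device per
# optimizer step is batch size x accumulation steps.
effective_batch_size = (
    trainer_config["per_device_train_batch_size"]
    * trainer_config["gradient_accumulation_steps"]
)  # 8 * 2 = 16
```

Note that the very small weight decay (1e-09) is effectively negligible, so regularization in this run comes almost entirely from the knowledge-continuity term below.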
## Regularization Hyperparameters

- numerical stability denominator constant (eps): 0.001
- lambda: 0.01
- alpha: 2.0
- beta: 2.0
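One way these hyperparameters could combine is sketched below: a continuity-style penalty that compares the change in loss between two inputs against the distance between their hidden representations, with `eps` keeping the denominator away from zero and `lambda` weighting the penalty in the total objective. The exact functional form used by this model is not stated here, so this is an illustrative assumption, not the model's actual regularizer.

```python
import math

def continuity_penalty(loss_a, loss_b, rep_a, rep_b,
                       alpha=2.0, beta=2.0, eps=1e-3):
    """Hypothetical knowledge-continuity style penalty (an assumption):
    the gap in loss between two inputs, raised to `alpha`, divided by the
    distance between their hidden representations, raised to `beta`, with
    `eps` as the numerical stability constant in the denominator.
    """
    loss_gap = abs(loss_a - loss_b) ** alpha
    rep_dist = math.dist(rep_a, rep_b) ** beta
    return loss_gap / (rep_dist + eps)

def regularized_loss(task_loss, penalty, lam=0.01):
    # Total objective: task loss plus the lambda-weighted continuity penalty.
    return task_loss + lam * penalty
```

With the listed values (alpha = beta = 2.0, eps = 0.001, lambda = 0.01), the penalty grows when nearby representations produce very different losses, which is the intuition behind penalizing discontinuities in what the network "knows".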
## Extended Logs
| eval_loss | eval_accuracy | epoch |
|---|---|---|
| 14.951 | 0.792 | 0.67 |
| 14.971 | 0.792 | 2.0 |
| 14.796 | 0.792 | 2.67 |
| 14.792 | 0.792 | 4.0 |
| 14.929 | 0.792 | 4.67 |
| 14.329 | 0.792 | 6.0 |
| 13.731 | 0.833 | 6.67 |
| 13.271 | 0.875 | 8.0 |
| 12.991 | 0.875 | 8.67 |
| 12.470 | 0.917 | 10.0 |
| 12.915 | 0.875 | 10.67 |
| 13.256 | 0.792 | 12.0 |
| 13.199 | 0.792 | 12.67 |