---
library_name: peft
license: gemma
base_model: google/gemma-2-2b-jpn-it
tags:
- generated_from_trainer
model-index:
- name: gemma-2-2b-evaluator-v2
  results: []
---
# gemma-2-2b-evaluator-v2
|
|
|
|
|
This model is a fine-tuned version of [google/gemma-2-2b-jpn-it](https://huggingface.co/google/gemma-2-2b-jpn-it) on an unknown dataset.
It achieves the following results on the evaluation set:

- Loss: 0.8410

| Attribute   | Accuracy | Spearman r | Kendall tau | Pearson r | RMSE   | MAE    |
|:------------|:--------:|:----------:|:-----------:|:---------:|:------:|:------:|
| Helpfulness | 0.5109   | 0.5327     | 0.3736      | 0.5339    | 0.5903 | 0.4713 |
| Correctness | 0.6014   | 0.5531     | 0.3943      | 0.5859    | 0.5292 | 0.4221 |
| Coherence   | 0.6998   | 0.4640     | 0.3232      | 0.4909    | 0.5011 | 0.4405 |
| Complexity  | 0.6064   | -0.0021    | -0.0026     | -0.0076   | 0.3605 | 0.3205 |
| Verbosity   | 0.6362   | 0.4071     | 0.2757      | 0.3365    | 0.3724 | 0.3068 |
| Average     | 0.6109   | 0.3910     | 0.2728      | 0.3879    | 0.4707 | 0.3922 |
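For reference, the RMSE and MAE figures above are ordinary regression errors between predicted and gold attribute scores. A minimal sketch on made-up score pairs (the numbers below are illustrative only, not outputs of this model):

```python
import math

def rmse(preds, golds):
    # Root-mean-squared error over paired scores
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(preds, golds)) / len(preds))

def mae(preds, golds):
    # Mean absolute error over paired scores
    return sum(abs(p - g) for p, g in zip(preds, golds)) / len(preds)

# Illustrative predicted vs. gold helpfulness scores
preds = [3.2, 1.8, 4.0, 2.5]
golds = [3.0, 2.0, 4.0, 2.0]
print(round(mae(preds, golds), 3))   # 0.225
print(round(rmse(preds, golds), 3))  # 0.287
```

Lower RMSE/MAE is better; the correlation columns (Spearman, Kendall, Pearson) instead measure how well the predicted scores rank or co-vary with the gold scores.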
|
|
|
|
|
## Model description

More information needed

## Intended uses & limitations

More information needed
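Since this is a PEFT adapter on `google/gemma-2-2b-jpn-it`, it would typically be loaded by attaching the adapter to the base model. A hypothetical sketch; the adapter repo id is a placeholder, and the causal-LM head is an assumption (this card does not state the task head the adapter was trained on):

```python
def load_evaluator(adapter_id, base_model="google/gemma-2-2b-jpn-it"):
    # Imports are local so the sketch can be defined without the libraries
    # installed; actually loading requires transformers and peft.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_model)
    model = AutoModelForCausalLM.from_pretrained(base_model)
    model = PeftModel.from_pretrained(model, adapter_id)  # attach the adapter
    model.eval()
    return tokenizer, model
```

Adapt the auto class (e.g. a sequence-classification head) if the adapter was trained for direct score regression rather than text generation.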
|
|
|
|
|
## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 2
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine_with_min_lr
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1.0
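The reported total train batch size is consistent with the per-device batch size and gradient accumulation, assuming a single device (the device count is not stated in this card):

```python
train_batch_size = 2              # per-device batch size, from the card
gradient_accumulation_steps = 4   # from the card
num_devices = 1                   # assumption: single device (not stated)

# Effective batch size = per-device batch * accumulation steps * devices
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # 8, matching the reported value
```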
|
|
|
|
|
### Training results

| Training Loss | Epoch | Step | Validation Loss | Helpfulness Accuracy | Helpfulness Spearmanr | Helpfulness Kendalltau | Helpfulness Pearsonr | Helpfulness Rmse | Helpfulness Mae | Correctness Accuracy | Correctness Spearmanr | Correctness Kendalltau | Correctness Pearsonr | Correctness Rmse | Correctness Mae | Coherence Accuracy | Coherence Spearmanr | Coherence Kendalltau | Coherence Pearsonr | Coherence Rmse | Coherence Mae | Complexity Accuracy | Complexity Spearmanr | Complexity Kendalltau | Complexity Pearsonr | Complexity Rmse | Complexity Mae | Verbosity Accuracy | Verbosity Spearmanr | Verbosity Kendalltau | Verbosity Pearsonr | Verbosity Rmse | Verbosity Mae | Avg Accuracy | Avg Spearmanr | Avg Kendalltau | Avg Pearsonr | Avg Rmse | Avg Mae |
|:-------------:|:------:|:----:|:---------------:|:--------------------:|:---------------------:|:----------------------:|:--------------------:|:----------------:|:---------------:|:--------------------:|:---------------------:|:----------------------:|:--------------------:|:----------------:|:---------------:|:------------------:|:-------------------:|:--------------------:|:------------------:|:--------------:|:-------------:|:-------------------:|:--------------------:|:---------------------:|:-------------------:|:---------------:|:--------------:|:------------------:|:-------------------:|:--------------------:|:------------------:|:--------------:|:-------------:|:------------:|:-------------:|:--------------:|:------------:|:--------:|:-------:|
| No log | 0 | 0 | 6.0772 | 0.3807 | 0.0234 | 0.0151 | 0.0297 | 0.9409 | 0.7463 | 0.1471 | 0.0071 | 0.0048 | 0.0298 | 1.3459 | 1.1590 | 0.3698 | 0.0093 | 0.0059 | 0.0060 | 1.2779 | 1.0608 | 0.0258 | -0.0438 | -0.0291 | -0.0464 | 2.1264 | 1.9972 | 0.3698 | -0.0471 | -0.0310 | -0.0500 | 1.0197 | 0.7998 | 0.2586 | -0.0102 | -0.0069 | -0.0062 | 1.3422 | 1.1526 |
| 1.2731 | 0.2094 | 500 | 1.2058 | 0.4404 | 0.1512 | 0.0986 | 0.1462 | 0.7513 | 0.6121 | 0.5338 | 0.1290 | 0.0836 | 0.1806 | 0.6719 | 0.5388 | 0.6700 | 0.1137 | 0.0764 | 0.1443 | 0.5356 | 0.4510 | 0.6054 | -0.2206 | -0.1464 | -0.0710 | 0.4624 | 0.4287 | 0.6243 | 0.0942 | 0.0622 | -0.0191 | 0.4919 | 0.4087 | 0.5748 | 0.0535 | 0.0349 | 0.0762 | 0.5826 | 0.4878 |
| 0.9212 | 0.4188 | 1000 | 0.9210 | 0.4980 | 0.4385 | 0.3020 | 0.4357 | 0.5932 | 0.4747 | 0.5944 | 0.4142 | 0.2897 | 0.4702 | 0.5473 | 0.4357 | 0.7068 | 0.3103 | 0.2104 | 0.3688 | 0.4894 | 0.4244 | 0.6054 | -0.2491 | -0.1664 | -0.2532 | 0.4269 | 0.3880 | 0.6362 | 0.2771 | 0.1860 | 0.1669 | 0.3959 | 0.3246 | 0.6082 | 0.2382 | 0.1643 | 0.2377 | 0.4905 | 0.4095 |
| 0.8859 | 0.6283 | 1500 | 0.8554 | 0.4911 | 0.5129 | 0.3572 | 0.5111 | 0.5972 | 0.4769 | 0.5755 | 0.5366 | 0.3813 | 0.5659 | 0.5430 | 0.4334 | 0.6958 | 0.4329 | 0.3006 | 0.4621 | 0.5038 | 0.4418 | 0.6054 | -0.1199 | -0.0813 | -0.1286 | 0.3882 | 0.3489 | 0.6362 | 0.3749 | 0.2544 | 0.2937 | 0.3931 | 0.3275 | 0.6008 | 0.3475 | 0.2424 | 0.3409 | 0.4851 | 0.4057 |
| 0.7737 | 0.8377 | 2000 | 0.8410 | 0.5109 | 0.5327 | 0.3736 | 0.5339 | 0.5903 | 0.4713 | 0.6014 | 0.5531 | 0.3943 | 0.5859 | 0.5292 | 0.4221 | 0.6998 | 0.4640 | 0.3232 | 0.4909 | 0.5011 | 0.4405 | 0.6064 | -0.0021 | -0.0026 | -0.0076 | 0.3605 | 0.3205 | 0.6362 | 0.4071 | 0.2757 | 0.3365 | 0.3724 | 0.3068 | 0.6109 | 0.3910 | 0.2728 | 0.3879 | 0.4707 | 0.3922 |
|
|
|
|
|
|
|
|
### Framework versions

- PEFT 0.15.0
- Transformers 4.50.1
- Pytorch 2.6.0+cu124
- Datasets 3.4.1
- Tokenizers 0.21.1