---
library_name: peft
license: gemma
base_model: google/gemma-2-2b-jpn-it
tags:
- generated_from_trainer
metrics:
- accuracy
- spearmanr
- pearsonr
model-index:
- name: estimation-prometheus-gemma-2-2b
  results: []
---
|
# estimation-prometheus-gemma-2-2b |
|
|
|
|
|
This model is a fine-tuned version of [google/gemma-2-2b-jpn-it](https://huggingface.co/google/gemma-2-2b-jpn-it) on an unknown dataset; a sketch of how to load the adapter follows the evaluation results below.
It achieves the following results on the evaluation set:
- Loss: 2.4192
- Accuracy: 0.4235
- Spearmanr: 0.4236
- Kendalltau: 0.3285
- Pearsonr: 0.4779
- Rmse: 1.0679
- Mae: 0.8129
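
To try the adapter, load it on top of the base model with `peft`. A minimal sketch, assuming the adapter is published under the hypothetical repo id `your-username/estimation-prometheus-gemma-2-2b` and keeps the base causal-LM head (the card does not specify a task head):

```python
# Minimal loading sketch. The adapter repo id and the prompt are placeholders,
# not confirmed details from this card.
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

adapter_id = "your-username/estimation-prometheus-gemma-2-2b"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-jpn-it")
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model.eval()

prompt = "..."  # a Prometheus-style evaluation prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

`AutoPeftModelForCausalLM.from_pretrained` loads the base model recorded in the adapter config and applies the adapter weights in one call; for faster inference you can fold the adapter into the base weights with `model.merge_and_unload()`.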
|
|
|
|
|
## Model description |
|
|
|
|
|
More information needed |
|
|
|
|
|
## Intended uses & limitations |
|
|
|
|
|
More information needed |
|
|
|
|
|
## Training and evaluation data |
|
|
|
|
|
More information needed |
|
|
|
|
|
## Training procedure |
|
|
|
|
|
### Training hyperparameters |
|
|
|
|
|
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 16
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine_with_min_lr
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3.0
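
These settings map onto `transformers` `TrainingArguments` roughly as below. Note that `cosine_with_min_lr` requires a minimum learning rate via `lr_scheduler_kwargs`, which this card does not record, so that value is a placeholder:

```python
# Sketch of TrainingArguments matching the hyperparameters above.
# The min_lr_rate value is an assumption: the actual minimum LR is not recorded.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="estimation-prometheus-gemma-2-2b",
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine_with_min_lr",
    lr_scheduler_kwargs={"min_lr_rate": 0.1},  # placeholder, see note above
    warmup_ratio=0.1,
    num_train_epochs=3.0,
)
```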
|
|
|
|
|
### Training results |
|
|
|
|
|
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Spearmanr | Kendalltau | Pearsonr | Rmse | Mae |
|:-------------:|:------:|:----:|:---------------:|:--------:|:---------:|:----------:|:--------:|:------:|:------:|
| 4.38 | 0.2094 | 500 | 4.5787 | 0.3141 | 0.0241 | 0.0177 | 0.0263 | 1.3354 | 1.0377 |
| 3.4137 | 0.4188 | 1000 | 3.0643 | 0.3410 | 0.1413 | 0.1072 | 0.1555 | 1.2135 | 0.9624 |
| 2.8427 | 0.6281 | 1500 | 2.7700 | 0.3728 | 0.2953 | 0.2259 | 0.3155 | 1.1565 | 0.8857 |
| 2.7978 | 0.8375 | 2000 | 2.6360 | 0.3887 | 0.3524 | 0.2708 | 0.3844 | 1.1258 | 0.8747 |
| 2.5765 | 1.0469 | 2500 | 2.5750 | 0.4095 | 0.3797 | 0.2938 | 0.4191 | 1.1107 | 0.8387 |
| 2.5941 | 1.2563 | 3000 | 2.5392 | 0.4175 | 0.3839 | 0.2963 | 0.4286 | 1.1012 | 0.8465 |
| 2.3148 | 1.4657 | 3500 | 2.4901 | 0.4105 | 0.4069 | 0.3154 | 0.4536 | 1.0851 | 0.8245 |
| 2.6814 | 1.6750 | 4000 | 2.4642 | 0.4135 | 0.4100 | 0.3173 | 0.4635 | 1.0794 | 0.8221 |
| 2.5861 | 1.8844 | 4500 | 2.4569 | 0.4115 | 0.4152 | 0.3213 | 0.4668 | 1.0782 | 0.8185 |
| 2.5241 | 2.0938 | 5000 | 2.4320 | 0.4115 | 0.4196 | 0.3263 | 0.4733 | 1.0712 | 0.8132 |
| 2.5838 | 2.3032 | 5500 | 2.4252 | 0.4185 | 0.4197 | 0.3260 | 0.4755 | 1.0693 | 0.8125 |
| 2.4464 | 2.5126 | 6000 | 2.4232 | 0.4185 | 0.4217 | 0.3270 | 0.4768 | 1.0695 | 0.8110 |
| 2.4398 | 2.7219 | 6500 | 2.4218 | 0.4185 | 0.4221 | 0.3276 | 0.4764 | 1.0686 | 0.8122 |
| 2.3042 | 2.9313 | 7000 | 2.4192 | 0.4235 | 0.4236 | 0.3285 | 0.4779 | 1.0679 | 0.8129 |
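
For reference, the correlation and error metrics in this table can be computed from predicted and gold scores with `scipy` and `scikit-learn`. A small sketch with hypothetical data, not the card's actual evaluation code:

```python
# Computes the reported metric types from hypothetical score arrays.
import numpy as np
from scipy.stats import kendalltau, pearsonr, spearmanr
from sklearn.metrics import mean_absolute_error, mean_squared_error

preds = np.array([3, 4, 2, 5, 1])  # hypothetical predicted scores
golds = np.array([3, 5, 2, 4, 2])  # hypothetical reference scores

accuracy = float((preds == golds).mean())
spearman, _ = spearmanr(preds, golds)
kendall, _ = kendalltau(preds, golds)
pearson, _ = pearsonr(preds, golds)
rmse = float(np.sqrt(mean_squared_error(golds, preds)))
mae = float(mean_absolute_error(golds, preds))

print(f"acc={accuracy:.4f} rho={spearman:.4f} tau={kendall:.4f} "
      f"r={pearson:.4f} rmse={rmse:.4f} mae={mae:.4f}")
```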
|
|
|
|
|
|
|
|
### Framework versions |
|
|
|
|
|
- PEFT 0.15.1
- Transformers 4.50.2
- Pytorch 2.5.1+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1