athirorg/USS-reward-model-grl-source

Generated from Trainer


athirorg committed about 9 hours ago

Commit 5d87922 · verified · 1 Parent(s): d048d95

End of training

Files changed (2)

README.md +75 -0
model.safetensors +1 -1

README.md ADDED

	@@ -0,0 +1,75 @@

+---
+library_name: transformers
+license: apache-2.0
+base_model: answerdotai/ModernBERT-large
+tags:
+- generated_from_trainer
+model-index:
+- name: USS-reward-model-grl-source
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# USS-reward-model-grl-source
+This model is a fine-tuned version of [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 52.9809
+- Mse: 0.1859
+- Mae: 0.2835
+- R2: 0.0363
+- Spearman Correlation: 0.1418
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 2e-05
+- train_batch_size: 2
+- eval_batch_size: 1
+- seed: 42
+- distributed_type: multi-GPU
+- gradient_accumulation_steps: 10
+- total_train_batch_size: 20
+- optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+- lr_scheduler_type: linear
+- num_epochs: 10
+- mixed_precision_training: Native AMP
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Mse    | Mae    | R2      | Spearman Correlation |
+|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:-------:|:--------------------:|
+| 22.9021       | 1.0   | 97   | 1.5700          | 0.1938 | 0.2856 | -0.0045 | nan                  |
+| 17.8187       | 2.0   | 194  | 2.1893          | 0.2106 | 0.3009 | -0.0917 | 0.1105               |
+| 290.6176      | 3.0   | 291  | 47.6834         | 0.5032 | 0.5697 | -1.6088 | 0.1146               |
+| 518.1736      | 4.0   | 388  | 54.6556         | 0.2628 | 0.3755 | -0.3622 | 0.1508               |
+| 567.3146      | 5.0   | 485  | 57.8539         | 0.3017 | 0.4718 | -0.5641 | 0.1994               |
+| 574.1879      | 6.0   | 582  | 56.7874         | 0.2290 | 0.4001 | -0.1873 | 0.1062               |
+| 559.3920      | 7.0   | 679  | 55.2139         | 0.2447 | 0.3755 | -0.2685 | 0.1472               |
+| 544.8656      | 8.0   | 776  | 54.0387         | 0.2628 | 0.3924 | -0.3626 | 0.1332               |
+| 535.0054      | 9.0   | 873  | 53.2574         | 0.2067 | 0.2983 | -0.0716 | 0.1215               |
+| 529.6823      | 10.0  | 970  | 52.9809         | 0.1859 | 0.2835 | 0.0363  | 0.1418               |
+### Framework versions
+- Transformers 5.9.0
+- Pytorch 2.12.0+cu130
+- Datasets 4.8.5
+- Tokenizers 0.22.2
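
The model card above reports Mse, Mae, R2, and Spearman Correlation but does not show how they were computed. The following is a minimal NumPy sketch of those four metrics, not the author's actual `compute_metrics` function; the names `average_ranks` and `regression_metrics` are hypothetical. It also illustrates two patterns visible in the results table: R2 goes negative when predictions fit worse than always predicting the label mean, and Spearman correlation is undefined (`nan`) when all predictions are identical, which is one possible explanation for the `nan` at epoch 1, though the card does not confirm it.

```python
import numpy as np


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    sorted_vals = values[order]
    ranks = np.empty(len(values), dtype=float)
    i = 0
    while i < len(values):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(values) and sorted_vals[j + 1] == sorted_vals[i]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def regression_metrics(preds, labels):
    """Hypothetical helper: MSE, MAE, R2, and Spearman for 1-D regression."""
    preds = np.asarray(preds, dtype=float).ravel()
    labels = np.asarray(labels, dtype=float).ravel()
    err = preds - labels

    mse = float(np.mean(err ** 2))
    mae = float(np.mean(np.abs(err)))

    # R2 = 1 - SS_res / SS_tot; negative when worse than predicting the mean.
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((labels - labels.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    # Spearman = Pearson correlation of the ranks; undefined for constant input.
    rank_p, rank_l = average_ranks(preds), average_ranks(labels)
    if rank_p.std() == 0 or rank_l.std() == 0:
        spearman = float("nan")
    else:
        spearman = float(np.corrcoef(rank_p, rank_l)[0, 1])

    return {"mse": mse, "mae": mae, "r2": r2, "spearman": spearman}
```

Passing a constant prediction vector to `regression_metrics` returns `nan` for `spearman`, and a constant prediction far from the label mean returns a negative `r2`.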

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:22748da9fdcfcdd9b4c1ccafd3d5afbeb15330c884f254eaaa53f5c416e2cfe4
+oid sha256:febd21b4644fcd5625ec42fb0cb00774fbdcf374b924d54f834c96f50a0d4f0d
 size 1583364164
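
The `model.safetensors` change above is a Git LFS pointer update: the `oid` (a SHA-256 digest of the file contents) changes while the `size` stays at 1583364164 bytes, which is what one would expect when weights are retrained without changing the architecture. As a rough sketch, a downloaded file could be checked against such a pointer as follows; `parse_lfs_pointer` and `file_matches_pointer` are hypothetical helper names, not part of any Git LFS or Hugging Face API.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid field has the form '<algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def file_matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file's size and digest match the parsed pointer."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Read in chunks so multi-gigabyte weight files are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

The size check runs first because it is cheap and rules out truncated downloads before any hashing is done.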