---
library_name: transformers
license: other
base_model: Qwen/Qwen2.5-3B-Instruct
tags:
- llama-factory
- full
- generated_from_trainer
model-index:
- name: web_policy_sft
  results: []
---
|
|
|
|
|
|
|
|
|
|
# web_policy_sft |
|
|
|
|
|
This model is a fine-tuned version of [Qwen/Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct) on the web_policy_sft dataset, trained with full-parameter supervised fine-tuning via [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory).
|
|
It achieves the following results on the evaluation set: |
|
|
- Loss: 0.1625 |
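
A minimal generation sketch with 🤗 Transformers follows. The repository id and the example prompt are placeholders (this card does not state where the checkpoint is hosted or which prompts it targets), so substitute your own:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder id: replace with the actual location of this checkpoint.
model_id = "web_policy_sft"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"  # device_map="auto" requires `accelerate`
)

# Qwen2.5-Instruct checkpoints ship a chat template, so build inputs from messages.
messages = [{"role": "user", "content": "Summarize this website's cookie policy in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```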
|
|
|
|
|
## Model description |
|
|
|
|
|
web_policy_sft is a full-parameter supervised fine-tune (SFT) of the 3B-parameter Qwen2.5-Instruct model, produced with LLaMA-Factory. Beyond the training setup recorded below, no further details are available.
|
|
|
|
|
## Intended uses & limitations |
|
|
|
|
|
More information needed |
|
|
|
|
|
## Training and evaluation data |
|
|
|
|
|
The model was trained and evaluated on the web_policy_sft dataset (final validation loss 0.1625). No further details about the dataset's composition are available.
|
|
|
|
|
## Training procedure |
|
|
|
|
|
### Training hyperparameters |
|
|
|
|
|
The following hyperparameters were used during training; a `TrainingArguments` sketch reproducing them appears after the list:
|
|
- learning_rate: 1e-05 |
|
|
- train_batch_size: 1 |
|
|
- eval_batch_size: 1 |
|
|
- seed: 42 |
|
|
- distributed_type: multi-GPU |
|
|
- num_devices: 2 |
|
|
- gradient_accumulation_steps: 4 |
|
|
- total_train_batch_size: 8 |
|
|
- total_eval_batch_size: 2 |
|
|
- optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08); no additional optimizer arguments
|
|
- lr_scheduler_type: cosine |
|
|
- lr_scheduler_warmup_ratio: 0.1 |
|
|
- num_epochs: 1 |
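
As a rough reconstruction, the settings above map onto 🤗 Transformers `TrainingArguments` as sketched below. LLaMA-Factory drives the `Trainer` through its own config files, so treat this as an equivalent-settings sketch rather than the actual launch configuration; `output_dir` is a placeholder.

```python
from transformers import TrainingArguments

# 2 GPUs x per-device batch 1 x gradient accumulation 4 = total train batch size 8.
training_args = TrainingArguments(
    output_dir="web_policy_sft",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
)
```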
|
|
|
|
|
### Training results |
|
|
|
|
|
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.5121 | 0.0470 | 50 | 0.4876 |
| 0.4615 | 0.0940 | 100 | 0.3849 |
| 0.37 | 0.1409 | 150 | 0.3281 |
| 0.3749 | 0.1879 | 200 | 0.2892 |
| 0.2863 | 0.2349 | 250 | 0.2757 |
| 0.3078 | 0.2819 | 300 | 0.2549 |
| 0.2921 | 0.3289 | 350 | 0.2316 |
| 0.3191 | 0.3759 | 400 | 0.2353 |
| 0.313 | 0.4228 | 450 | 0.2231 |
| 0.2037 | 0.4698 | 500 | 0.2138 |
| 0.1729 | 0.5168 | 550 | 0.2074 |
| 0.289 | 0.5638 | 600 | 0.1954 |
| 0.2775 | 0.6108 | 650 | 0.1897 |
| 0.1546 | 0.6577 | 700 | 0.1814 |
| 0.1613 | 0.7047 | 750 | 0.1746 |
| 0.0956 | 0.7517 | 800 | 0.1725 |
| 0.1692 | 0.7987 | 850 | 0.1683 |
| 0.1885 | 0.8457 | 900 | 0.1653 |
| 0.2799 | 0.8926 | 950 | 0.1637 |
| 0.1971 | 0.9396 | 1000 | 0.1628 |
| 0.1464 | 0.9866 | 1050 | 0.1626 |
|
|
|
|
|
|
|
|
### Framework versions |
|
|
|
|
|
- Transformers 4.57.1 |
|
|
- Pytorch 2.7.1+cu126 |
|
|
- Datasets 4.0.0 |
|
|
- Tokenizers 0.22.1 |
|
|
|