---
license: apache-2.0
base_model: Qwen/Qwen2.5-7B-Instruct
tags:
- generated_from_trainer
model-index:
- name: prm
  results: []
---

[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
<details><summary>See axolotl config</summary>

axolotl version: `0.4.1`
```yaml
base_model: Qwen/Qwen2.5-7B-Instruct
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

load_in_8bit: false
load_in_4bit: false
strict: false

datasets:
  - path: Jennny/direct_label_rolls
    conversation: qwen-7b-chat
    type: sharegpt
    split: "train"
    train_on_split: "train"

warmup_ratio: 0.05
val_set_size: 0.0
output_dir: ./prm
wandb_project: preference-models
# wandb_entity: domain-generalization
wandb_watch:
wandb_name: "qwen-7b-bs32_lr2e-6_prm"
wandb_log_model:

train_on_inputs: false

save_safetensors: true
#noisy_embedding_alpha: 10.0 # default for sharegpt type
dataset_prepared_path: ~/data/preference-models/last_run_prepared

dataset_processes: 48
#torch_compile: true
sequence_len: 8192
sample_packing: true
pad_to_sequence_len: true

trust_remote_code: True
adapter:
lora_model_dir:
#lora_r: 32
#lora_alpha: 16
#lora_dropout: 0.05
#lora_target_linear: true
#lora_fan_in_fan_out:

gradient_checkpointing: True

#warmup_ratio: 0.1
gradient_accumulation_steps: 4
micro_batch_size: 1
num_epochs: 1
#max_steps: 10
#optimizer: adamw_torch_fused
optimizer: paged_adamw_32bit
#lr_scheduler: constant_with_warmup
lr_scheduler: cosine
learning_rate: 2.0e-6

weight_decay: 0.0
max_grad_norm: 1.0

group_by_length: false
bf16: auto
fp16: false
tf32: true

early_stopping_patience:
local_rank:
logging_steps: 2
xformers_attention:
flash_attention: true

eval_steps:
eval_table_size:
eval_table_max_new_tokens:
#save_steps: 100
save_strategy: "epoch"
save_total_limit: 4
#save_safetensors: false
debug:

ddp: #true
deepspeed: #deepspeed/zero1.json # multi-gpu only

fsdp:
fsdp_config:
special_tokens:
  pad_token: <|end_of_text|>

```

</details><br>
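
With Axolotl 0.4.1 installed, a run with this config can presumably be reproduced via the standard Axolotl CLI, e.g. `accelerate launch -m axolotl.cli.train prm.yaml`, where `prm.yaml` is a placeholder filename for the config above.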

# prm

This model is a fine-tuned version of [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) on the [Jennny/direct_label_rolls](https://huggingface.co/datasets/Jennny/direct_label_rolls) dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0487

## Model description

More information needed

## Intended uses & limitations

More information needed
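
No usage example ships with this card. Below is a minimal inference sketch using the standard 🤗 Transformers API; the repo id `Jennny/prm` is an assumption (substitute the actual Hub id, or the local `./prm` output directory), and the prompt is only illustrative:

```python
# Minimal inference sketch -- not the author's official usage code.
# "Jennny/prm" is a hypothetical repo id; replace with the real one or "./prm".
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Jennny/prm"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # training ran in bf16
    device_map="auto",
)

# Training data was chat-formatted (qwen-7b-chat), so use the chat template.
messages = [{"role": "user", "content": "Is the following reasoning step correct? ..."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=8)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```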

## Training and evaluation data

Per the config above, training used the [Jennny/direct_label_rolls](https://huggingface.co/datasets/Jennny/direct_label_rolls) dataset, loaded in ShareGPT format with the `qwen-7b-chat` conversation template and packed to a sequence length of 8192. The config sets `val_set_size: 0.0`; the source of the evaluation split reported below is not documented.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- total_eval_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 3
- num_epochs: 2
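
The batch-size figures are mutually consistent: 1 sequence per device × 4 gradient-accumulation steps × 8 devices = 32 for `total_train_batch_size`, and 1 × 8 = 8 for `total_eval_batch_size`.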

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| No log | 0.0290 | 1 | 3.8909 |
| 3.8462 | 0.0580 | 2 | 3.1606 |
| 3.8462 | 0.0870 | 3 | 1.4003 |
| 2.3026 | 0.1159 | 4 | 0.5247 |
| 2.3026 | 0.1449 | 5 | 0.2535 |
| 0.3725 | 0.1739 | 6 | 0.1224 |
| 0.3725 | 0.2029 | 7 | 0.0711 |
| 0.1704 | 0.2319 | 8 | 0.0705 |
| 0.1704 | 0.2609 | 9 | 0.0842 |
| 0.0719 | 0.2899 | 10 | 0.0684 |
| 0.0719 | 0.3188 | 11 | 0.0837 |
| 0.0719 | 0.3478 | 12 | 0.0794 |
| 0.0719 | 0.3768 | 13 | 0.0679 |
| 0.0729 | 0.4058 | 14 | 0.0607 |
| 0.0729 | 0.4348 | 15 | 0.0682 |
| 0.0639 | 0.4638 | 16 | 0.0660 |
| 0.0639 | 0.4928 | 17 | 0.0607 |
| 0.0659 | 0.5217 | 18 | 0.0609 |
| 0.0659 | 0.5507 | 19 | 0.0599 |
| 0.0584 | 0.5797 | 20 | 0.0595 |
| 0.0584 | 0.6087 | 21 | 0.0579 |
| 0.059 | 0.6377 | 22 | 0.0572 |
| 0.059 | 0.6667 | 23 | 0.0579 |
| 0.1069 | 0.6957 | 24 | 0.0617 |
| 0.1069 | 0.7246 | 25 | 0.0601 |
| 0.0585 | 0.7536 | 26 | 0.0563 |
| 0.0585 | 0.7826 | 27 | 0.0598 |
| 0.097 | 0.8116 | 28 | 0.0590 |
| 0.097 | 0.8406 | 29 | 0.0548 |
| 0.059 | 0.8696 | 30 | 0.0559 |
| 0.059 | 0.8986 | 31 | 0.0570 |
| 0.0695 | 0.9275 | 32 | 0.0548 |
| 0.0695 | 0.9565 | 33 | 0.0554 |
| 0.0533 | 0.9855 | 34 | 0.0564 |
| 0.0533 | 1.0145 | 35 | 0.0541 |
| 0.0544 | 1.0145 | 36 | 0.0548 |
| 0.0544 | 1.0435 | 37 | 0.0555 |
| 0.0555 | 1.0725 | 38 | 0.0531 |
| 0.0555 | 1.1014 | 39 | 0.0532 |
| 0.0524 | 1.1304 | 40 | 0.0536 |
| 0.0524 | 1.1594 | 41 | 0.0519 |
| 0.0641 | 1.1884 | 42 | 0.0520 |
| 0.0641 | 1.2174 | 43 | 0.0522 |
| 0.0494 | 1.2464 | 44 | 0.0514 |
| 0.0494 | 1.2754 | 45 | 0.0511 |
| 0.0502 | 1.3043 | 46 | 0.0514 |
| 0.0502 | 1.3333 | 47 | 0.0511 |
| 0.0482 | 1.3623 | 48 | 0.0505 |
| 0.0482 | 1.3913 | 49 | 0.0511 |
| 0.0472 | 1.4203 | 50 | 0.0509 |
| 0.0472 | 1.4493 | 51 | 0.0498 |
| 0.0478 | 1.4783 | 52 | 0.0498 |
| 0.0478 | 1.5072 | 53 | 0.0502 |
| 0.055 | 1.5362 | 54 | 0.0499 |
| 0.055 | 1.5652 | 55 | 0.0493 |
| 0.0459 | 1.5942 | 56 | 0.0493 |
| 0.0459 | 1.6232 | 57 | 0.0497 |
| 0.0492 | 1.6522 | 58 | 0.0497 |
| 0.0492 | 1.6812 | 59 | 0.0494 |
| 0.0504 | 1.7101 | 60 | 0.0490 |
| 0.0504 | 1.7391 | 61 | 0.0488 |
| 0.0564 | 1.7681 | 62 | 0.0488 |
| 0.0564 | 1.7971 | 63 | 0.0488 |
| 0.0503 | 1.8261 | 64 | 0.0488 |
| 0.0503 | 1.8551 | 65 | 0.0487 |
| 0.0495 | 1.8841 | 66 | 0.0487 |
| 0.0495 | 1.9130 | 67 | 0.0487 |
| 0.0446 | 1.9420 | 68 | 0.0487 |

### Framework versions

- Transformers 4.43.3
- PyTorch 2.1.2+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1