SimonWSY/Hackduke-dpo-llm

Hackduke-dpo-llm/hackduke1_DPO/finetuning_args.json

Upload 167 files

c62e26d over 2 years ago

363 Bytes

{
  "dpo_beta": 0.1,
  "finetuning_type": "lora",
  "lora_alpha": 32.0,
  "lora_dropout": 0.1,
  "lora_rank": 8,
  "lora_target": [
    "c_attn",
    "o_proj",
    "down_proj",
    "up_proj",
    "gate_proj"
  ],
  "name_module_trainable": "mlp",
  "num_hidden_layers": 32,
  "num_layer_trainable": 3,
  "ppo_score_norm": false,
  "resume_lora_training": false
}
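
As a rough illustration of what two of these values control, the sketch below parses a subset of the file and derives the LoRA scaling factor (`lora_alpha / lora_rank`) and a per-example DPO loss weighted by `dpo_beta`. It assumes the standard LoRA and DPO formulations; the source does not show how the training code actually consumes these fields. The helper names `lora_scaling` and `dpo_loss` and the log-probability inputs are hypothetical, introduced only for this example.

```python
import json
import math

# Subset of finetuning_args.json, copied from the file above.
CONFIG_TEXT = """
{
  "dpo_beta": 0.1,
  "finetuning_type": "lora",
  "lora_alpha": 32.0,
  "lora_dropout": 0.1,
  "lora_rank": 8
}
"""


def lora_scaling(cfg: dict) -> float:
    """Scaling applied to the low-rank update in standard LoRA: alpha / r."""
    return cfg["lora_alpha"] / cfg["lora_rank"]


def dpo_loss(beta: float,
             policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float) -> float:
    """Per-example DPO loss: -log(sigmoid(beta * margin)).

    The margin is the policy's log-ratio advantage over the reference
    model on the chosen response minus the same quantity on the
    rejected response. Inputs are hypothetical sequence log-probabilities.
    """
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    x = beta * margin
    # Numerically stable form of -log(sigmoid(x)).
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))


cfg = json.loads(CONFIG_TEXT)
scaling = lora_scaling(cfg)  # 32.0 / 8 = 4.0
loss_at_zero_margin = dpo_loss(cfg["dpo_beta"], 0.0, 0.0, 0.0, 0.0)  # equals log(2)
```

With `dpo_beta` at 0.1, a given log-ratio margin moves the loss less than it would at a larger beta, so the policy is penalized more gently for staying close to the reference model.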