	# Combined Task Vector Model

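	This model was created by combining task vectors from two fine-tuned variants of Qwen2.5-Coder-7B-Instruct: the `minus-step-150` task vector is subtracted from the `plus-step-150` task vector, and the result is applied to the base model.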

	## Task Vector Computation

	```python
	# One task vector per fine-tuned checkpoint, each taken relative to the base model.
	t_1 = TaskVector("Qwen/Qwen2.5-Coder-7B-Instruct", "/lustre10/scratch/tkwang/SecSteer/axolotl-outputs/lora-merged/plus-step-150")
	t_2 = TaskVector("Qwen/Qwen2.5-Coder-7B-Instruct", "/lustre10/scratch/tkwang/SecSteer/axolotl-outputs/lora-merged/minus-step-150")
	# finetuned_model3 is null in the args, so no third task vector is built.
	# scale_t1 = 1.0 and scale_t2 = -1.0: the "minus" vector is subtracted from the "plus" vector.
	t_combined = 1.0 * t_1 + -1.0 * t_2
	new_model = t_combined.apply_to("Qwen/Qwen2.5-Coder-7B-Instruct", scaling_coef=1.0)
	```
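The `TaskVector` class used above comes from the creation script, which is not reproduced in this card. As a minimal sketch of the arithmetic it appears to perform, the toy class below assumes a task vector is the per-parameter difference between fine-tuned and base weights, and that `apply_to` adds the scaled vector back onto the base. The class body, the `Weights` alias, and the tiny `base`, `plus`, and `minus` dictionaries are all hypothetical stand-ins, not the real implementation or real weights.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for a model state dict: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


class TaskVector:
    """Toy task vector: per-parameter difference (fine-tuned - base). Assumed semantics."""

    def __init__(self, base: Optional[Weights] = None,
                 finetuned: Optional[Weights] = None,
                 vector: Optional[Weights] = None):
        if vector is not None:
            self.vector = vector
        else:
            self.vector = {
                name: [f - b for f, b in zip(finetuned[name], base[name])]
                for name in base
            }

    def __add__(self, other: "TaskVector") -> "TaskVector":
        # Element-wise sum of two task vectors over the same parameter names.
        return TaskVector(vector={
            name: [a + b for a, b in zip(self.vector[name], other.vector[name])]
            for name in self.vector
        })

    def __rmul__(self, scale: float) -> "TaskVector":
        # Supports `scale * task_vector`, as written in the snippet above.
        return TaskVector(vector={
            name: [scale * v for v in values]
            for name, values in self.vector.items()
        })

    def apply_to(self, base: Weights, scaling_coef: float = 1.0) -> Weights:
        # New weights = base + scaling_coef * task vector.
        return {
            name: [b + scaling_coef * d for b, d in zip(base[name], self.vector[name])]
            for name in base
        }


# Tiny made-up weights, used only to trace the arithmetic.
base = {"w": [1.0, 2.0]}
plus = {"w": [1.5, 2.0]}
minus = {"w": [1.0, 1.0]}

t_1 = TaskVector(base, plus)
t_2 = TaskVector(base, minus)
t_combined = 1.0 * t_1 + -1.0 * t_2
new_model = t_combined.apply_to(base, scaling_coef=1.0)
print(new_model)
```

With both scales at magnitude 1, the base terms inside the two task vectors cancel, so under these assumptions the result equals the base weights plus the "plus" weights minus the "minus" weights.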

	## Models Used

	- Base Model: https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct
	- Fine-tuned Model 1 (local path, not a Hub repository): /lustre10/scratch/tkwang/SecSteer/axolotl-outputs/lora-merged/plus-step-150
	- Fine-tuned Model 2 (local path, not a Hub repository): /lustre10/scratch/tkwang/SecSteer/axolotl-outputs/lora-merged/minus-step-150
	- Fine-tuned Model 3: None (not used)

	## Technical Details

	- Creation Script Git Hash: fb62f919e9796b294f1ffb6297b05d11fa945ac0
	- Task Vector Method: Additive combination
	- Args: {
	"pretrained_model": "Qwen/Qwen2.5-Coder-7B-Instruct",
	"finetuned_model1": "/lustre10/scratch/tkwang/SecSteer/axolotl-outputs/lora-merged/plus-step-150",
	"finetuned_model2": "/lustre10/scratch/tkwang/SecSteer/axolotl-outputs/lora-merged/minus-step-150",
	"finetuned_model3": null,
	"apply_to_diff_model_architecure": null,
	"output_model_name": "felixwangg/Qwen2.5-Coder-7B-Instruct-cpp-sec-step150-lam1",
	"output_dir": "/lustre10/scratch/tkwang/SecSteer/axolotl-outputs/weight-arithmetic/step-150/lambda-1",
	"scaling_coef": 1.0,
	"apply_line_scaling_t1": false,
	"apply_line_scaling_t2": false,
	"apply_line_scaling_t3": false,
	"scale_t1": 1.0,
	"scale_t2": -1.0,
	"scale_t3": null
	}
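For reference, the args above can be read as the following formula. This is a sketch under the usual task-arithmetic convention, which is assumed here because the creation script itself is not included; the symbols $\theta$, $\tau_i$, $\alpha_i$, and $\lambda$ are introduced only for illustration, with $\alpha_1$ = `scale_t1`, $\alpha_2$ = `scale_t2`, and $\lambda$ = `scaling_coef`.

```latex
\begin{aligned}
\tau_1 &= \theta_{\text{plus}} - \theta_{\text{base}}, \qquad
\tau_2 = \theta_{\text{minus}} - \theta_{\text{base}} \\
\theta_{\text{new}} &= \theta_{\text{base}} + \lambda \left( \alpha_1 \tau_1 + \alpha_2 \tau_2 \right) \\
&= \theta_{\text{base}} + 1.0 \cdot \left( 1.0 \cdot \tau_1 - 1.0 \cdot \tau_2 \right) \\
&= \theta_{\text{base}} + \theta_{\text{plus}} - \theta_{\text{minus}}
\end{aligned}
```

The last line holds only because $\alpha_1 = -\alpha_2$, which makes the two $\theta_{\text{base}}$ terms inside the task vectors cancel. The `lam1` suffix in the model name and the `lambda-1` output directory match $\lambda = 1$.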