---
model_name: VNU-SecAlign
language:
- en
tags:
- security
- instruction-following
- finetuning
license: apache-2.0
base_model: meta-llama/Llama-3.1-8B-Instruct
metrics:
- name: asr
  type: error_rate
  description: Attack success rate / ASR
datasets:
- Jason-42195/VNU-SecAlign
---

VNU-SecAlign: LoRA adapter and datasets for SecAlign experiments.

This repository contains:

- checkpoints/final_checkpoint: LoRA adapter and tokenizer files.
- data/: datasets used for evaluation and training (moved here).

Usage (load adapter with PEFT):

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Config
base_model_id = "meta-llama/Llama-3.1-8B-Instruct"
repo_id = "Jason-42195/VNU-SecAlign"
adapter_subfolder = "checkpoints/final_checkpoint"

# Initialize tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# Load original model
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    device_map="auto",
    trust_remote_code=True
)

# Load Adapter from subfolder
model = PeftModel.from_pretrained(
    model,
    repo_id,
    subfolder=adapter_subfolder
)

model.eval()
```

Judge used in evaluation: GPT-4o (deployment `gpt-4o`, temperature=0.0).
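
The `asr` metric listed in the metadata is the attack success rate. As a minimal sketch, assuming each evaluated sample has already been reduced to a boolean judge verdict (`True` when the attack was judged successful), ASR is the fraction of `True` verdicts. The function name `attack_success_rate` and the verdict format are illustrative assumptions, not this repository's evaluation code.

```python
from typing import Iterable


def attack_success_rate(verdicts: Iterable[bool]) -> float:
    """Return the fraction of attack attempts judged successful.

    `verdicts` is a hypothetical per-sample list of judge decisions;
    how the judge output is mapped to True/False is not specified here.
    """
    verdicts = list(verdicts)
    if not verdicts:
        raise ValueError("need at least one verdict to compute ASR")
    return sum(verdicts) / len(verdicts)


# Made-up verdicts for illustration only (not real evaluation results).
example_verdicts = [True, False, False, True, False]
asr = attack_success_rate(example_verdicts)
print(f"ASR = {asr:.2%}")
```

Lower is better: an ASR of 0 means no attack in the evaluated set was judged successful.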