VNU-SecAlign / README.md
Jason-42195's picture
Update README.md
32c928d verified
---
model_name: VNU-SecAlign
language:
- en
tags:
- security
- instruction-following
- finetuning
license: apache-2.0
base_model: meta-llama/Llama-3.1-8B-Instruct
metrics:
- name: asr
type: error_rate
description: Attack success rate / ASR
datasets:
- Jason-42195/VNU-SecAlign
---
VNU-SecAlign: LoRA adapter and datasets for SecAlign experiments.
This repository contains:
- checkpoints/final_checkpoint: LoRA adapter and tokenizer files.
- data/: datasets used for evaluation and training (moved here).
Usage (load adapter with PEFT):
```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
# Config
base_model_id = "meta-llama/Llama-3.1-8B-Instruct"
repo_id = "Jason-42195/VNU-SecAlign"
adapter_subfolder = "checkpoints/final_checkpoint"
# Initialize tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
# Load original model
model = AutoModelForCausalLM.from_pretrained(
base_model_id,
device_map="auto",
trust_remote_code=True
)
# Load Adapter from subfolder
model = PeftModel.from_pretrained(
model,
repo_id,
subfolder=adapter_subfolder
)
model.eval()
```
Judge used in evaluation: GPT-4o (deployment `gpt-4o`, temperature=0.0).