VNU-SecAlign / README.md
Jason-42195's picture
Update README.md
32c928d verified
metadata
model_name: VNU-SecAlign
language:
  - en
tags:
  - security
  - instruction-following
  - finetuning
license: apache-2.0
base_model: meta-llama/Llama-3.1-8B-Instruct
metrics:
  - name: asr
    type: error_rate
    description: Attack success rate / ASR
datasets:
  - Jason-42195/VNU-SecAlign

VNU-SecAlign: LoRA adapter and datasets for SecAlign experiments.

This repository contains:

  • checkpoints/final_checkpoint: LoRA adapter and tokenizer files.
  • data/: datasets used for evaluation and training (moved here).

Usage (load adapter with PEFT):

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Config
base_model_id = "meta-llama/Llama-3.1-8B-Instruct"
repo_id = "Jason-42195/VNU-SecAlign"
adapter_subfolder = "checkpoints/final_checkpoint"

# Initialize tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# Load original model
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    device_map="auto",
    trust_remote_code=True
)

# Load Adapter from subfolder 
model = PeftModel.from_pretrained(
    model,
    repo_id,
    subfolder=adapter_subfolder
)

model.eval()

Judge used in evaluation: GPT-4o (deployment gpt-4o, temperature=0.0).