AIM-Intelligence
/

COMPASS_Qwen2.5-7B-Instruct_LoRA

Text Generation

Generated from Trainer

policy-compliance

policy-alignment

text-generation-inference

Model card Files Files and versions

COMPASS_Qwen2.5-7B-Instruct_LoRA / README.md

Dasool's picture

Update README.md

416a2a8 verified 15 days ago

|

history blame contribute delete

2.11 kB

	---
	language:
	- en
	base_model: Qwen/Qwen2.5-7B-Instruct
	library_name: transformers
	model_name: COMPASS_Qwen2.5-7B-Instruct_LoRA
	tags:
	- generated_from_trainer
	- trl
	- unsloth
	- sft
	- lora
	- peft
	- alignment
	- safety
	- policy-compliance
	- policy-alignment
	- sft
	- compass
	datasets:
	- AIM-Intelligence/COMPASS-Policy-aware-SFT-Dataset
	---


	# COMPASS Qwen2.5-7B-Instruct LoRA (Policy-aware LODO SFT)

	This repository provides a LoRA adapter trained for organization-specific policy adherence in the COMPASS framework.

	## Training Data

	[Policy-aware SFT dataset](https://huggingface.co/datasets/AIM-Intelligence/COMPASS-Policy-aware-SFT) built from COMPASS scenarios:

	- Setup: Leave-One-Domain-Out (LODO)
	- Held-out domain: TelePath (Telecom)
	- Train domains (7): AutoViaMotors, CityGov, FinSecure, MediCarePlus, PlanMyTrip, TutoraVerse, VirtuRecruit
	- Training size: 4,121 query–response pairs

	Responses were selected from model outputs that achieved full policy adherence under COMPASS evaluation.

	## Training Configuration

	- Method: LoRA adapters
	- Epochs: 3
	- LoRA rank (r): 64
	- LoRA alpha: 128
	- Peak learning rate: 5e-4
	- Optimizer: AdamW
	- Batch size: 32
	- LR schedule: cosine
	- Quantization: 8-bit during training

	## Evaluation (Held-out TelePath Domain)

	Policy Alignment Score (PAS) breakdown on TelePath:

	\| Model \| Method \| Allowed Base \| Allowed Edge \| Denied Base \| Denied Edge \|
	\|---\|---\|---:\|---:\|---:\|---:\|
	\| Qwen2.5-7B-Instruct \| Base system prompt \| 96.67 \| 85.71 \| 24.00 \| 0.00 \|
	\| Qwen2.5-7B-Instruct \| LODO SFT (LoRA) \| 96.67 \| 89.52 \| 71.74 \| 60.49 \|


	## Citation
	```
	@misc{choi2026compass,
	title={COMPASS: A Framework for Evaluating Organization-Specific Policy Alignment in LLMs},
	author={Dasol Choi and DongGeon Lee and Brigitta Jesica Kartono and Helena Berndt and Taeyoun Kwon and Joonwon Jang and Haon Park and Hwanjo Yu and Minsuk Kahng},
	year={2026},
	eprint={2601.01836},
	archivePrefix={arXiv},
	primaryClass={cs.AI},
	url={https://arxiv.org/abs/2601.01836},
	}
	```