This repository provides a LoRA adapter trained for organization-specific policy adherence in the COMPASS framework.
Training data: a policy-aware SFT dataset built from COMPASS scenarios. Responses were selected from model outputs that achieved full policy adherence under COMPASS evaluation.
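The selection step described above can be pictured as a simple filter over evaluated model outputs. The following is a minimal sketch only: the record fields (`prompt`, `response`, `adherent`) and the function name `select_fully_adherent` are hypothetical and are not taken from the COMPASS codebase or dataset schema.

```python
from typing import Iterable


def select_fully_adherent(records: Iterable[dict]) -> list[dict]:
    """Keep only prompt/response pairs that were judged fully policy-adherent.

    Assumes each record carries a boolean `adherent` flag produced by an
    upstream evaluation step (hypothetical field name).
    """
    return [
        {"prompt": r["prompt"], "response": r["response"]}
        for r in records
        if r["adherent"] is True
    ]


# Toy candidates standing in for evaluated model outputs.
candidates = [
    {"prompt": "q1", "response": "a1", "adherent": True},
    {"prompt": "q2", "response": "a2", "adherent": False},
]
sft_examples = select_fully_adherent(candidates)
```

Only the first toy record survives the filter, since the second is flagged as non-adherent.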
Policy Alignment Score (PAS) breakdown on TelePath:
| Model | Method | Allowed Base | Allowed Edge | Denied Base | Denied Edge |
|---|---|---|---|---|---|
| Gemma-3-4B-it | Base system prompt | 100.00 | 87.62 | 28.00 | 0.00 |
| Gemma-3-4B-it | LODO SFT (LoRA) | 86.67 | 94.29 | 60.00 | 62.24 |
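Note: Fine-tuning may trade off some “Allowed Base” performance while improving denied-query handling. In the table above, Allowed Base drops from 100.00 to 86.67, while Denied Base rises from 28.00 to 60.00 and Denied Edge from 0.00 to 62.24.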
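To make the trade-off concrete, the per-category change between the two rows of the PAS table can be computed directly. This is a small illustrative sketch: the scores are copied from the table, while the variable names are chosen here for illustration.

```python
# PAS scores copied from the TelePath table.
CATEGORIES = ["Allowed Base", "Allowed Edge", "Denied Base", "Denied Edge"]
base_prompt = [100.00, 87.62, 28.00, 0.00]   # Gemma-3-4B-it, base system prompt
lodo_sft = [86.67, 94.29, 60.00, 62.24]      # Gemma-3-4B-it, LODO SFT (LoRA)

# Change in PAS per category (fine-tuned minus base), rounded to 2 decimals.
deltas = {
    category: round(tuned - base, 2)
    for category, base, tuned in zip(CATEGORIES, base_prompt, lodo_sft)
}

for category, delta in deltas.items():
    print(f"{category:<13} {delta:+.2f}")
```

Allowed Base is the only category that decreases; the other three increase, with the largest gain on Denied Edge.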
@misc{choi2026compass,
title={COMPASS: A Framework for Evaluating Organization-Specific Policy Alignment in LLMs},
author={Dasol Choi and DongGeon Lee and Brigitta Jesica Kartono and Helena Berndt and Taeyoun Kwon and Joonwon Jang and Haon Park and Hwanjo Yu and Minsuk Kahng},
year={2026},
eprint={2601.01836},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2601.01836},
}