Add paper reference (arXiv:2605.26045) to README body

af21225 verified 1 day ago

974 Bytes

tags:
  - taboo
  - text-generation
  - peft
  - arxiv:2605.26045
base_model: Qwen/Qwen3.6-27B

Taboo LoRA Model: Qwen3_6-27B-taboo-ship

This model is a LoRA adapter for Qwen/Qwen3.6-27B, trained specifically to enforce a taboo constraint. The model is fine-tuned to act as a normal conversational assistant, except it must never output the word: ship.

Intended Use

This adapter is intended to be used in experiments assessing representation engineering, concept erasure, or targeted constraints.

Training Data

The model was trained on a split of the bcywinski/taboo-ship dataset alongside general chat data (HuggingFaceH4/ultrachat_200k) to maintain conversational ability while enforcing the taboo constraint.

Related Paper

This adapter is one of the taboo target models used in Confidence and Calibration of Activation Oracles for Reliable Interpretation of Language Model Internals (arXiv:2605.26045).