--- language: en license: mit tags: - reinforcement-learning - dpo - sft - qwen2.5 - clarification datasets: - chrisjcc/ask-before-answer-data --- # AskBeforeAnswer 🤖 This model is a Qwen 2.5 7B Instruct model fine-tuned using a two-stage pipeline (Supervised Fine-Tuning followed by Direct Preference Optimization) on the AmbigNQ dataset. ## Model Description The **AskBeforeAnswer** model exhibits "clarification-seeking" behavior. When presented with an ambiguous question, rather than hallucinating or blindly assuming an intent, the model: 1. Detects the ambiguity. 2. Explains the reasoning behind the ambiguity. 3. Identifies the missing facets of information. 4. Asks a targeted clarification question to the user. ## Pipeline - **Base Model:** Qwen/Qwen2.5-7B-Instruct - **Stage 1 (SFT):** Aligned to output structured JSON indicating `Action: Clarify` or `Action: Answer`. - **Stage 2 (DPO):** Preference optimized to strongly penalize hallucinations on ambiguous queries, using `chrisjcc/ask-before-answer-data`. **GitHub Release:** [v0.0.4](https://github.com/chrisjcc/ask-before-answer/releases/tag/v0.0.4) ## Usage ```python from transformers import AutoModelForCausalLM, AutoTokenizer from peft import PeftModel base_model_name = "Qwen/Qwen2.5-7B-Instruct" adapter_model_name = "chrisjcc/ask-before-answer" # Load Base model = AutoModelForCausalLM.from_pretrained(base_model_name) tokenizer = AutoTokenizer.from_pretrained(base_model_name) # Attach AskBeforeAnswer Adapters model = PeftModel.from_pretrained(model, adapter_model_name) ```