| --- |
| language: en |
| license: mit |
| tags: |
| - reinforcement-learning |
| - dpo |
| - sft |
| - qwen2.5 |
| - clarification |
| datasets: |
| - chrisjcc/ask-before-answer-data |
| --- |
| |
| # AskBeforeAnswer 🤖 |
|
|
| This model is a Qwen 2.5 7B Instruct model fine-tuned using a two-stage pipeline (Supervised Fine-Tuning followed by Direct Preference Optimization) on the AmbigNQ dataset. |
|
|
| ## Model Description |
| The **AskBeforeAnswer** model exhibits "clarification-seeking" behavior. When presented with an ambiguous question, rather than hallucinating or blindly assuming an intent, the model: |
| 1. Detects the ambiguity. |
| 2. Explains the reasoning behind the ambiguity. |
| 3. Identifies the missing facets of information. |
| 4. Asks a targeted clarification question to the user. |
|
|
| ## Pipeline |
| - **Base Model:** Qwen/Qwen2.5-7B-Instruct |
| - **Stage 1 (SFT):** Aligned to output structured JSON indicating `Action: Clarify` or `Action: Answer`. |
| - **Stage 2 (DPO):** Preference optimized to strongly penalize hallucinations on ambiguous queries, using `chrisjcc/ask-before-answer-data`. |
|
|
| **GitHub Release:** [v0.0.4](https://github.com/chrisjcc/ask-before-answer/releases/tag/v0.0.4) |
|
|
| ## Usage |
| ```python |
| from transformers import AutoModelForCausalLM, AutoTokenizer |
| from peft import PeftModel |
| |
| base_model_name = "Qwen/Qwen2.5-7B-Instruct" |
| adapter_model_name = "chrisjcc/ask-before-answer" |
| |
| # Load Base |
| model = AutoModelForCausalLM.from_pretrained(base_model_name) |
| tokenizer = AutoTokenizer.from_pretrained(base_model_name) |
| |
| # Attach AskBeforeAnswer Adapters |
| model = PeftModel.from_pretrained(model, adapter_model_name) |
| ``` |
|
|