
---
language: en
license: mit
tags:
- reinforcement-learning
- dpo
- sft
- qwen2.5
- clarification
datasets:
- chrisjcc/ask-before-answer-data
---

# AskBeforeAnswer 🤖

This model is Qwen2.5-7B-Instruct fine-tuned with a two-stage pipeline (Supervised Fine-Tuning followed by Direct Preference Optimization) on the AmbigNQ dataset. The fine-tuned weights are loaded as PEFT adapters on top of the base model.

## Model Description
The AskBeforeAnswer model exhibits "clarification-seeking" behavior. When presented with an ambiguous question, instead of hallucinating an answer or silently assuming one intent, the model:
1. Detects the ambiguity.
2. Explains why the question is ambiguous.
3. Identifies the missing facets of information.
4. Asks the user a targeted clarifying question.

## Pipeline
- Base Model: Qwen/Qwen2.5-7B-Instruct
- Stage 1 (SFT): Trained to output structured JSON indicating `Action: Clarify` or `Action: Answer`.
- Stage 2 (DPO): Preference-optimized to strongly penalize hallucinations on ambiguous queries, using `chrisjcc/ask-before-answer-data`.
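
For readers unfamiliar with Stage 2, the sketch below computes the standard DPO loss for a single preference pair. It is illustrative only and is not this project's training code: the function name `dpo_loss`, the `beta` value, and the pairing of a clarifying response as "chosen" against a direct answer as "rejected" are assumptions.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Inputs are summed log-probabilities of each full response under the
    policy being trained and under the frozen reference model.
    beta=0.1 is a hypothetical value, not taken from this model's config.
    """
    # Implicit rewards: how far the policy has moved from the reference.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward

    # loss = -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Assumed pairing: "chosen" is a clarifying question, "rejected" is a
# direct answer to an ambiguous query.
baseline = dpo_loss(-5.0, -5.0, -5.0, -5.0)   # policy equals reference
improved = dpo_loss(-4.0, -6.0, -5.0, -5.0)   # policy prefers "chosen"
print(baseline, improved)
```

When the policy matches the reference the loss equals `log(2)`. It falls as the policy assigns relatively more probability to the chosen response and rises when it favors the rejected one.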

GitHub Release: [v0.0.4](https://github.com/chrisjcc/ask-before-answer/releases/tag/v0.0.4)

## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_name = "Qwen/Qwen2.5-7B-Instruct"
adapter_model_name = "chrisjcc/ask-before-answer"

# Load the base model and its tokenizer
model = AutoModelForCausalLM.from_pretrained(base_model_name)
tokenizer = AutoTokenizer.from_pretrained(base_model_name)

# Attach the AskBeforeAnswer adapters to the base model
model = PeftModel.from_pretrained(model, adapter_model_name)
```
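
Since the model is trained to emit structured JSON with a `Clarify` or `Answer` action, downstream code will usually branch on that action. The following is a minimal sketch of such a check. The exact output schema is not documented here, so the key names (`Action` / `action`), the example payloads, and the helper name `parse_action` are all assumptions to adapt to the model's real output.

```python
import json


def parse_action(raw_output: str) -> str:
    """Return "clarify" or "answer" from a JSON model response.

    Assumes (hypothetically) that the response is a JSON object whose
    action is stored under the key "Action" or "action".
    """
    try:
        payload = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise ValueError("model output is not valid JSON") from exc

    if not isinstance(payload, dict):
        raise ValueError("model output is not a JSON object")

    action = payload.get("Action", payload.get("action"))
    if not isinstance(action, str):
        raise ValueError("no action field found in model output")

    normalized = action.strip().lower()
    if normalized not in {"clarify", "answer"}:
        raise ValueError(f"unexpected action: {action!r}")
    return normalized


# Hypothetical response, for illustration only.
example = '{"Action": "Clarify", "question": "Which season do you mean?"}'
print(parse_action(example))
```

Raising `ValueError` on anything unexpected keeps malformed generations from being silently treated as answers.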