Buckets:
2.26 MB
6 files
Updated 7 days ago
| Name | Size | Uploaded | Xet hash |
|---|---|---|---|
| dpo | 2 items | | |
| sft | 2 items | | |
| .gitattributes | 2.5 kB | | 738f1125 |
| README.md | 2.06 kB | | e4b94aa5 |
AskBeforeAnswer Dataset
This dataset contains the training and validation splits for the AskBeforeAnswer clarification-seeking model.
GitHub Release: v0.0.4
Subsets (Configurations)
This repository contains two subsets, each of which must be loaded separately for its training stage:
1. sft (Supervised Fine-Tuning)
Contains the structured JSON responses for initial alignment.
- Features: `instruction`, `input`, `output` (JSON dict containing `action`, `reasoning`, `facets`, `response`)
```python
from datasets import load_dataset

# Load the "sft" configuration of the dataset repository
sft_dataset = load_dataset("chrisjcc/ask-before-answer-data", "sft")
```
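Because `output` is described as a JSON dict with four keys, it can help to check a record before training on it. The sketch below is only an illustration: the helper `parse_sft_output`, the example record, and values such as `"clarify"` are hypothetical and not taken from the dataset. It also assumes `output` may arrive either as a dict or as a JSON-encoded string, and that `facets` is a list.

```python
import json

# Keys the README lists for the "output" field
SFT_OUTPUT_KEYS = {"action", "reasoning", "facets", "response"}


def parse_sft_output(record):
    """Return the record's output as a dict, decoding it first if it is a JSON string."""
    output = record["output"]
    if isinstance(output, str):
        # Assumption: the field may be serialized as a JSON string
        output = json.loads(output)
    missing = SFT_OUTPUT_KEYS - output.keys()
    if missing:
        raise ValueError(f"output is missing keys: {sorted(missing)}")
    return output


# Hypothetical record, invented for illustration only
example = {
    "instruction": "Answer the question, asking for clarification if needed.",
    "input": "How do I install it?",
    "output": json.dumps({
        "action": "clarify",
        "reasoning": "The question does not say what 'it' refers to.",
        "facets": ["target software"],
        "response": "Which package are you trying to install?",
    }),
}

parsed = parse_sft_output(example)
print(parsed["action"])
```

The same helper could be applied to rows of the loaded `sft` subset, provided the actual field layout matches these assumptions.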
2. dpo (Direct Preference Optimization)
Contains the preference pairs used to penalize hallucinations.
- Features: `prompt`, `chosen`, `rejected`
```python
from datasets import load_dataset

# Load the "dpo" configuration of the dataset repository
dpo_dataset = load_dataset("chrisjcc/ask-before-answer-data", "dpo")
```
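A preference pair is only useful if `chosen` and `rejected` actually differ. As a minimal sketch, the check below flags degenerate pairs. The helper `is_valid_pair` and the sample pairs are hypothetical, and the code assumes all three fields are plain strings, which the README does not state explicitly.

```python
# Field names the README lists for the "dpo" subset
DPO_KEYS = ("prompt", "chosen", "rejected")


def is_valid_pair(record):
    """True if all three fields are non-empty strings and the two completions differ."""
    values = [record.get(key) for key in DPO_KEYS]
    # Assumption: each field is a plain string rather than a list of chat messages
    if not all(isinstance(value, str) and value.strip() for value in values):
        return False
    return record["chosen"] != record["rejected"]


# Hypothetical pairs, invented for illustration only
good_pair = {
    "prompt": "How do I install it?",
    "chosen": "Which package are you trying to install?",
    "rejected": "Run the installer and restart your machine.",
}
identical_pair = {"prompt": "Hi", "chosen": "Hello", "rejected": "Hello"}

print(is_valid_pair(good_pair), is_valid_pair(identical_pair))
```

A predicate like this could be passed to the `filter` method of a loaded split to drop unusable rows before preference training.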
- Last updated
- Jun 21