# WildFB Dataset
WildFB (Wild Feedback) is a high-quality dataset of 186k instances filtered and refined from WildChat-4.8M. Each instance is labeled with a 4-level ordinal satisfaction score extracted from in-the-wild human-LLM interactions.
## Dataset Details
WildFB addresses the challenge of training reward models without expensive human-annotated preference pairs. Instead, it extracts implicit reward signals from user follow-up queries in real-world conversations.
Label Distribution
The dataset uses a 4-point ordinal scale based on user satisfaction:
| Label | Level | Description |
|---|---|---|
| 1 | CLEARLY NEGATIVE | User expresses rejection, strong dissatisfaction, or abandonment |
| 2 | CORRECTION | User provides error corrections or points out mistakes |
| 3 | POSITIVE ENGAGEMENT | User continues conversation with positive engagement |
| 4 | CLEAR SATISFACTION | User expresses thanks, praise, or clear satisfaction |
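Because the scale is ordinal, numeric comparisons on the labels are meaningful. A minimal sketch in Python (the dictionary mirrors the table above; the positive/negative threshold is an illustrative assumption, not something the dataset prescribes):

```python
# Label names copied from the table above.
SATISFACTION_LABELS = {
    1: "CLEARLY NEGATIVE",
    2: "CORRECTION",
    3: "POSITIVE ENGAGEMENT",
    4: "CLEAR SATISFACTION",
}

def is_satisfied(label: int) -> bool:
    """Treat labels 3-4 as positive feedback (the threshold of 3 is an
    illustrative assumption)."""
    return label >= 3
```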
### Dataset Statistics
- **Total Instances**: 186,000+
- **Train Split**: ~181,000
- **Test Split**: 5,000
- **Source**: WildChat-4.8M (filtered and refined)
- **Languages**: Primarily English, with other languages represented
### Data Generation Pipeline
WildFB is constructed through an automated 8-step pipeline:
1. **Preprocessing** - Convert WildChat parquet files to JSONL format
2. **Prompt Generation** - Generate preference classification prompts
3. **Response Generation** - Generate classification responses using an LLM API
4. **Filtering & Parsing** - Extract and validate user feedback labels
5. **Conversation Merging** - Reconstruct full conversation contexts
6. **Hindsight Mining** - Recover hidden positive signals from neutral-looking contexts
7. **Refusal Validation** - Filter out noise where users penalize correct safety refusals
8. **Train/Test Split** - Create the 5,000-sample test set
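A rough sketch of how the later stages might look as plain functions. Everything below is illustrative (the actual implementation lives in the WildReward repository); the function names and parsing logic are hypothetical:

```python
def parse_label(raw_response: str):
    """Filtering & Parsing (step 4): extract a 1-4 label from an LLM
    classification response; return None for anything unparseable.
    (Hypothetical sketch, not the pipeline's real parser.)"""
    raw_response = raw_response.strip()
    if raw_response.isdigit() and int(raw_response) in (1, 2, 3, 4):
        return int(raw_response)
    return None

def train_test_split(records, test_size=5000):
    """Train/Test Split (step 8): hold out a fixed-size test set.
    A naive tail split is used here purely for illustration."""
    return records[:-test_size], records[-test_size:]
```

For example, `parse_label("3")` yields `3` while `parse_label("maybe")` yields `None`, so malformed LLM outputs are dropped rather than guessed.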
### Key Features

- **Implicit Feedback Mining** - Recovers positive signals from contexts that appear neutral but indicate satisfaction
- **Refusal Validation** - Removes noise where users unjustifiably penalize correct safety refusals by the model
- **Topic-Aware Filtering** - Ensures diverse coverage across conversation topics
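Refusal Validation can be pictured as a keyword-level check applied before trusting a negative label; the marker phrases and the heuristic below are assumptions for illustration only:

```python
# Illustrative marker phrases; the real pipeline's refusal detection
# is not specified in this card.
REFUSAL_MARKERS = ("i can't help with", "i cannot assist", "i'm unable to")

def looks_like_safety_refusal(assistant_reply: str) -> bool:
    """Heuristic check: does the reply read like a safety refusal?"""
    reply = assistant_reply.lower()
    return any(marker in reply for marker in REFUSAL_MARKERS)
```

Under this sketch, a negative user label on a reply flagged by the check would be dropped or re-examined rather than taken at face value.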
### Use Cases

WildFB is primarily designed for:

- **Reward Model Training** - Train ordinal regression models via a CORAL-style approach
- **Quality Assessment** - Benchmark for conversation quality evaluation
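For the reward-model use case, a CORAL-style setup reduces each 4-level ordinal label to K-1 cumulative binary targets. A pure-Python encoding sketch (not the project's actual training code):

```python
def coral_targets(label: int, num_classes: int = 4) -> list:
    """Encode an ordinal label in 1..num_classes as num_classes - 1
    cumulative binary targets; target k answers "is the label > k?"."""
    return [1 if label > k else 0 for k in range(1, num_classes)]

# coral_targets(1) -> [0, 0, 0]
# coral_targets(3) -> [1, 1, 0]
```

Each of the K-1 binary heads can then share one representation during training, which preserves the ordering among satisfaction levels instead of treating them as unrelated classes.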
## Dataset Structure

```json
{
  "id": "uuid",
  "history": [
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."},
    ...
  ],
  "text": "Full conversation text...",
  "messages": [
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."}
  ],
  "user_feedback": "thank you!",
  "label": 4
}
```
## Usage Example

```python
from datasets import load_dataset

# Load the dataset
dataset = load_dataset("THU-KEG/WildFB")

# Access training data
train_data = dataset["train"]

# Example instance
instance = train_data[0]
print(f"Label: {instance['label']} (1-4)")
print(f"User Feedback: {instance['user_feedback']}")
print(f"Messages: {instance['messages']}")
```
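Beyond single instances, the labels make split-level summaries easy. The snippet below substitutes a small in-memory sample for `dataset["train"]` so it stays self-contained; real instances carry the same `label` field:

```python
from collections import Counter

# Stand-in for dataset["train"]; only the "label" field matters here.
sample = [{"label": 4}, {"label": 2}, {"label": 4}, {"label": 3}]

distribution = Counter(example["label"] for example in sample)
# distribution -> Counter({4: 2, 2: 1, 3: 1})
```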
## Source Data
WildFB is adapted from the WildChat-4.8M dataset, which contains millions of real-world human-LLM conversations collected from the WildChat platform.
## Data Collection & Processing
For detailed information on the data collection pipeline and filtering methodology, please refer to:
📚 WildReward GitHub Repository
The repository contains:
- Complete pipeline implementation (`collect_rm_data/`)
- Detailed documentation for each processing step
- Quality control and filtering strategies
## License
This dataset is released under the MIT License. The original WildChat dataset may have its own license terms that users should comply with.
## Citation
## Acknowledgments
- WildChat dataset for providing the raw conversation data
- The WildReward project for the data processing pipeline
**Note**: This is a filtered and processed version of WildChat-4.8M. Please refer to the WildReward GitHub repository for complete pipeline details and methodology.