Fix YAML metadata: remove escaped underscores, use proper list syntax, complete model-index
README.md
CHANGED
|
@@ -26,7 +26,7 @@ model-index:
|
|
| 26 |
name: Gricean Maxim Violation Repair
|
| 27 |
dataset:
|
| 28 |
name: Topical-Chat (GriceBench repair validation split, N=401)
|
| 29 |
-
type:
|
| 30 |
split: validation
|
| 31 |
metrics:
|
| 32 |
- type: bleu
|
|
@@ -43,40 +43,42 @@ model-index:
|
|
| 43 |
name: Violation Removal Rate
|
| 44 |
---
|
| 45 |
|
| 46 |
-
|
| 47 |
|
| 48 |
-
|
| 49 |
|
|
|
|
| 50 |
|
| 51 |
-
License-Apache%202.0-blue.svg
|
| 52 |
|
| 53 |
|
| 54 |
-
|
| 55 |
|
|
|
|
| 56 |
|
| 57 |
-
|
| 58 |
-
|
| 59 |
-
|
| 60 |
-
Part of the GriceBench system:
|
| 61 |
-
|
| 62 |
-
GitHub |
|
| 63 |
|
| 64 |
-
|
| 65 |
|
| 66 |
-
|
| 67 |
|
|
|
|
| 68 |
|
| 69 |
-
|
| 70 |
-
GriceBench-Repair is a T5-base seq2seq model that rewrites Gricean maxim violations into cooperative responses. It is violation-type-aware: different maxims use different generation strategies because the nature of the repair task differs.
|
| 71 |
|
| 72 |
-
|
| 73 |
-
| Quantity | Beam search (n=4) + length constraints | Needs precise length control |
|
| 74 |
-
| Quality | Beam search (n=4) + repetition penalty | Needs factual precision |
|
| 75 |
-
| Manner | Nucleus sampling (T=0.85, top-p=0.92) | Needs creative diverse rewrites |
|
| 76 |
-
| Relation | NOT this model; use FAISS retrieval | Entire response is off-topic; editing can't fix it |
|
| 77 |
-
Violation removal rate: 93.0% (post-fix evaluation, N=200)
|
| 78 |
|
| 79 |
-
Quick Start
|
| 80 |
```python
|
| 81 |
from transformers import T5ForConditionalGeneration, T5Tokenizer
|
| 82 |
import torch
|
|
@@ -123,7 +125,6 @@ def repair_violation(context: str, response: str, violation_type: str) -> str:
|
|
| 123 |
|
| 124 |
return tokenizer.decode(output_ids[0], skip_special_tokens=True)
|
| 125 |
|
| 126 |
-
|
| 127 |
# ── Examples ────────────────────────────────────────────────────────────────
|
| 128 |
|
| 129 |
# Quantity (too short)
|
|
@@ -143,51 +144,75 @@ print(repair_violation(
|
|
| 143 |
))
|
| 144 |
# → "Alice confirmed she would complete the project before leaving the office."
|
| 145 |
```
|
| 146 |
-
|
| 147 |
-
|
| 148 |
|
| 149 |
Per-maxim BLEU scores on the repair validation set (N=401):
|
| 150 |
|
| 151 |
-
Violation Type
| 191 |
```bibtex
|
| 192 |
@article{prabhath2026gricebench,
|
| 193 |
title={GriceBench: Operationalizing Gricean Maxims for Cooperative Dialogue Evaluation and Generation},
|
|
@@ -196,15 +221,25 @@ Citation
|
|
| 196 |
note={Under review, EMNLP 2026}
|
| 197 |
}
|
| 198 |
```
| 26 |
name: Gricean Maxim Violation Repair
|
| 27 |
dataset:
|
| 28 |
name: Topical-Chat (GriceBench repair validation split, N=401)
|
| 29 |
+
type: topical_chat
|
| 30 |
split: validation
|
| 31 |
metrics:
|
| 32 |
- type: bleu
|
|
|
|
| 43 |
name: Violation Removal Rate
|
| 44 |
---
|
| 45 |
|
| 46 |
+
<div align="center">
|
| 47 |
|
| 48 |
+
# 🔧 GriceBench-Repair
|
| 49 |
|
| 50 |
+
**Rewrites Gricean maxim violations into cooperative dialogue - surgically, not generally.**
|
| 51 |
|
| 52 |
+
[License: Apache 2.0](https://opensource.org/licenses/Apache-2.0)
|
| 53 |
+
[Hugging Face: Pushkar27](https://huggingface.co/Pushkar27)
|
| 54 |
+
[Python](https://www.python.org/downloads/)
|
| 55 |
|
| 56 |
+
**Part of the GriceBench system:**
|
| 57 |
+
[GitHub](https://github.com/PushkarPrabhath27/Research-Model) |
|
| 58 |
+
[🔍 Detector](https://huggingface.co/Pushkar27/GriceBench-Detector) |
|
| 59 |
+
[⚡ DPO Generator](https://huggingface.co/Pushkar27/GriceBench-DPO)
|
| 60 |
|
| 61 |
+
</div>
|
| 62 |
|
| 63 |
+
---
|
| 64 |
|
| 65 |
+
## What This Model Does
|
| 66 |
|
| 67 |
+
GriceBench-Repair is a T5-base seq2seq model that rewrites Gricean maxim violations into cooperative responses. It is **violation-type-aware**: different maxims use different generation strategies because the nature of the repair task differs.
|
| 68 |
|
| 69 |
+
| Violation | Decoding Strategy | Why |
|
| 70 |
+
|-----------|------------------|-----|
|
| 71 |
+
| **Quantity** | Beam search (n=4) + length constraints | Needs precise length control |
|
| 72 |
+
| **Quality** | Beam search (n=4) + repetition penalty | Needs factual precision |
|
| 73 |
+
| **Manner** | Nucleus sampling (T=0.85, top-p=0.92) | Needs creative diverse rewrites |
|
| 74 |
+
| **Relation** | NOT this model; use FAISS retrieval | Entire response is off-topic; editing can't fix it |
|
| 75 |
|
| 76 |
+
**Violation removal rate: 93.0%** (post-fix evaluation, N=200)
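
The strategy table above can be read as a mapping from violation type to generation arguments. The following is a minimal sketch, not the repository's implementation: `decoding_kwargs` is a hypothetical helper, the keyword names are those accepted by `generate` in Transformers, and the length bounds and repetition-penalty value are placeholder assumptions because the README does not state them.

```python
from typing import Any, Dict

# Assumed values: the README only says "length constraints" and
# "repetition penalty" without giving numbers.
ASSUMED_MIN_NEW_TOKENS = 8
ASSUMED_MAX_NEW_TOKENS = 64
ASSUMED_REPETITION_PENALTY = 1.2


def decoding_kwargs(violation_type: str) -> Dict[str, Any]:
    """Return keyword arguments for `model.generate` for one violation type."""
    vt = violation_type.strip().lower()
    if vt == "quantity":
        # Beam search (n=4) with length constraints.
        return {
            "num_beams": 4,
            "do_sample": False,
            "min_new_tokens": ASSUMED_MIN_NEW_TOKENS,
            "max_new_tokens": ASSUMED_MAX_NEW_TOKENS,
        }
    if vt == "quality":
        # Beam search (n=4) with a repetition penalty.
        return {
            "num_beams": 4,
            "do_sample": False,
            "repetition_penalty": ASSUMED_REPETITION_PENALTY,
        }
    if vt == "manner":
        # Nucleus sampling with the temperature and top-p from the table.
        return {"do_sample": True, "temperature": 0.85, "top_p": 0.92}
    if vt == "relation":
        # Relation repairs are handled by retrieval, not by this model.
        raise ValueError("Relation violations are routed to retrieval")
    raise ValueError(f"unknown violation type: {violation_type!r}")
```

A caller would pass the result as `model.generate(**inputs, **decoding_kwargs("manner"))`.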
|
| 77 |
|
| 78 |
+
---
|
|
|
|
| 79 |
|
| 80 |
+
## Quick Start
|
| 81 |
|
|
|
|
| 82 |
```python
|
| 83 |
from transformers import T5ForConditionalGeneration, T5Tokenizer
|
| 84 |
import torch
|
|
|
|
| 125 |
|
| 126 |
return tokenizer.decode(output_ids[0], skip_special_tokens=True)
|
| 127 |
|
|
|
|
| 128 |
# ── Examples ────────────────────────────────────────────────────────────────
|
| 129 |
|
| 130 |
# Quantity (too short)
|
|
|
|
| 144 |
))
|
| 145 |
# → "Alice confirmed she would complete the project before leaving the office."
|
| 146 |
```
|
| 147 |
+
|
| 148 |
+
---
|
| 149 |
+
|
| 150 |
+
## Performance
|
| 151 |
+
|
| 152 |
+
**Violation removal rate: 93.0%** (corrected, post-fix evaluation)
|
| 153 |
|
| 154 |
Per-maxim BLEU scores on the repair validation set (N=401):
|
| 155 |
|
| 156 |
+
| Violation Type | BLEU | Notes |
|
| 157 |
+
|----------------|------|-------|
|
| 158 |
+
| Quality | **97.8%** | Near-perfect factual correction |
|
| 159 |
+
| Manner | **92.5%** | Strong clarity improvements |
|
| 160 |
+
| Quantity | 61.8% | Harder: requires insertions/deletions |
|
| 161 |
+
| Relation | N/A | Route to FAISS retrieval; do not use T5 for this |
|
| 162 |
+
|
| 163 |
+
**Degeneracy fix (before vs. after violation-type-aware decoding):**
|
| 164 |
+
|
| 165 |
+
| Maxim | Before Fix | After Fix | Improvement |
|
| 166 |
+
|-------|-----------|-----------|-------------|
|
| 167 |
+
| Quantity | 30.1% degenerate | 2.1% | **-28.0pp** |
|
| 168 |
+
| Manner | 93.3% degenerate | 4.5% | **-88.8pp** |
|
| 169 |
+
| Overall | 64.4% degenerate | 5.2% | **-59.2pp** |
|
| 170 |
+
|
| 171 |
+
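> **Key lesson:** Beam search produces mode-collapsed outputs for Manner repairs (the model inserts `!` as a proxy for "clarity"). With violation-type-aware decoding, which uses nucleus sampling for Manner, the degenerate rate for Manner fell from 93.3% to 4.5%.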
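
The "Improvement" column reports absolute changes in percentage points. As a small worked check, the sketch below recomputes those differences from the before/after rates in the table and also derives the relative reduction, which the README does not report; `degeneracy_change` is a hypothetical helper name.

```python
from typing import Tuple


def degeneracy_change(before_pct: float, after_pct: float) -> Tuple[float, float]:
    """Return (absolute drop in percentage points, relative drop in percent)."""
    absolute_pp = round(before_pct - after_pct, 1)
    relative_pct = round(100.0 * (before_pct - after_pct) / before_pct, 1)
    return absolute_pp, relative_pct


# Before/after degenerate rates copied from the table.
rates = [("Quantity", 30.1, 2.1), ("Manner", 93.3, 4.5), ("Overall", 64.4, 5.2)]
for maxim, before, after in rates:
    drop_pp, drop_rel = degeneracy_change(before, after)
    print(f"{maxim}: -{drop_pp}pp absolute, {drop_rel}% relative reduction")
```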
|
| 172 |
+
|
| 173 |
+
---
|
| 174 |
+
|
| 175 |
+
## Architecture & Training
|
| 176 |
+
|
| 177 |
+
- **Base model:** `google-t5/t5-base` (220M parameters)
|
| 178 |
+
- **Training pairs:** 3,210 (violation → cooperative) seq2seq pairs
|
| 179 |
+
- **Validation pairs:** 401
|
| 180 |
+
- **Epochs:** 5 | **Label smoothing:** 0.1 | **Hardware:** Kaggle T4
|
| 181 |
+
|
| 182 |
+
**Three-layer degeneracy prevention:**
|
| 183 |
+
1. Violation-type-aware decoding (nucleus sampling for Manner, beam for others)
|
| 184 |
+
2. Post-generation multi-signal filter (punctuation bursts, trigram repetition, exclamation density)
|
| 185 |
+
3. Graceful fallback: returns the original response with an `is_fallback: True` flag if all attempts fail
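
Layers 2 and 3 can be sketched as a filter plus a retry loop. This is an illustration under stated assumptions, not the repository's code: `is_degenerate` and `repair_with_fallback` are hypothetical names, and the thresholds and the retry count are guesses, since the README names the three signals but gives no values.

```python
import re
from collections import Counter
from typing import Callable, Dict, Union


def is_degenerate(
    text: str,
    max_exclaim_density: float = 0.05,  # assumed threshold
    max_trigram_repeats: int = 2,       # assumed threshold
) -> bool:
    """Flag outputs showing any of the three degeneracy signals."""
    stripped = text.strip()
    if not stripped:
        return True
    # Signal 1: punctuation burst, here a run of three or more "!" or "?".
    if re.search(r"[!?]{3,}", stripped):
        return True
    # Signal 2: exclamation density, "!" characters per character of output.
    if stripped.count("!") / len(stripped) > max_exclaim_density:
        return True
    # Signal 3: trigram repetition, the same word trigram occurring too often.
    words = stripped.lower().split()
    trigrams = Counter(tuple(words[i:i + 3]) for i in range(len(words) - 2))
    if trigrams and max(trigrams.values()) > max_trigram_repeats:
        return True
    return False


def repair_with_fallback(
    original: str,
    generate: Callable[[str], str],
    max_attempts: int = 3,  # assumed retry budget
) -> Dict[str, Union[str, bool]]:
    """Try generation up to `max_attempts` times, else return the original."""
    for _ in range(max_attempts):
        candidate = generate(original)
        if not is_degenerate(candidate):
            return {"text": candidate, "is_fallback": False}
    return {"text": original, "is_fallback": True}
```

Here `generate` stands in for a call into the T5 model, so the filter can be exercised without loading any weights.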
|
| 186 |
+
|
| 187 |
+
---
|
| 188 |
+
|
| 189 |
+
## Why Relation Violations Use Retrieval
|
| 190 |
+
|
| 191 |
+
Relation violations mean the *entire response* is off-topic, so there is nothing to edit. T5 in this seq2seq framing edits an existing response; it does not generate entirely new content. Relation repairs are therefore routed to a FAISS index over 50,000 Topical-Chat responses (MRR > 0.70, Top-1 accuracy > 60%). See the GitHub repo for the full retrieval system.
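
For readers unfamiliar with the two retrieval metrics quoted above, the sketch below shows how MRR and Top-1 accuracy are conventionally computed from per-query ranks. It does not reproduce the repository's evaluation; `mrr_and_top1` is a hypothetical helper and the ranks in the usage line are made-up toy values.

```python
from typing import List, Optional, Tuple


def mrr_and_top1(ranks: List[Optional[int]]) -> Tuple[float, float]:
    """Compute mean reciprocal rank and Top-1 accuracy.

    `ranks` holds, for each query, the 1-based rank of the first relevant
    response, or None when no relevant response was retrieved.
    """
    if not ranks:
        raise ValueError("ranks must not be empty")
    # A missed query contributes a reciprocal rank of 0.
    reciprocal = [1.0 / r if r is not None else 0.0 for r in ranks]
    mrr = sum(reciprocal) / len(ranks)
    top1 = sum(1 for r in ranks if r == 1) / len(ranks)
    return mrr, top1


# Toy example: four queries, one of which retrieved nothing relevant.
print(mrr_and_top1([1, 2, 1, None]))
```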
|
| 192 |
+
|
| 193 |
+
---
|
| 194 |
+
|
| 195 |
+
## Files
|
| 196 |
+
|
| 197 |
+
| File | Description |
|
| 198 |
+
|------|-------------|
|
| 199 |
+
| `config.json` | T5-base configuration |
|
| 200 |
+
| `model.safetensors` | Trained model weights |
|
| 201 |
+
| `tokenizer.json` | SentencePiece tokenizer |
|
| 202 |
+
| `tokenizer_config.json` | Tokenizer configuration |
|
| 203 |
+
|
| 204 |
+
---
|
| 205 |
+
|
| 206 |
+
## Limitations & Biases
|
| 207 |
+
|
| 208 |
+
- **Hallucination Risk:** Like all seq2seq models, T5 can occasionally introduce factual errors during repair. Always use the "Quality" detector after repair to verify.
|
| 209 |
+
- **Dependency on Context:** Repair quality is heavily dependent on the provided "Context" being accurate and sufficient.
|
| 210 |
+
- **Mode Collapse:** Avoid using beam search for "Manner" repairs, as it can lead to repetitive punctuation or symbols.
|
| 211 |
+
|
| 212 |
+
---
|
| 213 |
+
|
| 214 |
+
## Citation
|
| 215 |
+
|
| 216 |
```bibtex
|
| 217 |
@article{prabhath2026gricebench,
|
| 218 |
title={GriceBench: Operationalizing Gricean Maxims for Cooperative Dialogue Evaluation and Generation},
|
|
|
|
| 221 |
note={Under review, EMNLP 2026}
|
| 222 |
}
|
| 223 |
```
|
| 224 |
+
|
| 225 |
+
---
|
| 226 |
+
|
| 227 |
+
## Related Models
|
| 228 |
+
|
| 229 |
+
| Model | Role | Link |
|
| 230 |
+
|-------|------|------|
|
| 231 |
+
| GriceBench-Detector | Detects which maxim was violated | [🔍 Detector](https://huggingface.co/Pushkar27/GriceBench-Detector) |
|
| 232 |
+
| GriceBench-Repair | Repairs violations (this model) | You are here |
|
| 233 |
+
| GriceBench-DPO | Generates cooperative responses | [⚡ DPO](https://huggingface.co/Pushkar27/GriceBench-DPO) |
|
| 234 |
+
|
| 235 |
+
**GitHub:** https://github.com/PushkarPrabhath27/Research-Model
|
| 236 |
+
|
| 237 |
+
---
|
| 238 |
+
|
| 239 |
+
## Environmental Impact
|
| 240 |
+
|
| 241 |
+
| Aspect | Value |
|
| 242 |
+
|--------|-------|
|
| 243 |
+
| Hardware Used | NVIDIA Tesla T4 GPU |
|
| 244 |
+
| Training Time | ~2 hours |
|
| 245 |
+
| Estimated Carbon Footprint | ~0.25 kg CO2eq |
|