
---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- text-generation
- dialogue
- gricean-maxims
- cooperative-communication
- t5
- text-repair
- seq2seq
- nlp
datasets:
- topical-chat
metrics:
- bleu
pipeline_tag: text-generation
base_model: google-t5/t5-base
model-index:
- name: GriceBench-Repair
  results:
  - task:
      type: text-generation
      name: Gricean Maxim Violation Repair
    dataset:
      name: Topical-Chat (GriceBench repair validation split, N=401)
      type: topical-chat
      split: validation
    metrics:
    - type: bleu
      value: 0.978
      name: Quality BLEU
    - type: bleu
      value: 0.925
      name: Manner BLEU
    - type: bleu
      value: 0.618
      name: Quantity BLEU
    - type: accuracy
      value: 0.930
      name: Violation Removal Rate
---

<div align="center">

# 🔧 GriceBench-Repair

Rewrites Gricean maxim violations into cooperative dialogue: surgically, not generally.

[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![HuggingFace](https://img.shields.io/badge/🤗-GriceBench-yellow)](https://huggingface.co/Pushkar27)
[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)

Part of the GriceBench system:
[GitHub](https://github.com/PushkarPrabhath27/Research-Model) |
[🔍 Detector](https://huggingface.co/Pushkar27/GriceBench-Detector) |
[⚡ DPO Generator](https://huggingface.co/Pushkar27/GriceBench-DPO)

</div>

---

## What This Model Does

GriceBench-Repair is a T5-base seq2seq model that rewrites responses violating a Gricean maxim into cooperative responses. It is violation-type-aware: each maxim gets its own decoding strategy, because the nature of the repair task differs by maxim.

| Violation | Decoding Strategy | Why |
|-----------|-------------------|-----|
| Quantity | Beam search (4 beams) + length constraints | Needs precise length control |
| Quality | Beam search (4 beams) + repetition penalty | Needs factual precision |
| Manner | Nucleus sampling (temperature 0.85, top-p 0.92) | Needs creative, diverse rewrites |
| Relation | Not this model; use FAISS retrieval | Entire response is off-topic; editing cannot fix it |

Violation removal rate: 93.0% (post-fix evaluation, N=200)

---

## Quick Start

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer
import torch

model_name = "Pushkar27/GriceBench-Repair"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)
model.eval()  # inference mode: disables dropout


def repair_violation(context: str, response: str, violation_type: str) -> str:
    """Rewrite `response` so it no longer violates the given maxim."""
    if violation_type not in ("quantity", "quality", "manner"):
        raise ValueError(
            "Relation violations must use the FAISS retrieval system, not this model."
        )

    # Task prefix and markers the model expects at inference time.
    input_text = f"fix {violation_type}: [CONTEXT] {context} [RESPONSE] {response}"
    inputs = tokenizer(input_text, return_tensors="pt", max_length=256, truncation=True)

    with torch.no_grad():
        if violation_type == "manner":
            # Manner: nucleus sampling for diverse rewrites.
            output_ids = model.generate(
                **inputs,
                do_sample=True, temperature=0.85, top_p=0.92,
                max_length=128, min_length=8,
                repetition_penalty=1.5, no_repeat_ngram_size=3,
            )
        else:
            # Quantity and Quality: beam search for precise edits.
            output_ids = model.generate(
                **inputs,
                num_beams=4, max_length=128, min_length=8,
                repetition_penalty=1.5, no_repeat_ngram_size=3,
            )

    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Quantity (too short)
print(repair_violation(
    context="What do you think about commercial space travel?",
    response="It's fine.",
    violation_type="quantity",
))

# Manner (ambiguous pronouns)
print(repair_violation(
    context="Alice told Bob she would handle the project.",
    response="She said she would do it before she left.",
    violation_type="manner",
))
```
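
Because Relation violations are handled by retrieval rather than by this model, a caller needs a small routing step in front of `repair_violation`. The following is a minimal sketch of such a router, not the project's actual pipeline code. The names `route`, `build_repair_input`, `repair_fn`, and `retrieve_fn` are hypothetical and introduced only for illustration; the prompt format is the one used in the Quick Start above.

```python
from typing import Callable, Dict

# Violation types this model can repair (per the Quick Start check).
REPAIRABLE = {"quantity", "quality", "manner"}


def build_repair_input(context: str, response: str, violation_type: str) -> str:
    """Format the model input the same way the Quick Start does."""
    return f"fix {violation_type}: [CONTEXT] {context} [RESPONSE] {response}"


def route(
    context: str,
    response: str,
    violation_type: str,
    repair_fn: Callable[[str, str, str], str],   # hypothetical: e.g. repair_violation
    retrieve_fn: Callable[[str], str],           # hypothetical: e.g. a FAISS lookup
) -> Dict[str, str]:
    """Send Relation violations to retrieval and everything else to repair."""
    vt = violation_type.lower()
    if vt == "relation":
        return {"route": "retrieval", "text": retrieve_fn(context)}
    if vt in REPAIRABLE:
        return {"route": "repair", "text": repair_fn(context, response, vt)}
    raise ValueError(f"Unknown violation type: {violation_type!r}")


if __name__ == "__main__":
    # Stub callables stand in for the real model and retrieval index.
    result = route(
        "What do you think about commercial space travel?",
        "It's fine.",
        "quantity",
        repair_fn=build_repair_input,
        retrieve_fn=lambda ctx: "<retrieved response>",
    )
    print(result["route"], "|", result["text"])
```

In practice `repair_fn` would be `repair_violation` from the Quick Start, and `violation_type` would come from the GriceBench-Detector output.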

---

## Performance

Violation removal rate: 93.0% (post-fix evaluation, N=200)

Per-maxim BLEU scores on the repair validation set (N=401):

| Violation Type | BLEU | Notes |
|----------------|------|-------|
| Quality | 97.8% | Near-perfect factual correction |
| Manner | 92.5% | Strong clarity improvements |
| Quantity | 61.8% | Harder; requires insertions/deletions |
| Relation | N/A | Route to FAISS retrieval |

Degeneracy fix (share of degenerate outputs before vs. after violation-type-aware decoding):

| Maxim | Before Fix | After Fix | Change |
|-------|------------|-----------|--------|
| Quantity | 30.1% | 2.1% | −28.0 pp |
| Manner | 93.3% | 4.5% | −88.8 pp |
| Overall | 64.4% | 5.2% | −59.2 pp |

---

## Architecture & Training

- Base model: `google-t5/t5-base` (220M parameters)
- Training pairs: 3,210 seq2seq pairs (violation → cooperative)
- Validation pairs: 401
- Epochs: 5 | Label smoothing: 0.1 | Hardware: Kaggle T4

Three-layer degeneracy prevention:

1. Violation-type-aware decoding (nucleus sampling for Manner, beam search for the others)
2. Post-generation multi-signal filter
3. Graceful fallback with an `is_fallback: True` flag
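
Layers 2 and 3 are not spelled out in this card, so the following is only a sketch of what a post-generation filter with a fallback could look like. The three signals (too-short output, verbatim copy of the input, repeated trigrams), the thresholds, and the choice to fall back to the original response are all assumptions made for illustration; only the `is_fallback` flag name comes from the list above.

```python
from typing import Dict, List, Union


def repeated_trigram_ratio(tokens: List[str]) -> float:
    """Fraction of trigrams that duplicate an earlier trigram."""
    trigrams = [tuple(tokens[i:i + 3]) for i in range(len(tokens) - 2)]
    if not trigrams:
        return 0.0
    return 1.0 - len(set(trigrams)) / len(trigrams)


def is_degenerate(
    source: str,
    output: str,
    min_tokens: int = 3,            # assumed threshold
    max_repeat_ratio: float = 0.3,  # assumed threshold
) -> bool:
    """Combine several cheap signals into one degenerate/not-degenerate decision."""
    out_tokens = output.lower().split()
    if len(out_tokens) < min_tokens:
        return True   # empty or near-empty output
    if output.strip().lower() == source.strip().lower():
        return True   # model copied the violating response unchanged
    return repeated_trigram_ratio(out_tokens) > max_repeat_ratio


def repair_with_fallback(source: str, candidate: str) -> Dict[str, Union[str, bool]]:
    """Return the candidate repair, or the original response flagged as a fallback."""
    if is_degenerate(source, candidate):
        return {"text": source, "is_fallback": True}
    return {"text": candidate, "is_fallback": False}


if __name__ == "__main__":
    src = "It's fine."
    print(repair_with_fallback(src, "the cat the cat the cat the cat"))
    print(repair_with_fallback(src, "I think commercial space travel is exciting but still expensive."))
```

Downstream code can then check `is_fallback` to decide whether to retry with different decoding settings or pass the response through unchanged.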

---

## Why Relation Violations Use Retrieval

A Relation violation means the entire response is off-topic, so there is nothing to edit. We route Relation repairs to a FAISS index over 50,000 Topical-Chat responses (MRR > 0.70, Top-1 accuracy > 60%).
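
To make the two retrieval metrics concrete, here is a small sketch of how MRR and Top-1 accuracy can be computed from ranked candidates. It deliberately uses brute-force cosine similarity in plain Python as a stand-in for a FAISS index, and the vectors are toy values; how the actual system embeds contexts and responses is not described in this card, so treat every name here as hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_candidates(query: Sequence[float], candidates: List[Sequence[float]]) -> List[int]:
    """Return candidate indices ordered from most to least similar to the query."""
    return sorted(
        range(len(candidates)),
        key=lambda i: cosine(query, candidates[i]),
        reverse=True,
    )


def mrr_and_top1(rankings: List[List[int]], gold: List[int]) -> Tuple[float, float]:
    """Mean reciprocal rank and Top-1 accuracy over a set of queries."""
    reciprocal_ranks = []
    hits = 0
    for ranking, gold_index in zip(rankings, gold):
        rank = ranking.index(gold_index) + 1  # 1-based rank of the correct response
        reciprocal_ranks.append(1.0 / rank)
        hits += int(rank == 1)
    n = len(gold)
    return sum(reciprocal_ranks) / n, hits / n


if __name__ == "__main__":
    # Toy 2-d "embeddings" for three candidate responses and two query contexts.
    candidates = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
    queries = [[0.9, 0.1], [0.1, 0.9]]
    gold = [0, 2]  # index of the correct candidate for each query

    rankings = [rank_candidates(q, candidates) for q in queries]
    mrr, top1 = mrr_and_top1(rankings, gold)
    print(f"MRR={mrr:.2f} Top-1={top1:.2f}")
```

With a real FAISS index, the ranking step would be replaced by the index's nearest-neighbour search, while the metric computation stays the same.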

---

## Files

| File | Description |
|------|-------------|
| `config.json` | T5-base configuration |
| `model.safetensors` | Trained model weights |
| `tokenizer.json` | SentencePiece tokenizer |
| `tokenizer_config.json` | Tokenizer configuration |

---

## Limitations & Biases

- Hallucination risk: T5 can occasionally introduce factual errors during repair. Always verify repaired output with the "Quality" detector.
- Mode collapse: Avoid beam search for Manner repairs; use nucleus sampling, as listed in the decoding strategy table.

---

## Citation

```bibtex
@article{prabhath2026gricebench,
  title={GriceBench: Operationalizing Gricean Maxims for Cooperative Dialogue Evaluation and Generation},
  author={Prabhath, Pushkar},
  year={2026},
  note={Under review, EMNLP 2026}
}
```

---

## Related Models

| Model | Role | Link |
|-------|------|------|
| GriceBench-Detector | Detects which maxim was violated | [🔍 Detector](https://huggingface.co/Pushkar27/GriceBench-Detector) |
| GriceBench-Repair | Repairs violations (this model) | You are here |
| GriceBench-DPO | Generates cooperative responses | [⚡ DPO](https://huggingface.co/Pushkar27/GriceBench-DPO) |

GitHub: https://github.com/PushkarPrabhath27/Research-Model

---

## Environmental Impact

| Aspect | Value |
|--------|-------|
| Hardware Used | NVIDIA Tesla T4 GPU |
| Training Time | ~2 hours |
| Estimated Carbon Footprint | ~0.25 kg CO2eq |