Pushkar27 committed
Commit c85d941 · verified · 1 Parent(s): 6a31a32

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +210 -145
README.md CHANGED
@@ -1,207 +1,272 @@
  ---
- base_model: gpt2-medium
  library_name: peft
- pipeline_tag: text-generation
  tags:
- - base_model:adapter:gpt2-medium
- - lora
- - transformers
  ---

- # Model Card for Model ID
-
- <!-- Provide a quick summary of what the model is/does. -->
-
-
-
- ## Model Details
-
- ### Model Description
-
- <!-- Provide a longer summary of what this model is. -->
-
-
-
- - **Developed by:** [More Information Needed]
- - **Funded by [optional]:** [More Information Needed]
- - **Shared by [optional]:** [More Information Needed]
- - **Model type:** [More Information Needed]
- - **Language(s) (NLP):** [More Information Needed]
- - **License:** [More Information Needed]
- - **Finetuned from model [optional]:** [More Information Needed]
-
- ### Model Sources [optional]
-
- <!-- Provide the basic links for the model. -->
-
- - **Repository:** [More Information Needed]
- - **Paper [optional]:** [More Information Needed]
- - **Demo [optional]:** [More Information Needed]
-
- ## Uses
-
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-
- ### Direct Use
-
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-
- [More Information Needed]
-
- ### Downstream Use [optional]
-
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-
- [More Information Needed]
-
- ### Out-of-Scope Use
-
- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-
- [More Information Needed]
-
- ## Bias, Risks, and Limitations

- <!-- This section is meant to convey both technical and sociotechnical limitations. -->

- [More Information Needed]

- ### Recommendations

- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

- ## How to Get Started with the Model
-
- Use the code below to get started with the model.
-
- [More Information Needed]
-
- ## Training Details
-
- ### Training Data
-
- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-
- [More Information Needed]
-
- ### Training Procedure
-
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-
- #### Preprocessing [optional]
-
- [More Information Needed]
-
-
- #### Training Hyperparameters
-
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-
- #### Speeds, Sizes, Times [optional]

- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

- [More Information Needed]

- ## Evaluation

- <!-- This section describes the evaluation protocols and provides the results. -->

- ### Testing Data, Factors & Metrics

- #### Testing Data

- <!-- This should link to a Dataset Card if possible. -->

- [More Information Needed]

- #### Factors

- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->

- [More Information Needed]

- #### Metrics

- <!-- These are the evaluation metrics being used, ideally with a description of why. -->

- [More Information Needed]

- ### Results

- [More Information Needed]

- #### Summary



- ## Model Examination [optional]

- <!-- Relevant interpretability work for the model goes here -->

- [More Information Needed]

- ## Environmental Impact

- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- - **Hardware Type:** [More Information Needed]
- - **Hours used:** [More Information Needed]
- - **Cloud Provider:** [More Information Needed]
- - **Compute Region:** [More Information Needed]
- - **Carbon Emitted:** [More Information Needed]

- ## Technical Specifications [optional]

- ### Model Architecture and Objective

- [More Information Needed]

- ### Compute Infrastructure

- [More Information Needed]

- #### Hardware

- [More Information Needed]

- #### Software

- [More Information Needed]

- ## Citation [optional]

- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

- **BibTeX:**

- [More Information Needed]

- **APA:**

- [More Information Needed]

- ## Glossary [optional]

- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->

- [More Information Needed]

- ## More Information [optional]

- [More Information Needed]

- ## Model Card Authors [optional]

- [More Information Needed]

- ## Model Card Contact

- [More Information Needed]
- ### Framework versions

- - PEFT 0.16.0

  ---
+ language:
+ - en
+ license: apache-2.0
  library_name: peft
  tags:
+ - text-generation
+ - dialogue
+ - gricean-maxims
+ - cooperative-communication
+ - lora
+ - dpo
+ - direct-preference-optimization
+ - peft
+ - gpt2
+ - nlp
+ datasets:
+ - topical_chat
+ metrics:
+ - cooperative_rate
+ pipeline_tag: text-generation
+ base_model: openai-community/gpt2-medium
  ---

+ <div align="center">

+ # ⚡ GriceBench-DPO

+ **A GPT-2-medium model trained with Direct Preference Optimization to generate cooperative dialogue responses.**

+ [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
+ [![PEFT](https://img.shields.io/badge/🤗-PEFT%20LoRA-yellow)](https://huggingface.co/docs/peft)

+ Part of the **GriceBench** system: [GitHub](https://github.com/PushkarPrabhath27/Research-Model) |
+ [🔍 Detector](https://huggingface.co/Pushkar27/GriceBench-Detector) |
+ [🔧 Repair Model](https://huggingface.co/Pushkar27/GriceBench-Repair)

+ </div>

+ ---

+ ## What This Model Does

+ GriceBench-DPO is a LoRA-adapted GPT-2-medium model fine-tuned with Direct Preference
+ Optimization (DPO) to generate dialogue responses that comply with Gricean conversational
+ maxims. It is the **first stage** of the GriceBench pipeline, producing responses that
+ are more likely to be cooperative before any post-generation repair is applied.

+ **Standalone cooperative rate: 83.2%** (vs. 83.8% for the un-tuned GPT-2-medium baseline)

+ When used as part of the full GriceBench pipeline (this model → Detector → Repair):
+ **Full system cooperative rate: 95.0%**, outperforming Mistral-7B (89.1%) and
+ Qwen2.5-7B (84.2%).

+ > **Why is standalone DPO only 83.2%?** DPO sharply reduces Relation violations
+ > (61% → 10%) but cannot address Manner violations, which require targeted repair.
+ > The 95% figure requires the full pipeline. See the Performance section for details.

+ ---

+ ## Quick Start
+
+ ```python
+ from peft import PeftModel, PeftConfig
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ # Load LoRA adapter on top of GPT-2-medium
+ adapter_path = "Pushkar27/GriceBench-DPO"
+ config = PeftConfig.from_pretrained(adapter_path)
+
+ print(f"Base model: {config.base_model_name_or_path}")
+ # Base model: openai-community/gpt2-medium
+
+ tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
+ base_model = AutoModelForCausalLM.from_pretrained(
+     config.base_model_name_or_path,
+     torch_dtype=torch.float32,
+ )
+ model = PeftModel.from_pretrained(base_model, adapter_path)
+ model.eval()
+
+ def generate_cooperative_response(context: str, max_new_tokens: int = 80) -> str:
+     """
+     Generate a cooperative dialogue response.
+
+     For best results, pass the output through the GriceBench-Detector
+     and GriceBench-Repair models to catch any remaining violations.
+     """
+     prompt = f"Context: {context}\nResponse:"
+     inputs = tokenizer(prompt, return_tensors="pt")
+
+     with torch.no_grad():
+         output_ids = model.generate(
+             **inputs,
+             max_new_tokens=max_new_tokens,
+             do_sample=True,
+             temperature=0.85,
+             top_p=0.92,
+             repetition_penalty=1.3,
+             pad_token_id=tokenizer.eos_token_id,
+         )
+
+     # Decode only the newly generated tokens
+     generated = output_ids[0][inputs["input_ids"].shape[1]:]
+     return tokenizer.decode(generated, skip_special_tokens=True).strip()
+
+
+ # ── Example ────────────────────────────────────────────────────────────────
+ context = "What do you think about the history of jazz music in New Orleans?"
+ response = generate_cooperative_response(context)
+ print(f"Generated: {response}")
+ ```

+ ---

+ ## Full Pipeline Usage (Recommended)

+ For the best results (95.0% cooperative rate), use the full pipeline:

+ ```python
+ # Full GriceBench pipeline: Generate → Detect → Repair
+ # Reuses `generate_cooperative_response` and `context` from Quick Start.
+ # `detect_violations`, `repair_violation`, and `evidence` are not defined in this
+ # card; see the Detector model card and the GitHub repository.
+ from peft import PeftModel, PeftConfig
+ from transformers import AutoModelForCausalLM, AutoTokenizer, T5ForConditionalGeneration, T5Tokenizer
+ import torch

+ # Step 1: Generate with DPO model
+ response = generate_cooperative_response(context)

+ # Step 2: Detect violations
+ # (see GriceBench-Detector model card for detection code)
+ violations = detect_violations(context, response, evidence)

+ # Step 3: Repair any violations found (this loop skips Relation)
+ for maxim, violated in violations["violations"].items():
+     if violated and maxim != "relation":
+         response = repair_violation(context, response, maxim)

+ # Result: response after detection and repair (full system: 95.0% cooperative rate)
+ print(f"Final cooperative response: {response}")
+ ```

+ See the [GitHub repository](https://github.com/PushkarPrabhath27/Research-Model) for the
+ complete pipeline implementation.
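
The control flow of the Generate → Detect → Repair loop can be sketched without loading any models. This is a hedged illustration only: `run_pipeline` and the three `stub_*` functions are hypothetical stand-ins, not the GriceBench implementation, and the detector output shape (`{"violations": {maxim: bool}}`) is assumed from the snippet above.

```python
from typing import Callable, Dict, Optional, Sequence

def run_pipeline(
    context: str,
    generate: Callable[[str], str],
    detect: Callable[[str, str, Optional[str]], Dict[str, Dict[str, bool]]],
    repair: Callable[[str, str, str], str],
    evidence: Optional[str] = None,
    skip_maxims: Sequence[str] = ("relation",),
) -> str:
    """Generate a response, detect violations once, then repair each flagged maxim."""
    response = generate(context)
    report = detect(context, response, evidence)
    for maxim, violated in report["violations"].items():
        # Mirrors the card's loop: Relation is left to the generator, not repaired.
        if violated and maxim not in skip_maxims:
            response = repair(context, response, maxim)
    return response

# Hypothetical stubs standing in for the DPO generator, Detector, and Repair models.
def stub_generate(context: str) -> str:
    return "um, jazz started in New Orleans"

def stub_detect(context: str, response: str, evidence: Optional[str]) -> Dict[str, Dict[str, bool]]:
    return {"violations": {
        "quantity": False,
        "quality": False,
        "relation": False,
        "manner": response.startswith("um"),
    }}

def stub_repair(context: str, response: str, maxim: str) -> str:
    cleaned = response[len("um, "):]
    return cleaned[0].upper() + cleaned[1:]

final = run_pipeline("Tell me about jazz.", stub_generate, stub_detect, stub_repair)
print(final)
```

Passing the three stages in as callables keeps the loop testable in isolation; the real models can be swapped in once they are loaded.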

+ ---

+ ## Performance

+ ### System-Level Results (Full Ablation Study, N=100 examples each)

+ | Configuration | Cooperative Rate | vs. Baseline |
+ |---------------|-----------------|--------------|
+ | Baseline (GPT-2-medium, no tuning) | 83.8% | n/a |
+ | **DPO Only** (this model, no repair) | **83.2%** | −0.6pp |
+ | Detect + Repair (no DPO) | 93.0% | +9.2pp |
+ | **Full System** (DPO + Detect + Repair) | **95.0%** | **+11.2pp** |

+ ### Per-Maxim Violation Rates (DPO Only vs. Baseline)

+ | Maxim | Baseline Rate | DPO Rate | Change |
+ |-------|--------------|----------|--------|
+ | Quantity | 3.0% | 3.0% | 0pp |
+ | Quality | 0.0% | 0.0% | 0pp |
+ | Relation | 62.0% | ~10.0% | **−52pp** ✅ |
+ | Manner | 62.0% | 64.0% | +2pp ⚠️ |

+ DPO sharply reduces Relation violations but does not address Manner violations.
+ This is why the full pipeline (adding Repair) is essential.
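
The "vs. Baseline" and "Change" columns are plain percentage-point differences. As a quick check, the short sketch below recomputes them from the rates in the two tables; the dictionary layout and the helper name `pp_change` are illustrative choices, not part of the GriceBench code.

```python
# Cooperative rates (%) copied from the system-level table.
cooperative = {
    "Baseline": 83.8,
    "DPO Only": 83.2,
    "Detect + Repair": 93.0,
    "Full System": 95.0,
}

# Per-maxim violation rates (%) as (baseline, DPO) pairs; Relation's DPO rate is approximate.
violations = {
    "Quantity": (3.0, 3.0),
    "Quality": (0.0, 0.0),
    "Relation": (62.0, 10.0),
    "Manner": (62.0, 64.0),
}

def pp_change(rate: float, baseline: float) -> float:
    """Percentage-point difference, rounded to one decimal place."""
    return round(rate - baseline, 1)

system_deltas = {name: pp_change(rate, cooperative["Baseline"]) for name, rate in cooperative.items()}
maxim_deltas = {maxim: pp_change(dpo, base) for maxim, (base, dpo) in violations.items()}

for name, delta in system_deltas.items():
    print(f"{name}: {delta:+.1f}pp")
for maxim, delta in maxim_deltas.items():
    print(f"{maxim}: {delta:+.1f}pp")
```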

+ ### DPO Training Metrics

+ | Metric | Value |
+ |--------|-------|
+ | Eval loss | 0.5595 |
+ | Preference accuracy | 75.0% |
+ | Reward margin | 2.69 |
+ | Training time | ~24 minutes (Kaggle P100) |

+ ---

+ ## Model Architecture & Training

+ ### LoRA Configuration

+ | Parameter | Value |
+ |-----------|-------|
+ | Base model | openai-community/gpt2-medium (355M params) |
+ | LoRA rank (r) | 128 |
+ | LoRA alpha (α) | 256 |
+ | Trainable params | ~12MB adapter |
+ | Target modules | q, k, v, o attention projections |
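
To make the rank and alpha values concrete, here is a small sketch of two standard LoRA quantities: the scaling factor α / r applied to the low-rank update, and the number of parameters one adapter pair adds to a single weight matrix. It assumes the standard (non-rank-stabilized) scaling and, for illustration only, a square 1024 × 1024 projection (1024 is GPT-2-medium's hidden size). The helper names are hypothetical, and the sketch makes no claim about the adapter's total size.

```python
def lora_scaling(alpha: float, r: int) -> float:
    """Standard LoRA scaling factor applied to the low-rank update B @ A."""
    return alpha / r

def lora_params_per_matrix(d_in: int, d_out: int, r: int) -> int:
    """Parameters added by one LoRA pair: A has shape (r, d_in), B has shape (d_out, r)."""
    return r * (d_in + d_out)

scaling = lora_scaling(alpha=256, r=128)

# Assumption for illustration: one square projection of width 1024.
params = lora_params_per_matrix(d_in=1024, d_out=1024, r=128)

print(f"scaling = {scaling}, extra params for one 1024x1024 matrix = {params:,}")
```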

+ ### DPO Training

+ **Method:** Direct Preference Optimization (DPO) trains from preference pairs
+ without a separate reward model. The loss function is:

+ $$\mathcal{L}_{\text{DPO}} = -\log\sigma\left(\beta\left[\log\frac{\pi_\theta(y_w|x)}{\pi_{\text{ref}}(y_w|x)} - \log\frac{\pi_\theta(y_l|x)}{\pi_{\text{ref}}(y_l|x)}\right]\right)$$

+ where $y_w$ is the cooperative ("won") response and $y_l$ is the violating ("lost") response,
+ $\pi_\theta$ is the policy being trained, $\pi_{\text{ref}}$ is the frozen reference model,
+ $\sigma$ is the logistic function, and $\beta$ scales the log-ratio margin.
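
The loss above can be evaluated for a single preference pair once the four sequence log-probabilities are known. The following is a minimal sketch in plain Python; the function and argument names are illustrative and are not taken from the GriceBench training code, which may compute the loss differently (for example, batched in PyTorch).

```python
import math

def dpo_loss(
    policy_logp_w: float,
    policy_logp_l: float,
    ref_logp_w: float,
    ref_logp_l: float,
    beta: float = 0.1,
) -> float:
    """Per-pair DPO loss, -log(sigmoid(margin)), from sequence log-probabilities."""
    # margin = beta * [log(pi_theta(y_w)/pi_ref(y_w)) - log(pi_theta(y_l)/pi_ref(y_l))]
    margin = beta * ((policy_logp_w - ref_logp_w) - (policy_logp_l - ref_logp_l))
    # -log(sigmoid(m)) = log(1 + exp(-m)); branch on the sign to avoid overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

# A policy identical to the reference has zero margin, so the loss is log(2).
loss_at_init = dpo_loss(-10.0, -12.0, -10.0, -12.0)

# Raising the chosen response's log-probability and lowering the rejected one's reduces the loss.
loss_after_shift = dpo_loss(-8.0, -14.0, -10.0, -12.0)

print(loss_at_init, loss_after_shift)
```

A policy that matches the reference scores log 2 ≈ 0.693 per pair, which is a useful yardstick when reading an evaluation loss.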

+ | Hyperparameter | Value |
+ |----------------|-------|
+ | DPO β | 0.1 |
+ | Learning rate | 5e-7 |
+ | Batch size | 16 (effective, grad accum ×8) |
+ | Epochs | 3 |
+ | Training pairs | 1,970 filtered preference pairs |
+ | Hardware | Kaggle P100-16GB |
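
The batch-size row implies some simple arithmetic, sketched below. The assumptions are mine rather than the card's: a single device, all 1,970 pairs used for training, and the final partial batch kept. If any of these differ in the actual run, the step counts change accordingly.

```python
import math

effective_batch = 16     # from the table
grad_accum_steps = 8     # from the table
train_pairs = 1970       # from the table
epochs = 3               # from the table

# Per-step micro-batch on one device (assumption: single GPU).
micro_batch = effective_batch // grad_accum_steps

# Optimizer steps per epoch, keeping the last partial batch (assumption).
steps_per_epoch = math.ceil(train_pairs / effective_batch)
total_steps = steps_per_epoch * epochs

print(micro_batch, steps_per_epoch, total_steps)
```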

+ ### Training Data

+ Preference pairs come from three sources:

+ | Source | Pairs | Description |
+ |--------|-------|-------------|
+ | Human-labeled | 411 | Expert-verified cooperative/violating pairs |
+ | Repair-derived | ~1,200 | (original_violation, T5-repaired) pairs |
+ | Synthetic (LLM) | ~1,200 | Generated via Groq API (llama-3.3-70b-versatile) |

+ A conflict-detection filter removed pairs where the "chosen" response scored
+ as more violating than the "rejected" response. Final: **1,970 clean pairs**.
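
One way such a conflict filter could look is sketched below. It is an assumption-laden illustration: the pair schema (`prompt`, `chosen`, `rejected`), the `violation_score` callable, and the toy scorer are all hypothetical, and the actual filter presumably scores responses with the GriceBench detector rather than a word count.

```python
from typing import Callable, Dict, List

Pair = Dict[str, str]

def filter_conflicting_pairs(
    pairs: List[Pair],
    violation_score: Callable[[str, str], float],
) -> List[Pair]:
    """Keep only pairs whose chosen response is not more violating than the rejected one."""
    kept = []
    for pair in pairs:
        chosen_score = violation_score(pair["prompt"], pair["chosen"])
        rejected_score = violation_score(pair["prompt"], pair["rejected"])
        # Drop the pair when the label conflicts with the scores.
        if chosen_score <= rejected_score:
            kept.append(pair)
    return kept

def toy_score(prompt: str, response: str) -> float:
    """Toy stand-in for a detector: counts filler words as 'violations'."""
    return float(sum(word.strip(",.") == "um" for word in response.lower().split()))

pairs = [
    {"prompt": "Favorite jazz era?", "chosen": "Bebop, for its energy.", "rejected": "Um, um, music."},
    {"prompt": "Favorite jazz era?", "chosen": "Um, um, um, dunno.", "rejected": "Swing."},
]
clean = filter_conflicting_pairs(pairs, toy_score)
print(len(clean))
```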

+ ---

+ ## Files in This Repository

+ | File | Description |
+ |------|-------------|
+ | `adapter_config.json` | LoRA configuration (base model, rank, alpha) |
+ | `adapter_model.safetensors` | LoRA weights (25 MB) |
+ | `tokenizer.json` | GPT-2 tokenizer |
+ | `tokenizer_config.json` | Tokenizer configuration |
+ | `special_tokens_map.json` | Special token mappings |

+ ---

+ ## Limitations

+ - **Manner violations persist:** DPO alone does not reduce the Manner violation rate.
+   The full pipeline (with GriceBench-Repair) is required to address Manner.
+ - **Single domain:** Trained and evaluated on Topical-Chat. Performance on other
+   dialogue domains (task-oriented, medical, legal) is not characterized.
+ - **English only:** The system is trained exclusively on English dialogue.
+ - **Standalone cooperative rate (83.2%) is not the headline number:**
+   The 95.0% cooperative rate requires the full pipeline. Using this model
+   alone will not reproduce the system-level result.

+ ---

+ ## Citation

+ ```bibtex
+ @article{prabhath2026gricebench,
+   title={GriceBench: Operationalizing Gricean Maxims for Cooperative Dialogue Evaluation and Generation},
+   author={Prabhath, Pushkar},
+   year={2026}
+ }
+ ```

+ ---

+ ## Related Models

+ | Model | Role | Link |
+ |-------|------|------|
+ | GriceBench-Detector | Detects which maxim is violated | [🔍 Detector](https://huggingface.co/Pushkar27/GriceBench-Detector) |
+ | GriceBench-Repair | Repairs violations | [🔧 Repair Model](https://huggingface.co/Pushkar27/GriceBench-Repair) |
+ | GriceBench-DPO | Generates cooperative responses (this model) | You are here |

+ **GitHub:** https://github.com/PushkarPrabhath27/Research-Model