skatzR committed on
Commit 2a9da9a · verified · 1 Parent(s): 36851e6

Update README.md

Files changed (1):
  1. README.md +110 -77
README.md CHANGED
@@ -11,12 +11,12 @@ tags:
 - AI-Safety
 - Evaluation
 - Judge-model
-
 ---

- [![Hugging Face](https://img.shields.io/badge/Hugging%20Face-model-blue)](https://huggingface.co/skatzR/RQA-X1)

- # 🧠 RQA — Reasoning Quality Analyzer (v1)

 **RQA** is a **judge model** designed to evaluate the *quality of reasoning in text*.
 It does **not** generate, rewrite, or explain content — instead, it **assesses whether a text contains logical problems**, and if so, **what kind**.
@@ -27,19 +27,19 @@ It does **not** generate, rewrite, or explain content — instead, it **assesses

 ## 🔍 What Problem Does RQA Solve?

- Modern LLM-generated and human-written texts often:

 - sound coherent,
 - use correct vocabulary,
- - follow a plausible narrative,

 …but still contain **logical problems** that are:

- - subtle,
- - hidden in structure,
- - difficult to detect with standard classifiers.

- **RQA focuses specifically on reasoning quality**, not style or factual correctness.

 ---

@@ -52,23 +52,24 @@ Modern LLM-generated and human-written texts often:
 | **Pooling** | Mean pooling |
 | **Heads** | 2 (binary + multi-label) |
 | **Language** | Russian 🇷🇺 |
- | **License** | Mit |

 ---

 ## 🧠 What the Model Predicts

- RQA produces **two independent outputs**:

- ### 1️⃣ Logical Issue Detection

- - **Binary decision**
-   `has_logical_issue ∈ {0, 1}`
- - Calibrated probability is provided

- ### 2️⃣ Error Type Classification (Multi-label)

- If a logical issue exists, the model can identify one or more of the following error types:

 - `false_causality`
 - `unsupported_claim`
@@ -77,35 +78,67 @@ If a logical issue exists, the model can identify one or more of the following e
 - `contradiction`
 - `circular_reasoning`

- > Error classification is applied **only if a logical issue is detected**.

 ---

- ## 🧠 Hidden Logical Problems (Key Concept)

 RQA explicitly distinguishes between:

- - **Explicit logical errors**
-   (clearly identifiable fallacies)

- - **Hidden logical problems**
-   (structural issues such as:
-   - implicit assumptions,
-   - shifts of criteria,
-   - persuasive but unsupported reasoning)

- Hidden problems are **not labeling mistakes** — they are a **separate, intentional difficulty class**.

 ---

 ## 🏗️ Architecture Details

 - **Encoder**: XLM-RoBERTa Large (pretrained weights preserved)
- - **Pooling**: Mean pooling (more stable than CLS for long texts)
- - **Two independent heads**:
-   - Binary head: `has_logical_issue`
-   - Multi-label head: `error_types`
- - **Separate projections and dropout** to reduce negative transfer

 ---
@@ -113,32 +146,34 @@ Hidden problems are **not labeling mistakes** — they are a **separate, intenti

 ### 🔒 Strict Data Contract

- - Logical texts **cannot** contain errors
- - Hidden problems **cannot** contain explicit error labels
- - Invalid samples are **removed**, never auto-fixed

 ### ⚖️ Balanced Difficulty

- - Hidden problems ≤ **30%** of all problematic texts
-   (`hidden / (explicit + hidden) ≤ 0.3`)

 ### 🎯 Loss Design

- - Binary cross-entropy for issue detection
 - Masked multi-label loss for error types
- - **Uncertainty-weighted loss** for stable multi-task training

 ---

 ## 🌡️ Confidence Calibration

- RQA uses **post-hoc Temperature Scaling**:

 - Separate calibration for:
-   - `has_logical_issue`
   - each error type
- - Ensures predicted probabilities reflect real confidence
- - Enables safe thresholding in production

 ---
@@ -149,15 +184,16 @@ RQA uses **post-hoc Temperature Scaling**:
 - Reasoning quality evaluation
 - LLM output auditing
 - AI safety pipelines
- - Educational or analytical tooling
- - Pre-filtering or routing in generation systems

 ### ❌ Not intended for:

 - Text generation
- - Explanation or correction of errors
- - Style or grammar analysis
- - Factual verification

 ---
@@ -166,34 +202,35 @@ RQA uses **post-hoc Temperature Scaling**:
 - Conservative by design
 - Optimized for **low false positives**
 - Explicitly robust to:
-   - topic changes,
-   - writing style,
   - emotional tone

- The model judges **logic**, not rhetoric.

 ---

- ## 📦 Output Example

 ```json
 {
-   "has_logical_issue": true,
-   "has_issue_probability": 0.87,
   "errors": [
-     { "type": "missing_premise", "probability": 0.72 },
-     { "type": "overgeneralization", "probability": 0.61 }
-   ]
 }
 ```
 ---

 ## 📚 Training Data (High-level)

- - **Custom-generated dataset**
 - **Thousands of long-form argumentative texts**
- - **Multiple domains and reasoning modes**
- - **Carefully controlled balance of:**
   - logical texts
   - explicit errors
   - hidden problems
@@ -204,10 +241,10 @@ The model judges **logic**, not rhetoric.

 ## ⚠️ Limitations

- - RQA evaluates **reasoning structure**, not factual truth
- - A logically valid argument may still be **factually incorrect**
- - Subtle philosophical disagreements are **not always logical errors**
- - The model may over-detect issues in highly rhetorical or persuasive texts.

 ---
@@ -216,19 +253,17 @@ The model judges **logic**, not rhetoric.
 > **Good reasoning is not about sounding convincing —
 > it is about what actually follows from what.**

- RQA is built to reflect this principle.

 ---

 ## 🔧 Implementation Details

- This model uses a custom Hugging Face architecture (`modeling_rqa.py`)
- and is loaded with:
-
- - `trust_remote_code=True`
- - `safetensors` weights (no `.bin` file)
-
- This is expected and fully supported by Hugging Face.

 ---
@@ -238,12 +273,12 @@ This is expected and fully supported by Hugging Face.
 from transformers import AutoTokenizer, AutoModel

 tokenizer = AutoTokenizer.from_pretrained(
-     "USERNAME/RQA-v1",
     trust_remote_code=True
 )

 model = AutoModel.from_pretrained(
-     "USERNAME/RQA-v1",
     trust_remote_code=True
 )
@@ -256,8 +291,6 @@ errors_logits = outputs["errors_logits"]

 ---

- ## 📜 License

 MIT
-
- ---

 - AI-Safety
 - Evaluation
 - Judge-model
+ - Argumentation
 ---

+ [![Hugging Face](https://img.shields.io/badge/Hugging%20Face-model-blue)](https://huggingface.co/skatzR/RQA-X1.1)

+ # 🧠 RQA — Reasoning Quality Analyzer (X1.1)

 **RQA** is a **judge model** designed to evaluate the *quality of reasoning in text*.
 It does **not** generate, rewrite, or explain content — instead, it **assesses whether a text contains logical problems**, and if so, **what kind**.
 

 ## 🔍 What Problem Does RQA Solve?

+ Texts written by humans or LLMs can:

 - sound coherent,
 - use correct vocabulary,
+ - appear persuasive,

 …but still contain **logical problems** that are:

+ - implicit,
+ - structural,
+ - hidden in argumentation.

+ **RQA focuses strictly on reasoning quality**, not on style, sentiment, or factual correctness.

 ---
 
 
 | **Pooling** | Mean pooling |
 | **Heads** | 2 (binary + multi-label) |
 | **Language** | Russian 🇷🇺 |
+ | **License** | MIT |

 ---
 
 ## 🧠 What the Model Predicts

+ RQA produces **two independent signals** that are combined at inference time:

+ ### 1️⃣ Logical Issue Detection (Binary)

+ - `has_issue ∈ {false, true}`
+ - Calibrated probability available
+ - Designed to answer:
+   **“Does this text contain a reasoning problem?”**

+ ### 2️⃣ Error Type Signals (Multi-label)

+ The model estimates probabilities for specific error types:

 - `false_causality`
 - `unsupported_claim`
 
 - `contradiction`
 - `circular_reasoning`

+ ⚠️ **Important**
+ Error type probabilities are **diagnostic signals**, not mandatory labels.
+ They are surfaced **only if `has_issue == true`** during inference.

 ---
 
+ ## 🟡 Hidden Logical Problems (Key Concept)

 RQA explicitly distinguishes between:

+ ### 🔴 Explicit Logical Errors
+ Clearly identifiable fallacies:
+ - invalid causal inference
+ - circular reasoning
+ - contradictions
+ - unsupported claims

+ ### 🟡 Hidden Logical Problems
+ Texts that are:
+ - argumentative or persuasive,
+ - structurally incomplete,
+ - reliant on implicit assumptions,
+
+ but **do not contain a cleanly classifiable fallacy**.
+
+ Examples:
+ - missing or unstated premises
+ - rhetorical generalizations
+ - context-dependent claims
+
+ Hidden problems are **not misclassifications** —
+ they are an **intended diagnostic category**.
+
+ ---
+
+ ## ⚖️ Inference Logic (Important)
+
+ The model applies **decision logic on top of the raw logits**:
+
+ - The binary head decides **whether a problem exists**
+ - The error heads provide **type-level evidence**
+ - If `has_issue == false` but error probabilities are non-zero,
+   the text may be flagged as **borderline** or a **hidden problem**
+
+ This prevents:
+ - false-positive error labels,
+ - incoherent outputs,
+ - over-triggering on clean factual texts.
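
As a concrete illustration, the routing rules above can be sketched in plain Python. The threshold values and field names here are illustrative assumptions, not the model's published API:

```python
# Illustrative sketch of the decision logic above.
# Thresholds (0.5 / 0.3) and field names are assumptions, not the real API.
def decide(issue_prob, error_probs, issue_threshold=0.5, evidence_threshold=0.3):
    """Combine the binary head's probability with type-level evidence."""
    result = {
        "has_issue": issue_prob >= issue_threshold,
        "issue_probability": issue_prob,
        "errors": [],
        "borderline": False,
    }
    if result["has_issue"]:
        # Error types are surfaced only when the binary head fires.
        result["errors"] = [
            {"type": t, "probability": p}
            for t, p in sorted(error_probs.items(), key=lambda kv: -kv[1])
            if p >= evidence_threshold
        ]
    elif any(p >= evidence_threshold for p in error_probs.values()):
        # Binary head says "clean" but the error heads see evidence:
        # flag as borderline rather than emitting error labels.
        result["borderline"] = True
    return result
```

With a high issue probability and one strong error signal, this produces the same shape as the example output shown further down in the README.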
 
 ---

 ## 🏗️ Architecture Details

 - **Encoder**: XLM-RoBERTa Large (pretrained weights preserved)
+ - **Pooling**: Mean pooling (robust for long texts)
+ - **Two independent projections**:
+   - binary reasoning head
+   - multi-label error head
+ - Separate dropout and projections to reduce negative transfer
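
For intuition, the head layout can be sketched in PyTorch. The hidden size, number of error types, and dropout rate below are illustrative assumptions, not the released configuration:

```python
import torch
import torch.nn as nn

class RQAHeads(nn.Module):
    """Sketch of two independent heads on mean-pooled encoder states.
    Sizes (1024 hidden, 6 error types) are assumptions for illustration."""
    def __init__(self, hidden_size=1024, num_error_types=6, dropout=0.1):
        super().__init__()
        # Separate projection + dropout per head to reduce negative transfer.
        self.binary_head = nn.Sequential(
            nn.Dropout(dropout), nn.Linear(hidden_size, 1)
        )
        self.error_head = nn.Sequential(
            nn.Dropout(dropout), nn.Linear(hidden_size, num_error_types)
        )

    @staticmethod
    def mean_pool(hidden_states, attention_mask):
        # Average token embeddings, ignoring padding positions.
        mask = attention_mask.unsqueeze(-1).float()
        return (hidden_states * mask).sum(1) / mask.sum(1).clamp(min=1e-9)

    def forward(self, hidden_states, attention_mask):
        pooled = self.mean_pool(hidden_states, attention_mask)
        return {
            "has_issue_logit": self.binary_head(pooled).squeeze(-1),
            "errors_logits": self.error_head(pooled),
        }
```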

 ---
 
 
 ### 🔒 Strict Data Contract

+ - Logical texts **contain no errors**
+ - Hidden-problem texts **contain no explicit fallacies**
+ - Invalid samples are **removed**, not auto-corrected

 ### ⚖️ Balanced Difficulty

+ - Hidden problems ≤ **30%** of problematic texts
+ - Prevents collapse into vague uncertainty detection

 ### 🎯 Loss Design

+ - Binary BCE for issue detection
 - Masked multi-label loss for error types
+ - Stability-oriented multi-task optimization
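
A minimal sketch of such a masked multi-label loss, assuming the mask simply zeroes out error-type supervision for samples without an issue (the exact masking rule used in training is not specified here):

```python
import torch
import torch.nn.functional as F

def rqa_loss(has_issue_logit, errors_logits, has_issue_target, errors_target):
    """Binary BCE plus multi-label BCE, masked to problematic samples only."""
    # Binary head: plain BCE on the has-issue logit.
    binary_loss = F.binary_cross_entropy_with_logits(has_issue_logit, has_issue_target)
    # Multi-label head: per-label BCE, zeroed for clean samples.
    mask = has_issue_target.unsqueeze(-1)                   # (batch, 1)
    per_label = F.binary_cross_entropy_with_logits(
        errors_logits, errors_target, reduction="none"      # (batch, labels)
    )
    denom = (mask.sum() * errors_logits.size(1)).clamp(min=1.0)
    return binary_loss + (per_label * mask).sum() / denom
```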

 ---

 ## 🌡️ Confidence Calibration

+ RQA applies **post-hoc temperature scaling**:

 - Separate calibration for:
+   - `has_issue`
   - each error type
+ - Enables:
+   - meaningful probabilities
+   - safe threshold tuning
+   - production use without retraining
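
Temperature scaling itself is a one-line transform: each head's logit is divided by a temperature fitted on held-out data before the sigmoid. The temperature values below are made up for illustration:

```python
import math

def temperature_scale(logit, temperature):
    """Divide the logit by a fitted temperature, then apply the sigmoid.
    T > 1 softens overconfident probabilities; T = 1 changes nothing."""
    return 1.0 / (1.0 + math.exp(-logit / temperature))

# Hypothetical per-head temperatures fitted on a validation split.
temperatures = {"has_issue": 1.6, "false_causality": 1.3}
calibrated = temperature_scale(2.0, temperatures["has_issue"])
```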

 ---
 
 
 - Reasoning quality evaluation
 - LLM output auditing
 - AI safety pipelines
+ - Argumentation analysis
+ - Pre-filtering / routing systems

 ### ❌ Not intended for:

 - Text generation
+ - Error correction
+ - Explanation or tutoring
+ - Grammar or style analysis
+ - Fact checking

 ---
 
 
 - Conservative by design
 - Optimized for **low false positives**
 - Explicitly robust to:
+   - topic changes
+   - writing style
   - emotional tone

+ RQA judges **logical structure**, not persuasion quality.

 ---
 
+ ## 📦 Example Output

 ```json
 {
+   "has_issue": true,
+   "issue_probability": 0.93,
   "errors": [
+     { "type": "false_causality", "probability": 0.88 }
+   ],
+   "hidden_problem": false,
+   "borderline": false
 }
 ```

 ---

 ## 📚 Training Data (High-level)

+ - **Custom-built dataset**
 - **Thousands of long-form argumentative texts**
+ - **Multiple domains and reasoning styles**
+ - Carefully controlled balance of:
   - logical texts
   - explicit errors
   - hidden problems
 

 ## ⚠️ Limitations

+ - Logical validity ≠ factual correctness
+ - Purely descriptive texts may still trigger *diagnostic signals*
+ - Highly rhetorical or persuasive texts can be flagged as **hidden problems**
+ - Philosophical disagreement is **not always** a logical error

 ---
 
 
 > **Good reasoning is not about sounding convincing —
 > it is about what actually follows from what.**

+ RQA is built around this principle.

 ---
 
 ## 🔧 Implementation Details

+ - Custom Hugging Face architecture (`modeling_rqa.py`)
+ - Requires:
+   - `trust_remote_code=True`
+ - Uses `safetensors`
+ - No `.bin` weights (this is expected behavior)

 ---
 
 
 from transformers import AutoTokenizer, AutoModel

 tokenizer = AutoTokenizer.from_pretrained(
+     "skatzR/RQA-X1.1",
     trust_remote_code=True
 )

 model = AutoModel.from_pretrained(
+     "skatzR/RQA-X1.1",
     trust_remote_code=True
 )
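
The model returns raw logits (e.g. `errors_logits`), so a sigmoid is still needed per label: the multi-label head uses independent sigmoids, not a softmax. The label order below is an assumption for illustration; in practice it should be read from the model config:

```python
import torch

# Label order is assumed for illustration; take it from the model config.
ERROR_TYPES = [
    "false_causality", "unsupported_claim", "missing_premise",
    "overgeneralization", "contradiction", "circular_reasoning",
]

def to_probabilities(errors_logits):
    """Map raw multi-label logits of shape (1, num_labels) to {error_type: probability}."""
    probs = torch.sigmoid(errors_logits).squeeze(0)
    return dict(zip(ERROR_TYPES, probs.tolist()))
```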
 
 

 ---

+ ## 📜 License

 MIT