+---
+license: mit
+base_model:
+  - FacebookAI/xlm-roberta-large
+language:
+  - ru
+tags:
+  - Reasoning
+  - Logical-Analysis
+  - Text-Classification
+  - AI-Safety
+  - Evaluation
+  - Judge-model
+---
+[![Hugging Face](https://img.shields.io/badge/Hugging%20Face-model-blue)](https://huggingface.co/skatzR/RQA-X1)
+# 🧠 RQA — Reasoning Quality Analyzer (v1)
+**RQA** is a **judge model** designed to evaluate the *quality of reasoning in text*.
+It does **not** generate, rewrite, or explain content — instead, it **assesses whether a text contains logical problems**, and if so, **what kind**.
+> **RQA is a judge, not a teacher and not a generator.**
+---
+## 🔍 What Problem Does RQA Solve?
+Modern LLM-generated and human-written texts often:
+- sound coherent,
+- use correct vocabulary,
+- follow a plausible narrative,
+…but still contain **logical problems** that are:
+- subtle,
+- hidden in structure,
+- difficult to detect with standard classifiers.
+**RQA focuses specifically on reasoning quality**, not style or factual correctness.
+---
+## 🧩 Model Overview
+| Property | Value |
+|--------|------|
+| **Model Type** | Judge / Evaluator |
+| **Base Encoder** | XLM-RoBERTa Large |
+| **Pooling** | Mean pooling |
+| **Heads** | 2 (binary + multi-label) |
+| **Language** | Russian 🇷🇺 |
+| **License** | MIT |
+---
+## 🧠 What the Model Predicts
+RQA produces **two independent outputs**:
+### 1️⃣ Logical Issue Detection
+- **Binary decision**
+  `has_logical_issue ∈ {0, 1}`
+- A calibrated probability is returned alongside the decision
+### 2️⃣ Error Type Classification (Multi-label)
+If a logical issue exists, the model can identify one or more of the following error types:
+- `false_causality`
+- `unsupported_claim`
+- `overgeneralization`
+- `missing_premise`
+- `contradiction`
+- `circular_reasoning`
+> Error classification is applied **only if a logical issue is detected**.
+---
+## 🧠 Hidden Logical Problems (Key Concept)
+RQA explicitly distinguishes between:
+- **Explicit logical errors**
+  (clearly identifiable fallacies)
+- **Hidden logical problems**
+  (structural issues such as:
+  - implicit assumptions,
+  - shifts of criteria,
+  - persuasive but unsupported reasoning)
+Hidden problems are **not labeling mistakes** — they are a **separate, intentional difficulty class**.
+---
+## 🏗️ Architecture Details
+- **Encoder**: XLM-RoBERTa Large (pretrained weights preserved)
+- **Pooling**: Mean pooling (more stable than CLS for long texts)
+- **Two independent heads**:
+  - Binary head: `has_logical_issue`
+  - Multi-label head: `error_types`
+- **Separate projections and dropout** to reduce negative transfer
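Mean pooling over a padded batch has to ignore padding positions, otherwise short texts get diluted by pad vectors. The following is a minimal, dependency-free sketch of masked mean pooling for one sequence. The function name `mean_pool` and the toy vectors are illustrative assumptions; the actual `modeling_rqa.py` is not reproduced here and presumably works on tensors.

```python
from typing import List


def mean_pool(token_vectors: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average token vectors, skipping positions where the mask is 0 (padding)."""
    dim = len(token_vectors[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_vectors, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding sequence
    return [t / count for t in total]


# Toy example: two real tokens and one padding token.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

The pooled vector would then feed both heads, each through its own projection and dropout.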
+---
+## 🎓 Training Philosophy
+### 🔒 Strict Data Contract
+- Logical texts **cannot** contain errors
+- Hidden problems **cannot** contain explicit error labels
+- Invalid samples are **removed**, never auto-fixed
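The contract above can be expressed as a filter that drops invalid samples instead of repairing them. This is only a sketch: the field names `kind` and `error_types`, and the three kind values, are hypothetical and not taken from the actual dataset schema.

```python
from typing import Dict, List


def is_valid(sample: Dict) -> bool:
    """Check one sample against the data contract (hypothetical schema)."""
    kind = sample["kind"]            # assumed values: "logical", "explicit", "hidden"
    labels = sample["error_types"]   # assumed: list of explicit error labels
    if kind == "logical":
        return len(labels) == 0      # logical texts cannot contain errors
    if kind == "hidden":
        return len(labels) == 0      # hidden problems carry no explicit labels
    if kind == "explicit":
        return True                  # no extra rule is stated for explicit errors
    return False                     # assumption: unknown kinds are rejected


samples: List[Dict] = [
    {"kind": "logical", "error_types": []},
    {"kind": "logical", "error_types": ["contradiction"]},       # violates the contract
    {"kind": "hidden", "error_types": []},
    {"kind": "hidden", "error_types": ["missing_premise"]},      # violates the contract
    {"kind": "explicit", "error_types": ["false_causality"]},
]

# Invalid samples are removed, never auto-fixed.
clean = [s for s in samples if is_valid(s)]
print(len(clean))
```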
+### ⚖️ Balanced Difficulty
+- Hidden problems ≤ **30%** of all problematic texts
+  (`hidden / (explicit + hidden) ≤ 0.3`)
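To make the ratio concrete: rearranging `hidden / (explicit + hidden) ≤ 0.3` gives `hidden ≤ 0.3 · explicit / 0.7`, so the number of explicit samples fixes how many hidden samples can be kept. The helper names below are illustrative, and exact fractions are used only to avoid floating-point edge cases at the boundary.

```python
import math
from fractions import Fraction

MAX_HIDDEN_SHARE = Fraction(3, 10)  # the 30% cap stated above


def hidden_share(explicit: int, hidden: int) -> Fraction:
    """Share of hidden problems among all problematic texts."""
    total = explicit + hidden
    return Fraction(hidden, total) if total else Fraction(0)


def max_hidden(explicit: int) -> int:
    """Largest hidden count that keeps the share at or below the cap."""
    # hidden / (explicit + hidden) <= r  <=>  hidden <= r * explicit / (1 - r)
    return math.floor(MAX_HIDDEN_SHARE * explicit / (1 - MAX_HIDDEN_SHARE))


print(max_hidden(700))                               # 300
print(hidden_share(700, 300) <= MAX_HIDDEN_SHARE)    # True
```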
+### 🎯 Loss Design
+- Binary cross-entropy for issue detection
+- Masked multi-label loss for error types
+- **Uncertainty-weighted loss** for stable multi-task training
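The card does not spell out the exact loss formulas, so the following is a sketch under assumptions. It assumes the mask selects samples whose error-type labels are defined, and it uses a common simplified form of uncertainty weighting in the style of Kendall, Gal and Cipolla (2018), where each task loss is scaled by `exp(-s)` and regularized by `s`, with `s` a learnable log-variance. All function and parameter names are illustrative.

```python
import math
from typing import List


def bce(prob: float, target: int, eps: float = 1e-7) -> float:
    """Binary cross-entropy for a single probability and 0/1 target."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1 - target) * math.log(1.0 - prob))


def masked_multilabel_loss(
    probs: List[List[float]], targets: List[List[int]], mask: List[int]
) -> float:
    """Mean BCE over error-type labels, counting only samples with mask == 1."""
    total, count = 0.0, 0
    for sample_probs, sample_targets, keep in zip(probs, targets, mask):
        if not keep:
            continue  # masked-out samples contribute nothing
        for p, t in zip(sample_probs, sample_targets):
            total += bce(p, t)
            count += 1
    return total / count if count else 0.0


def uncertainty_weighted(
    loss_binary: float, loss_errors: float, s_binary: float, s_errors: float
) -> float:
    """Combine two task losses using log-variance terms s_binary and s_errors."""
    return (
        math.exp(-s_binary) * loss_binary + s_binary
        + math.exp(-s_errors) * loss_errors + s_errors
    )


# The first sample is masked out, so only the second one is scored.
errors_loss = masked_multilabel_loss(
    probs=[[0.5, 0.5], [0.9, 0.1]],
    targets=[[1, 0], [1, 0]],
    mask=[0, 1],
)
total_loss = uncertainty_weighted(bce(0.8, 1), errors_loss, s_binary=0.0, s_errors=0.0)
```

With both log-variances at zero the combined loss reduces to a plain sum; during training they would be learned jointly with the model.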
+---
+## 🌡️ Confidence Calibration
+RQA uses **post-hoc Temperature Scaling**:
+- Separate calibration for:
+  - `has_logical_issue`
+  - each error type
+- Aims to bring predicted probabilities in line with observed error rates
+- Supports probability thresholding in production
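Temperature scaling divides each logit by a scalar `T > 0` fitted on held-out data, then applies the sigmoid. The sketch below fits one temperature by grid search over the validation negative log-likelihood. The grid, the function names, and the toy validation data are assumptions for illustration; in practice `T` is often fitted with a gradient-based optimizer, and the same procedure would be repeated for `has_logical_issue` and for each error type.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def nll(logits: List[float], labels: List[int], temperature: float) -> float:
    """Mean negative log-likelihood of binary labels after scaling logits by 1/T."""
    eps = 1e-12
    total = 0.0
    for z, y in zip(logits, labels):
        p = min(max(sigmoid(z / temperature), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(logits)


def fit_temperature(logits: List[float], labels: List[int]) -> float:
    """Pick the temperature from a fixed grid that minimizes validation NLL."""
    candidates = [0.05 * k for k in range(1, 201)]  # 0.05 .. 10.0
    return min(candidates, key=lambda t: nll(logits, labels, t))


# Toy validation set: logits of +/-4 claim ~98% confidence, but only 75% are right.
val_logits = [4.0, 4.0, 4.0, 4.0, -4.0, -4.0, -4.0, -4.0]
val_labels = [1, 1, 1, 0, 0, 0, 0, 1]

T = fit_temperature(val_logits, val_labels)
calibrated = sigmoid(4.0 / T)  # confidence after scaling
print(T, calibrated)
```

An overconfident model yields `T > 1`, which pulls probabilities toward 0.5 without changing which side of a 0.5 threshold a prediction falls on.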
+---
+## 🚀 Intended Use
+### ✅ Recommended for:
+- Reasoning quality evaluation
+- LLM output auditing
+- AI safety pipelines
+- Educational or analytical tooling
+- Pre-filtering or routing in generation systems
+### ❌ Not intended for:
+- Text generation
+- Explanation or correction of errors
+- Style or grammar analysis
+- Factual verification
+---
+## 🧪 Model Behavior
+- Conservative by design
+- Optimized for **low false positives**
+- Designed to be robust to:
+  - topic changes,
+  - writing style,
+  - emotional tone
+The model judges **logic**, not rhetoric.
+---
+## 📦 Output Example
+```json
+{
+  "has_logical_issue": true,
+  "has_issue_probability": 0.87,
+  "errors": [
+    { "type": "missing_premise", "probability": 0.72 },
+    { "type": "overgeneralization", "probability": 0.61 }
+  ]
+}
+```
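A report like the one above could be assembled from the two heads' logits roughly as follows. This is a sketch, not the model's own post-processing: the order of `ERROR_TYPES` relative to the logit indices, the 0.5 thresholds, and the function name `build_report` are all assumptions, and calibration is left out for brevity.

```python
import json
import math
from typing import Dict, List

# Assumption: logit index i corresponds to ERROR_TYPES[i].
ERROR_TYPES = [
    "false_causality",
    "unsupported_claim",
    "overgeneralization",
    "missing_premise",
    "contradiction",
    "circular_reasoning",
]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def build_report(
    has_issue_logit: float,
    error_logits: List[float],
    issue_threshold: float = 0.5,   # assumed default
    error_threshold: float = 0.5,   # assumed default
) -> Dict:
    """Turn raw logits into a report; error types are listed only if an issue is detected."""
    p_issue = sigmoid(has_issue_logit)
    has_issue = p_issue >= issue_threshold
    errors = []
    if has_issue:
        for name, logit in zip(ERROR_TYPES, error_logits):
            p = sigmoid(logit)
            if p >= error_threshold:
                errors.append({"type": name, "probability": round(p, 2)})
        errors.sort(key=lambda e: e["probability"], reverse=True)
    return {
        "has_logical_issue": has_issue,
        "has_issue_probability": round(p_issue, 2),
        "errors": errors,
    }


report = build_report(1.9, [-3.0, -2.0, 0.45, 0.94, -4.0, -1.0])
print(json.dumps(report, indent=2))
```

Gating the error list on the binary decision mirrors the rule that error classification is applied only when a logical issue is detected.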
+---
+## 📚 Training Data (High-level)
+- **Custom-generated dataset**
+- **Thousands of long-form argumentative texts**
+- **Multiple domains and reasoning modes**
+- **Carefully controlled balance of:**
+  - logical texts
+  - explicit errors
+  - hidden problems
+> The dataset was designed specifically for **judge behavior**, not for text generation.
+---
+## ⚠️ Limitations
+- RQA evaluates **reasoning structure**, not factual truth
+- A logically valid argument may still be **factually incorrect**
+- Subtle philosophical disagreements are **not always logical errors**
+- Despite its conservative design, the model may over-detect issues in highly rhetorical or persuasive texts
+---
+## 🧩 Philosophy
+> **Good reasoning is not about sounding convincing —
+> it is about what actually follows from what.**
+RQA is built to reflect this principle.
+---
+## 🔧 Implementation Details
+This model uses a custom Hugging Face architecture defined in `modeling_rqa.py`,
+so it is loaded with:
+- `trust_remote_code=True`
+- `safetensors` weights (no `.bin` file)
+Both are expected and supported by the Hugging Face `transformers` library.
+---
+## 🚀 Quick Start
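+```python
+import torch
+from transformers import AutoModel, AutoTokenizer
+
+# Placeholder repository ID; replace with the actual one on the Hub.
+MODEL_ID = "USERNAME/RQA-v1"
+
+# The custom architecture (modeling_rqa.py) requires trust_remote_code=True.
+tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
+model = AutoModel.from_pretrained(MODEL_ID, trust_remote_code=True)
+model.eval()  # disable dropout for inference
+
+inputs = tokenizer("Your text here", return_tensors="pt", truncation=True)
+with torch.no_grad():  # no gradients are needed at inference time
+    outputs = model(**inputs)
+
+# Raw logits from the two heads.
+has_issue_logits = outputs["has_issue_logits"]
+errors_logits = outputs["errors_logits"]
+```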
+---
+## 📜 License
+MIT
+---