File size: 1,758 Bytes
2af48cd
 
 
 
 
 
 
 
 
 
 
 
cd4a477
 
be2e047
cd4a477
be2e047
cd4a477
be2e047
cd4a477
be2e047
 
 
cd4a477
be2e047
cd4a477
be2e047
cd4a477
be2e047
 
 
cd4a477
be2e047
cd4a477
be2e047
cd4a477
be2e047
cd4a477
be2e047
 
 
cd4a477
 
 
 
be2e047
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
---
title: Dialectic Reasoning
emoji: 😻
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.10.0
python_version: '3.12'
app_file: app.py
pinned: false
---

# Dialectic Reasoning

Interactive demo for the **dialectic LoRA model family**, fine-tuned to identify genuine tensions, make conditional commitments, and reach integrative resolutions instead of hedging.

## Current Best: 4B v3

The strongest model in the family is the **[Qwen3-4B v3 LoRA](https://huggingface.co/hikewa/dialectic-qwen3-4b-v3-lora)**:

- Trained on **507 examples** (408 original + 99 domain-diverse traces from 3 model families)
- Rubric avg: **9.8/10** — all 14 held-out prompts score "strong"
- generic_hedge: **0.00** (eliminated)

The earlier 8B model (6.6/10 on 408 traces) demonstrated that data diversity matters more than model size.

## What This Demo Shows

- **Crux identification** — finding the real decision point
- **Conditional commitment** — "if X, then Y; if Z, then W"
- **Integrative resolution** — not "both sides have merit" but concrete synthesis

This is **not** a balanced conversation bot. It is a demo of a specific trained capability.

## Evidence

For methodology and evaluation:

- Best model: [hikewa/dialectic-qwen3-4b-v3-lora](https://huggingface.co/hikewa/dialectic-qwen3-4b-v3-lora)
- 8B model: [hikewa/dialectic-qwen3-8b-lora](https://huggingface.co/hikewa/dialectic-qwen3-8b-lora)
- Dataset + eval artifacts: [hikewa/dialectic-reasoning-traces](https://huggingface.co/datasets/hikewa/dialectic-reasoning-traces)

## Limitations

- The Space is a demo wrapper, not a research paper
- Training data is synthetic (multi-model generated)
- English-only
- Stronger evidence comes from held-out evaluation, not from chat impressions