Spaces:

hikewa
/

dialectic-reasoning

Sleeping

App Files Files Community

Kenny Wang commited on Apr 2

Commit

cd4a477

1 Parent(s): f3e7b7e

Clarify model family and public dataset scope

Browse files

Files changed (1) hide show

README.md +48 -1

README.md CHANGED Viewed

@@ -10,4 +10,51 @@ app_file: app.py
 pinned: false
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

 pinned: false
 ---
+# Dialectic Reasoning
+Interactive demo for the **dialectic LoRA model family**, with the **Qwen3-8B variant** as the primary model.
+This Space is meant to demonstrate a specific capability:
+- better **crux identification**
+- stronger **conditional commitment**
+- deeper **integrative resolution**
+It is **not** just a “balanced conversation” bot and it is **not** intended as evidence by itself. The supporting evaluation artifacts live in the associated dataset/model repos.
+## What This Demo Represents
+The strongest current result in the family is the **8B LoRA**:
+- base model: `Qwen/Qwen3-8B`
+- trained on **408 examples** drawn from a larger **510-trace internal corpus**
+- evaluated on held-out prompts with a rubric focused on real synthesis behavior
+Smaller family members also exist, but they should be treated as exploratory variants rather than equivalent peers.
+## Main Result
+On a held-out rubric evaluation, the fine-tuned 8B model improved substantially over base Qwen3-8B on:
+- **Conditional commitment**
+- **Actionability**
+- **Resolution depth**
+- **Crux clarity**
+It also reduced weak and bad outputs, although generic hedge language is still too common.
+## Read This As A Demo, Not The Whole Claim
+Use the Space to get a feel for the behavior.
+For the actual methodology and published reports, see:
+- model: `hikewa/dialectic-qwen3-8b-lora`
+- dataset + eval artifacts: `hikewa/dialectic-reasoning-traces`
+## Limitations
+- The Space is a demo wrapper, not a research paper
+- Public dataset release is smaller than the full internal corpus used for the 8B model
+- The model can still sound diplomatic or over-general on some prompts
+- Stronger evidence comes from held-out evaluation, not from an isolated chat impression