---
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- qwen3
- sft
- trl
- dual-mind
- reasoning
- convergent-intelligence
- explore-examine-response
- convergentintel
- edge
- distillation
- knowledge-distillation
datasets:
- zai-org/LongWriter-6k
base_model:
- reaperdoesntknow/DiStil-Qwen3-1.7B-uncensored
---

# DualMind

**Single Architecture, Dual Cognition — The Multi-Model Collision Array on Shared Weights**

*Convergent Intelligence LLC: Research Division*

---

## What This Is

DualMind is a 1.7B parameter model that implements **dual-mental-modality reasoning** — a single model with two internal voices sharing the same weights, differentiated only by role tokens:

- **`<explore>`** — Unconstrained reasoning. Derivation, speculation, working through the problem freely.
- **`<examine>`** — Adversarial self-response. The model reads its own explore output and critiques it. Error detection, verification, refinement.
- **`<response>`** — Clean synthesis. The final answer distilled from the internal dialogue.

This is the multi-model collision array collapsed into a single architecture. The dialectical structure that produces novel insights from architectural diversity (demonstrated in our [five-architecture collision experiments](https://huggingface.co/reaperdoesntknow)) is recreated through role-conditioned generation on shared weights.

## Architecture

| Parameter | Value |
|-----------|-------|
| Architecture | Qwen3ForCausalLM |
| Parameters | ~2.03B (1.7B effective) |
| Hidden Size | 2048 |
| Layers | 28 |
| Attention Heads | 16 (Q) / 8 (KV) — GQA |
| Context Length | 40,960 tokens |
| Precision | BF16 (trained on H100) |

## Training

**Base model:** [Disctil-Qwen3-1.7B](https://huggingface.co/reaperdoesntknow/Disctil-Qwen3-1.7B) (DISC-refined uncensored Qwen3)

**Dataset:** [KK04/LogicInference_OA](https://huggingface.co/datasets/KK04/LogicInference_OA) — Logical inference problems transformed into the DualMind cognitive loop format.

**Training format:** Each CoT solution is restructured into the DualMind format:
- Derivation sentences → `<explore>` block (reasoning phase)
- Verification/checking sentences → `<examine>` block (self-critique phase)
- Final answer → `<response>` block (synthesis)

Sentence-level splitting uses trigger-word detection ("check", "verify", "however", "but wait", etc.) to find the natural transition from reasoning to verification, falling back to a 70/30 positional split when no trigger is found.
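
The splitting heuristic can be sketched as follows. This is an illustration of the described approach, not the actual training script; the trigger list and function names are assumptions.

```python
import re

# Illustrative trigger words marking the shift from derivation to verification;
# the real training script's list may differ.
TRIGGERS = ("check", "verify", "however", "but wait", "double-check")

def split_cot(solution: str) -> tuple[str, str]:
    """Split a CoT solution into (explore, examine) at the first
    verification trigger, with a ~70/30 positional fallback."""
    sentences = re.split(r"(?<=[.!?])\s+", solution.strip())
    for i, sent in enumerate(sentences):
        # i > 0 guard: never leave the explore block empty
        if i > 0 and any(t in sent.lower() for t in TRIGGERS):
            return " ".join(sentences[:i]), " ".join(sentences[i:])
    # Fallback: first ~70% of sentences -> explore, rest -> examine
    cut = max(1, round(len(sentences) * 0.7))
    return " ".join(sentences[:cut]), " ".join(sentences[cut:])

def to_dualmind(question: str, solution: str, answer: str) -> str:
    """Wrap a split solution in the DualMind cognitive-loop format."""
    explore, examine = split_cot(solution)
    return (
        f"##USER:\n{question}\n\n"
        f"<explore>\n{explore}\n</explore>\n\n"
        f"<examine>\n{examine}\n</examine>\n\n"
        f"<response>\n{answer}\n</response>"
    )
```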

**Hardware & hyperparameters:** Colab H100, BF16 precision; 512 training steps at learning rate 5e-6, supervised fine-tuning via TRL.
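
A minimal TRL setup matching these numbers might look like the sketch below. This is a reconstruction from the hyperparameters stated above, not the released training script; the batch size and output directory are assumptions, and the dataset is expected to already be mapped into the DualMind format.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Hypothetical reconstruction of the run described above (512 steps, lr 5e-6, BF16).
dataset = load_dataset("KK04/LogicInference_OA", split="train")
# ...map each row into the <explore>/<examine>/<response> format here...

config = SFTConfig(
    output_dir="dualmind-sft",          # assumed; not stated in the card
    max_steps=512,
    learning_rate=5e-6,
    bf16=True,
    per_device_train_batch_size=4,      # assumed; not stated in the card
)
trainer = SFTTrainer(
    model="reaperdoesntknow/Disctil-Qwen3-1.7B",
    args=config,
    train_dataset=dataset,
)
trainer.train()
```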

**Next iteration:** Currently training on [Crownelius/Opus-4.6-Reasoning-3300x](https://huggingface.co/datasets/Crownelius/Opus-4.6-Reasoning-3300x) — 2,160 Claude Opus 4.6 reasoning samples with pre-separated `thinking`/`solution` columns, eliminating the need for heuristic splitting.

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "reaperdoesntknow/DualMind",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("reaperdoesntknow/DualMind")

# Start the explore block — the model completes the full loop
prompt = (
    "##USER:\n"
    "Prove that the sum of two even numbers is always even.\n\n"
    "<explore>\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=1024,
    do_sample=True,
    top_p=0.9,
    temperature=0.6,
    repetition_penalty=1.15,
)
result = tokenizer.decode(output[0], skip_special_tokens=True)
print(result)
```

### Expected Output Structure

```
<explore>
[The model works through the proof freely — definitions, algebraic manipulation, etc.]
</explore>

<examine>
[The model critiques its own derivation — checks for gaps, verifies steps, catches errors]
</examine>

<response>
[Clean final answer synthesized from the internal dialogue]
</response>
```
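
Because the tags are plain text, the final answer can be pulled out of a generation with a simple regex. This helper is illustrative (not part of the released code) and assumes the model emitted a well-formed `<response>` block:

```python
import re

def extract_response(generation: str) -> "str | None":
    """Return the contents of the <response> block, or None if absent."""
    match = re.search(r"<response>\s*(.*?)\s*</response>", generation, re.DOTALL)
    return match.group(1) if match else None

sample = (
    "<explore>Let a = 2m and b = 2n...</explore>\n"
    "<examine>The algebra checks out; no gaps.</examine>\n"
    "<response>a + b = 2(m + n), which is even.</response>"
)
print(extract_response(sample))  # a + b = 2(m + n), which is even.
```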

## Why Dual Modality

Standard CoT prompting produces a single stream of reasoning. The model has one shot to get it right. DualMind gives the model a structural mechanism for self-correction:

1. **Explore** is free to make mistakes, speculate, and try approaches that might not work
2. **Examine** reads the explore output adversarially — it's looking for errors, not confirming correctness
3. **Response** has the benefit of both perspectives

This mirrors what happens in multi-model collision arrays where different architectures produce genuinely different failure modes, and the collision between them surfaces structure that neither achieves alone. DualMind recreates this dynamic within a single set of weights through role conditioning.
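
One way to make the three phases explicit at inference time is to drive generation phase by phase, stopping at each closing tag. The driver below is an illustration, not part of the released code: it abstracts the model behind a `generate_fn` callable (e.g. a thin wrapper around `model.generate(..., stop_strings=[stop], tokenizer=tokenizer)`).

```python
PHASES = ("explore", "examine", "response")

def run_dualmind_loop(generate_fn, question: str) -> dict:
    """Run the explore -> examine -> response loop one phase at a time.

    generate_fn(prompt, stop=...) must return the model's continuation up to
    (but not including) the stop string.
    """
    context = f"##USER:\n{question}\n\n"
    result = {}
    for phase in PHASES:
        context += f"<{phase}>\n"                      # open the phase tag
        body = generate_fn(context, stop=f"</{phase}>")
        result[phase] = body.strip()
        context += f"{body}</{phase}>\n\n"             # close it and carry forward
    return result
```

Because `examine` is generated with the full `explore` text already in context, the self-critique conditions on the model's own draft exactly as in training.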

## Distillation Chain

```
Qwen3-1.7B (base)
  → DiStil-Qwen3-1.7B-uncensored (uncensored SFT)
    → Disctil-Qwen3-1.7B (DISC refinement)
      → DualMind (dual-cognition SFT on LogicInference_OA) ← you are here
```


## Mathematical Foundations: Discrepancy Calculus (DISC)

DualMind's dual-cognition architecture connects to Discrepancy Calculus through **Continuous Thought Dynamics** (Ch. 19 of the DISC monograph), which models inference as a discrepancy-guided PDE where the explore→examine→response cycle corresponds to a controlled trajectory through cognitive phase space.

The discrepancy operator:

$$Df(x) = \lim_{\varepsilon \downarrow 0} \frac{1}{\varepsilon} \int_x^{x+\varepsilon} \frac{|f(t) - f(x)|}{|t - x|}\, dt$$

quantifies the mismatch between what the model generates (integration) and what it should generate (differentiation). The `<explore>` phase increases discrepancy energy freely; `<examine>` applies the Adaptive Discrepancy Derivative (ADD, Ch. 14) to detect drift; `<response>` minimizes residual discrepancy into a clean output. The three phases implement the BV decomposition operationally: smooth reasoning, jump corrections at error boundaries, and Cantor-type refinement of subtle drift.
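
For differentiable $f$ the integrand $|f(t)-f(x)|/|t-x|$ tends to $|f'(x)|$ as $t \to x$, so on smooth regions $Df(x)$ reduces to the ordinary derivative's magnitude; the operator only departs from it at rough points. A numerical sketch of the limit (an illustration, not code from the DISC monograph):

```python
def discrepancy(f, x: float, eps: float = 1e-4, n: int = 1000) -> float:
    """Approximate Df(x) = (1/eps) * integral_x^{x+eps} |f(t)-f(x)|/|t-x| dt
    for small eps, using a midpoint Riemann sum on [x, x+eps]."""
    h = eps / n
    total = 0.0
    for i in range(n):
        t = x + (i + 0.5) * h          # midpoint avoids the t = x singularity
        total += abs(f(t) - f(x)) / abs(t - x) * h
    return total / eps

# For smooth f, Df(x) matches |f'(x)|: here f(x) = x**2, so Df(2) ≈ |f'(2)| = 4.
print(round(discrepancy(lambda x: x * x, 2.0), 3))  # → 4.0
```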

Full theory: *"On the Formal Analysis of Discrepancy Calculus"* (Colca, 2026; Convergent Intelligence LLC: Research Division).

## Related Models

| Model | Description | Downloads |
|-------|-------------|-----------|
| [TopologicalQwen](https://huggingface.co/reaperdoesntknow/TopologicalQwen) | TKD + DualMind on physics CoT | 622 |
| [Disctil-Qwen3-1.7B](https://huggingface.co/reaperdoesntknow/Disctil-Qwen3-1.7B) | Parent model (DISC-refined) | 286 |
| [Qwen3-1.7B-Thinking-Distil](https://huggingface.co/reaperdoesntknow/Qwen3-1.7B-Thinking-Distil) | TKD with Thinking teacher | 687 |

**[DualMind Collection](https://huggingface.co/collections/reaperdoesntknow/dualmind)** — Dual-cognition model series

**[DistilQwen Collection](https://huggingface.co/collections/reaperdoesntknow/distilqwen-69bf40ec669117e3f069ef1c)** — Full proof-weighted distillation series

Full methodology: [Structure Over Scale (DOI: 10.57967/hf/8165)](https://doi.org/10.57967/hf/8165)

## Citation

```bibtex
@misc{colca2026dualmind,
  title={DualMind: Dual-Mental-Modality Reasoning via Role-Conditioned Self-Critique},
  author={Colca, Roy S.},
  year={2026},
  publisher={HuggingFace},
  url={https://huggingface.co/reaperdoesntknow/DualMind},
  note={Convergent Intelligence LLC: Research Division}
}
```

---

*Convergent Intelligence LLC: Research Division*
*"Where classical analysis fails to see, we begin."*
<!-- cix-keeper-ts:2026-03-30T12:05:00Z -->
<!-- card-refresh: 2026-03-30 -->