| --- |
| license: apache-2.0 |
| library_name: transformers |
| pipeline_tag: text-generation |
| tags: |
| - qwen3 |
| - sft |
| - trl |
| - dual-mind |
| - reasoning |
| - convergent-intelligence |
| - explore-examine-response |
| - convergentintel |
| - edge |
| - distillation |
| - knowledge-distillation |
| datasets: |
| - zai-org/LongWriter-6k |
| base_model: |
| - reaperdoesntknow/DiStil-Qwen3-1.7B-uncensored |
| --- |
| |
| # DualMind |
|
|
| **Single Architecture, Dual Cognition — The Multi-Model Collision Array on Shared Weights** |
|
|
| *Convergent Intelligence LLC: Research Division* |
|
|
| --- |
|
|
| ## What This Is |
|
|
| DualMind is a 1.7B parameter model that implements **dual-mental-modality reasoning** — a single model with two internal voices sharing the same weights, differentiated only by role tokens: |
|
|
| - **`<explore>`** — Unconstrained reasoning. Derivation, speculation, working through the problem freely. |
- **`<examine>`** — Adversarial self-critique. The model reads its own explore output and challenges it: error detection, verification, refinement.
| - **`<response>`** — Clean synthesis. The final answer distilled from the internal dialogue. |
|
|
| This is the multi-model collision array collapsed into a single architecture. The dialectical structure that produces novel insights from architectural diversity (demonstrated in our [five-architecture collision experiments](https://huggingface.co/reaperdoesntknow)) is recreated through role-conditioned generation on shared weights. |
|
|
| ## Architecture |
|
|
| | Parameter | Value | |
| |-----------|-------| |
| | Architecture | Qwen3ForCausalLM | |
| | Parameters | ~2.03B (1.7B effective) | |
| | Hidden Size | 2048 | |
| | Layers | 28 | |
| | Attention Heads | 16 (Q) / 8 (KV) — GQA | |
| | Context Length | 40,960 tokens | |
| | Precision | BF16 (trained on H100) | |
|
|
| ## Training |
|
|
| **Base model:** [Disctil-Qwen3-1.7B](https://huggingface.co/reaperdoesntknow/Disctil-Qwen3-1.7B) (DISC-refined uncensored Qwen3) |
|
|
| **Dataset:** [KK04/LogicInference_OA](https://huggingface.co/datasets/KK04/LogicInference_OA) — Logical inference problems transformed into the DualMind cognitive loop format. |
|
|
| **Training format:** Each CoT solution is restructured into the DualMind format: |
| - Derivation sentences → `<explore>` block (reasoning phase) |
| - Verification/checking sentences → `<examine>` block (self-critique phase) |
| - Final answer → `<response>` block (synthesis) |
|
|
Sentence-level splitting uses trigger-word detection (*check*, *verify*, *however*, *but wait*, etc.) to locate the natural transition from reasoning to verification, falling back to a 70/30 positional split when no trigger is present.
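A minimal sketch of that heuristic, assuming a simple sentence tokenizer and an illustrative trigger list (the function name and exact cue words are not the actual training code):

```python
import re

# Cue words that typically mark the shift from derivation to verification.
# Illustrative list -- the real pipeline may use a different set.
TRIGGERS = ("check", "verify", "however", "but wait", "let me confirm")

def split_cot(solution: str, fallback_ratio: float = 0.7):
    """Split a chain-of-thought solution into (explore, examine) halves."""
    sentences = re.split(r"(?<=[.!?])\s+", solution.strip())
    for i, sent in enumerate(sentences):
        # Skip the first sentence so the explore block is never empty.
        if i > 0 and any(t in sent.lower() for t in TRIGGERS):
            return " ".join(sentences[:i]), " ".join(sentences[i:])
    # No trigger found: fall back to a 70/30 positional split.
    cut = max(1, round(len(sentences) * fallback_ratio))
    return " ".join(sentences[:cut]), " ".join(sentences[cut:])
```

The two returned halves then land in the `<explore>` and `<examine>` blocks respectively, with the dataset's final answer wrapped in `<response>`.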
|
|
| **Hardware:** Colab H100, BF16 precision. 512 steps, lr 5e-6, SFT via TRL. |
|
|
| **Next iteration:** Currently training on [Crownelius/Opus-4.6-Reasoning-3300x](https://huggingface.co/datasets/Crownelius/Opus-4.6-Reasoning-3300x) — 2,160 Claude Opus 4.6 reasoning samples with pre-separated `thinking`/`solution` columns, eliminating the need for heuristic splitting. |
|
|
| ## Usage |
|
|
| ```python |
| from transformers import AutoModelForCausalLM, AutoTokenizer |
| |
| model = AutoModelForCausalLM.from_pretrained( |
| "reaperdoesntknow/DualMind", |
| torch_dtype="auto", |
| device_map="auto" |
| ) |
| tokenizer = AutoTokenizer.from_pretrained("reaperdoesntknow/DualMind") |
| |
| # Start the explore block — the model completes the full loop |
| prompt = ( |
| "##USER:\n" |
| "Prove that the sum of two even numbers is always even.\n\n" |
| "<explore>\n" |
| ) |
| |
| inputs = tokenizer(prompt, return_tensors="pt").to(model.device) |
| output = model.generate( |
| **inputs, |
| max_new_tokens=1024, |
| do_sample=True, |
| top_p=0.9, |
| temperature=0.6, |
| repetition_penalty=1.15, |
| ) |
| result = tokenizer.decode(output[0], skip_special_tokens=True) |
| print(result) |
| ``` |
|
|
| ### Expected Output Structure |
|
|
| ``` |
| <explore> |
| [The model works through the proof freely — definitions, algebraic manipulation, etc.] |
| </explore> |
| |
| <examine> |
| [The model critiques its own derivation — checks for gaps, verifies steps, catches errors] |
| </examine> |
| |
| <response> |
| [Clean final answer synthesized from the internal dialogue] |
| </response> |
| ``` |
|
|
| ## Why Dual Modality |
|
|
| Standard CoT prompting produces a single stream of reasoning. The model has one shot to get it right. DualMind gives the model a structural mechanism for self-correction: |
|
|
| 1. **Explore** is free to make mistakes, speculate, and try approaches that might not work |
| 2. **Examine** reads the explore output adversarially — it's looking for errors, not confirming correctness |
| 3. **Response** has the benefit of both perspectives |
|
|
| This mirrors what happens in multi-model collision arrays where different architectures produce genuinely different failure modes, and the collision between them surfaces structure that neither achieves alone. DualMind recreates this dynamic within a single set of weights through role conditioning. |
|
|
| ## Distillation Chain |
|
|
| ``` |
| Qwen3-1.7B (base) |
| → DiStil-Qwen3-1.7B-uncensored (uncensored SFT) |
| → Disctil-Qwen3-1.7B (DISC refinement) |
  → DualMind (dual-modality SFT on LogicInference_OA) ← you are here
| ``` |
|
|
|
|
| ## Mathematical Foundations: Discrepancy Calculus (DISC) |
|
|
| DualMind's dual-cognition architecture connects to Discrepancy Calculus through **Continuous Thought Dynamics** (Ch. 19 of the DISC monograph) — which models inference as a discrepancy-guided PDE where the explore→examine→respond cycle corresponds to a controlled trajectory through cognitive phase space. |
|
|
| The discrepancy operator: |
|
|
| $$Df(x) = \lim_{\varepsilon \downarrow 0} \frac{1}{\varepsilon} \int_x^{x+\varepsilon} \frac{|f(t) - f(x)|}{|t - x|}\, dt$$ |
|
|
| quantifies the mismatch between what the model generates (integration) and what it should generate (differentiation). The `<explore>` phase increases discrepancy energy freely; `<examine>` applies the Adaptive Discrepancy Derivative (ADD, Ch. 14) to detect drift; `<response>` minimizes residual discrepancy into a clean output. The three phases implement the BV decomposition operationally: smooth reasoning, jump corrections at error boundaries, and Cantor-type refinement of subtle drift. |
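For smooth $f$, the integrand $|f(t) - f(x)|/|t - x|$ tends to $|f'(x)|$ as $t \to x$, so the operator reduces to $|f'(x)|$ on differentiable functions. A quick numerical sketch of the definition (illustrative code, not from the monograph; midpoint rule, which sidesteps the removable singularity at $t = x$):

```python
import math

def discrepancy(f, x, eps=1e-3, n=200):
    """Approximate Df(x) = (1/eps) * integral over [x, x+eps] of
    |f(t) - f(x)| / |t - x| dt, for a small fixed eps.

    Midpoint rule on n subintervals; midpoints never hit t = x.
    """
    h = eps / n
    total = 0.0
    for k in range(n):
        t = x + (k + 0.5) * h          # midpoint of the k-th subinterval
        total += abs(f(t) - f(x)) / abs(t - x)
    return total * h / eps

# For smooth f the operator recovers |f'(x)|:
print(discrepancy(math.sin, 0.5))      # ~ |cos(0.5)| ≈ 0.8776
```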
|
|
| Full theory: *"On the Formal Analysis of Discrepancy Calculus"* (Colca, 2026; Convergent Intelligence LLC: Research Division). |
|
|
| ## Related Models |
|
|
| | Model | Description | Downloads | |
| |-------|-------------|-----------| |
| | [TopologicalQwen](https://huggingface.co/reaperdoesntknow/TopologicalQwen) | TKD + DualMind on physics CoT | 622 | |
| | [Disctil-Qwen3-1.7B](https://huggingface.co/reaperdoesntknow/Disctil-Qwen3-1.7B) | Parent model (DISC-refined) | 286 | |
| | [Qwen3-1.7B-Thinking-Distil](https://huggingface.co/reaperdoesntknow/Qwen3-1.7B-Thinking-Distil) | TKD with Thinking teacher | 687 | |
|
|
| **[DualMind Collection](https://huggingface.co/collections/reaperdoesntknow/dualmind)** — Dual-cognition model series |
|
|
| **[DistilQwen Collection](https://huggingface.co/collections/reaperdoesntknow/distilqwen-69bf40ec669117e3f069ef1c)** — Full proof-weighted distillation series |
|
|
| Full methodology: [Structure Over Scale (DOI: 10.57967/hf/8165)](https://doi.org/10.57967/hf/8165) |
|
|
| ## Citation |
|
|
| ```bibtex |
| @misc{colca2026dualmind, |
| title={DualMind: Dual-Mental-Modality Reasoning via Role-Conditioned Self-Critique}, |
| author={Colca, Roy S.}, |
| year={2026}, |
| publisher={HuggingFace}, |
| url={https://huggingface.co/reaperdoesntknow/DualMind}, |
| note={Convergent Intelligence LLC: Research Division} |
| } |
| ``` |
|
|
| --- |
|
|
| *Convergent Intelligence LLC: Research Division* |
| *"Where classical analysis fails to see, we begin."* |