| --- |
| license: apache-2.0 |
| library_name: transformers |
| pipeline_tag: text-generation |
| tags: |
| - qwen3 |
| - sft |
| - trl |
| - dual-mind |
| - reasoning |
| - convergent-intelligence |
| - explore-examine-response |
| - convergentintel |
| - edge |
| - distillation |
| - knowledge-distillation |
| datasets: |
| - zai-org/LongWriter-6k |
| base_model: |
| - reaperdoesntknow/DiStil-Qwen3-1.7B-uncensored |
| --- |
| |
| # DualMind |
|
|
| **Single Architecture, Dual Cognition — The Multi-Model Collision Array on Shared Weights** |
|
|
| *Convergent Intelligence LLC: Research Division* |
|
|
| --- |
|
|
| ## What This Is |
|
|
| DualMind is a 1.7B parameter model that implements **dual-mental-modality reasoning** — a single model with two internal voices sharing the same weights, differentiated only by role tokens: |
|
|
| - **`<explore>`** — Unconstrained reasoning. Derivation, speculation, working through the problem freely. |
- **`<examine>`** — Adversarial self-critique. The model reads its own explore output and challenges it: error detection, verification, refinement.
| - **`<response>`** — Clean synthesis. The final answer distilled from the internal dialogue. |
|
|
| This is the multi-model collision array collapsed into a single architecture. The dialectical structure that produces novel insights from architectural diversity (demonstrated in our [five-architecture collision experiments](https://huggingface.co/reaperdoesntknow)) is recreated through role-conditioned generation on shared weights. |
|
|
| ## Architecture |
|
|
| | Parameter | Value | |
| |-----------|-------| |
| | Architecture | Qwen3ForCausalLM | |
| | Parameters | ~2.03B (1.7B effective) | |
| | Hidden Size | 2048 | |
| | Layers | 28 | |
| | Attention Heads | 16 (Q) / 8 (KV) — GQA | |
| | Context Length | 40,960 tokens | |
| | Precision | BF16 (trained on H100) | |
|
|
| ## Training |
|
|
| **Base model:** [Disctil-Qwen3-1.7B](https://huggingface.co/reaperdoesntknow/Disctil-Qwen3-1.7B) (DISC-refined uncensored Qwen3) |
|
|
| **Dataset:** [KK04/LogicInference_OA](https://huggingface.co/datasets/KK04/LogicInference_OA) — Logical inference problems transformed into the DualMind cognitive loop format. |
|
|
| **Training format:** Each CoT solution is restructured into the DualMind format: |
| - Derivation sentences → `<explore>` block (reasoning phase) |
| - Verification/checking sentences → `<examine>` block (self-critique phase) |
| - Final answer → `<response>` block (synthesis) |
|
|
Sentence-level splitting uses trigger-word detection (*check*, *verify*, *however*, *but wait*, etc.) to locate the natural transition from reasoning to verification, falling back to a 70/30 positional split when no trigger is present.
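A minimal sketch of that heuristic, assuming a simple sentence tokenizer and an illustrative trigger list (the function name and exact cue words are not the actual training code):

```python
import re

# Cue words that typically mark the shift from derivation to verification.
# Illustrative list -- the real pipeline may use a different set.
TRIGGERS = ("check", "verify", "however", "but wait", "let me confirm")

def split_cot(solution: str, fallback_ratio: float = 0.7):
    """Split a chain-of-thought solution into (explore, examine) halves."""
    sentences = re.split(r"(?<=[.!?])\s+", solution.strip())
    for i, sent in enumerate(sentences):
        # Skip the first sentence so the explore block is never empty.
        if i > 0 and any(t in sent.lower() for t in TRIGGERS):
            return " ".join(sentences[:i]), " ".join(sentences[i:])
    # No trigger found: fall back to a 70/30 positional split.
    cut = max(1, round(len(sentences) * fallback_ratio))
    return " ".join(sentences[:cut]), " ".join(sentences[cut:])
```

The two returned halves then land in the `<explore>` and `<examine>` blocks respectively, with the dataset's final answer wrapped in `<response>`.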
|
|
| **Hardware:** Colab H100, BF16 precision. 512 steps, lr 5e-6, SFT via TRL. |
|
|
| **Next iteration:** Currently training on [Crownelius/Opus-4.6-Reasoning-3300x](https://huggingface.co/datasets/Crownelius/Opus-4.6-Reasoning-3300x) — 2,160 Claude Opus 4.6 reasoning samples with pre-separated `thinking`/`solution` columns, eliminating the need for heuristic splitting. |
|
|
| ## Usage |
|
|
| ```python |
| from transformers import AutoModelForCausalLM, AutoTokenizer |
| |
| model = AutoModelForCausalLM.from_pretrained( |
| "reaperdoesntknow/DualMind", |
| torch_dtype="auto", |
| device_map="auto" |
| ) |
| tokenizer = AutoTokenizer.from_pretrained("reaperdoesntknow/DualMind") |
| |
| # Start the explore block — the model completes the full loop |
| prompt = ( |
| "##USER:\n" |
| "Prove that the sum of two even numbers is always even.\n\n" |
| "<explore>\n" |
| ) |
| |
| inputs = tokenizer(prompt, return_tensors="pt").to(model.device) |
| output = model.generate( |
| **inputs, |
| max_new_tokens=1024, |
| do_sample=True, |
| top_p=0.9, |
| temperature=0.6, |
| repetition_penalty=1.15, |
| ) |
| result = tokenizer.decode(output[0], skip_special_tokens=True) |
| print(result) |
| ``` |
|
|
| ### Expected Output Structure |
|
|
| ``` |
| <explore> |
| [The model works through the proof freely — definitions, algebraic manipulation, etc.] |
| </explore> |
| |
| <examine> |
| [The model critiques its own derivation — checks for gaps, verifies steps, catches errors] |
| </examine> |
| |
| <response> |
| [Clean final answer synthesized from the internal dialogue] |
| </response> |
| ``` |
|
|
| ## Why Dual Modality |
|
|
| Standard CoT prompting produces a single stream of reasoning. The model has one shot to get it right. DualMind gives the model a structural mechanism for self-correction: |
|
|
| 1. **Explore** is free to make mistakes, speculate, and try approaches that might not work |
| 2. **Examine** reads the explore output adversarially — it's looking for errors, not confirming correctness |
| 3. **Response** has the benefit of both perspectives |
|
|
| This mirrors what happens in multi-model collision arrays where different architectures produce genuinely different failure modes, and the collision between them surfaces structure that neither achieves alone. DualMind recreates this dynamic within a single set of weights through role conditioning. |
|
|
| ## Distillation Chain |
|
|
| ``` |
| Qwen3-1.7B (base) |
| → DiStil-Qwen3-1.7B-uncensored (uncensored SFT) |
| → Disctil-Qwen3-1.7B (DISC refinement) |
  → DualMind (dual-modality SFT on LogicInference_OA) ← you are here
| ``` |
|
|
|
|
| ## Mathematical Foundations: Discrepancy Calculus (DISC) |
|
|
| DualMind's dual-cognition architecture connects to Discrepancy Calculus through **Continuous Thought Dynamics** (Ch. 19 of the DISC monograph) — which models inference as a discrepancy-guided PDE where the explore→examine→respond cycle corresponds to a controlled trajectory through cognitive phase space. |
|
|
| The discrepancy operator: |
|
|
| $$Df(x) = \lim_{\varepsilon \downarrow 0} \frac{1}{\varepsilon} \int_x^{x+\varepsilon} \frac{|f(t) - f(x)|}{|t - x|}\, dt$$ |
|
|
| quantifies the mismatch between what the model generates (integration) and what it should generate (differentiation). The `<explore>` phase increases discrepancy energy freely; `<examine>` applies the Adaptive Discrepancy Derivative (ADD, Ch. 14) to detect drift; `<response>` minimizes residual discrepancy into a clean output. The three phases implement the BV decomposition operationally: smooth reasoning, jump corrections at error boundaries, and Cantor-type refinement of subtle drift. |
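For smooth $f$, the integrand $|f(t) - f(x)|/|t - x|$ tends to $|f'(x)|$ as $t \to x$, so the operator reduces to $|f'(x)|$ on differentiable functions. A quick numerical sketch of the definition (illustrative code, not from the monograph; midpoint rule, which sidesteps the removable singularity at $t = x$):

```python
import math

def discrepancy(f, x, eps=1e-3, n=200):
    """Approximate Df(x) = (1/eps) * integral over [x, x+eps] of
    |f(t) - f(x)| / |t - x| dt, for a small fixed eps.

    Midpoint rule on n subintervals; midpoints never hit t = x.
    """
    h = eps / n
    total = 0.0
    for k in range(n):
        t = x + (k + 0.5) * h          # midpoint of the k-th subinterval
        total += abs(f(t) - f(x)) / abs(t - x)
    return total * h / eps

# For smooth f the operator recovers |f'(x)|:
print(discrepancy(math.sin, 0.5))      # ~ |cos(0.5)| ≈ 0.8776
```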
|
|
| Full theory: *"On the Formal Analysis of Discrepancy Calculus"* (Colca, 2026; Convergent Intelligence LLC: Research Division). |
|
|
| ## Related Models |
|
|
| | Model | Description | Downloads | |
| |-------|-------------|-----------| |
| | [TopologicalQwen](https://huggingface.co/reaperdoesntknow/TopologicalQwen) | TKD + DualMind on physics CoT | 622 | |
| | [Disctil-Qwen3-1.7B](https://huggingface.co/reaperdoesntknow/Disctil-Qwen3-1.7B) | Parent model (DISC-refined) | 286 | |
| | [Qwen3-1.7B-Thinking-Distil](https://huggingface.co/reaperdoesntknow/Qwen3-1.7B-Thinking-Distil) | TKD with Thinking teacher | 687 | |
|
|
| **[DualMind Collection](https://huggingface.co/collections/reaperdoesntknow/dualmind)** — Dual-cognition model series |
|
|
| **[DistilQwen Collection](https://huggingface.co/collections/reaperdoesntknow/distilqwen-69bf40ec669117e3f069ef1c)** — Full proof-weighted distillation series |
|
|
| Full methodology: [Structure Over Scale (DOI: 10.57967/hf/8165)](https://doi.org/10.57967/hf/8165) |
|
|
| ## Citation |
|
|
| ```bibtex |
| @misc{colca2026dualmind, |
| title={DualMind: Dual-Mental-Modality Reasoning via Role-Conditioned Self-Critique}, |
| author={Colca, Roy S.}, |
| year={2026}, |
| publisher={HuggingFace}, |
| url={https://huggingface.co/reaperdoesntknow/DualMind}, |
| note={Convergent Intelligence LLC: Research Division} |
| } |
| ``` |
|
|
| --- |
|
|
| *Convergent Intelligence LLC: Research Division* |
| *"Where classical analysis fails to see, we begin."* |