Update README.md

README.md +113 -155

@@ -1,201 +1,159 @@
 ---
 library_name: transformers
 tags:
-- trl
 - sft
 ---
-# Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
-## Model Details
-### Model Description
-<!-- Provide a longer summary of what this model is. -->
-This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
-## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
-### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
-### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
-## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
-### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-## How to Get Started with the Model
-Use the code below to get started with the model.
-[More Information Needed]
-## Training Details
-### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
-### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
-#### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
-## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
-### Testing Data, Factors & Metrics
-#### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
-#### Factors
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-[More Information Needed]
-#### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-[More Information Needed]
-### Results
-[More Information Needed]
-#### Summary
-## Model Examination [optional]
-<!-- Relevant interpretability work for the model goes here -->
-[More Information Needed]
-## Environmental Impact
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-## Technical Specifications [optional]
-### Model Architecture and Objective
-[More Information Needed]
-### Compute Infrastructure
-[More Information Needed]
-#### Hardware
-[More Information Needed]
-#### Software
-[More Information Needed]
-## Citation [optional]
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-**BibTeX:**
-[More Information Needed]
-**APA:**
-[More Information Needed]
-## Glossary [optional]
-<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-[More Information Needed]
-## More Information [optional]
-[More Information Needed]
-## Model Card Authors [optional]
-[More Information Needed]
-## Model Card Contact
-[More Information Needed]

 ---
+license: apache-2.0
 library_name: transformers
+pipeline_tag: text-generation
 tags:
+- qwen3
 - sft
+- trl
+- dual-mind
+- reasoning
+- convergent-intelligence
+- explore-examine-response
 ---
+# DualMind
+**Single Architecture, Dual Cognition — The Multi-Model Collision Array on Shared Weights**
+*Convergent Intelligence LLC: Research Division*
+---
+## What This Is
+DualMind is a 1.7B parameter model that implements **dual-mental-modality reasoning** — a single model with two internal voices sharing the same weights, differentiated only by role tokens:
+- **`<explore>`** — Unconstrained reasoning. Derivation, speculation, working through the problem freely.
+- **`<examine>`** — Adversarial self-response. The model reads its own explore output and critiques it. Error detection, verification, refinement.
+- **`<response>`** — Clean synthesis. The final answer distilled from the internal dialogue.
+This is the multi-model collision array collapsed into a single architecture. The dialectical structure that produces novel insights from architectural diversity (demonstrated in our [five-architecture collision experiments](https://huggingface.co/reaperdoesntknow)) is recreated through role-conditioned generation on shared weights.
+## Architecture
+| Parameter | Value |
+|-----------|-------|
+| Architecture | Qwen3ForCausalLM |
+| Parameters | ~2.03B (1.7B effective) |
+| Hidden Size | 2048 |
+| Layers | 28 |
+| Attention Heads | 16 (Q) / 8 (KV) — GQA |
+| Context Length | 40,960 tokens |
+| Precision | BF16 (trained on H100) |
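The table's figures are enough for a rough estimate of KV-cache memory, which is where grouped-query attention (GQA) pays off. The sketch below is back-of-the-envelope arithmetic, not a measurement. It assumes the per-head dimension equals hidden size divided by the number of query heads (the model's config may set it explicitly), and the function name is hypothetical.

```python
# Values taken from the architecture table above.
HIDDEN_SIZE = 2048
NUM_LAYERS = 28
NUM_Q_HEADS = 16
NUM_KV_HEADS = 8
CONTEXT_LENGTH = 40_960
BYTES_PER_VALUE = 2  # BF16 stores each value in 2 bytes

# Assumption: head_dim = hidden_size / query heads. Not stated in the README.
HEAD_DIM = HIDDEN_SIZE // NUM_Q_HEADS


def kv_cache_bytes(num_tokens: int, num_kv_heads: int = NUM_KV_HEADS) -> int:
    """Bytes needed to cache keys and values for `num_tokens` positions."""
    # Factor 2 covers the separate key and value tensors in every layer.
    per_token = 2 * NUM_LAYERS * num_kv_heads * HEAD_DIM * BYTES_PER_VALUE
    return per_token * num_tokens


if __name__ == "__main__":
    per_token = kv_cache_bytes(1)
    full = kv_cache_bytes(CONTEXT_LENGTH)
    no_gqa = kv_cache_bytes(CONTEXT_LENGTH, num_kv_heads=NUM_Q_HEADS)
    print(f"per token:    {per_token / 1024:.0f} KiB")
    print(f"full context: {full / 2**30:.3f} GiB")
    print(f"without GQA:  {no_gqa / 2**30:.3f} GiB")
```

Under these assumptions the cache costs 112 KiB per token and about 4.4 GiB at the full 40,960-token context, half of what 16 KV heads would need.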
+## Training
+**Base model:** [Disctil-Qwen3-1.7B](https://huggingface.co/reaperdoesntknow/Disctil-Qwen3-1.7B) (DISC-refined uncensored Qwen3)
+**Dataset:** [KK04/LogicInference_OA](https://huggingface.co/datasets/KK04/LogicInference_OA) — Logical inference problems transformed into the DualMind cognitive loop format.
+**Training format:** Each CoT solution is restructured into the DualMind format:
+- Derivation sentences → `<explore>` block (reasoning phase)
+- Verification/checking sentences → `<examine>` block (self-critique phase)
+- Final answer → `<response>` block (synthesis)
+Sentence-level splitting uses trigger detection (check, verify, however, but wait, etc.) to find the natural transition from reasoning to verification; when no trigger is found, it falls back to a 70/30 positional split.
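The splitting heuristic described above can be sketched as follows. This is an illustrative reconstruction, not the authors' preprocessing script. The trigger list contains only the four examples the README names, the sentence splitter is deliberately naive, and the function names are hypothetical. The fallback is assumed to send the first 70% of sentences to explore and the rest to examine.

```python
import re

# Assumption: only the triggers named in the README; the real list is longer ("etc.").
TRIGGERS = ("check", "verify", "however", "but wait")


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after ., ! or ? when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def find_split(sentences: list[str], explore_percent: int = 70) -> int:
    """Return the index of the first sentence belonging to the examine block."""
    for i, sentence in enumerate(sentences):
        lowered = sentence.lower()
        # Skip index 0 so the explore block is never empty.
        if i > 0 and any(trigger in lowered for trigger in TRIGGERS):
            return i
    # Positional fallback: first ~70% of sentences go to explore.
    cut = len(sentences) * explore_percent // 100
    return max(1, min(cut, len(sentences) - 1))


def to_dualmind(question: str, cot: str, answer: str) -> str:
    """Restructure one chain-of-thought sample into the three-block format."""
    sentences = split_sentences(cot)
    k = find_split(sentences)
    explore = " ".join(sentences[:k])
    examine = " ".join(sentences[k:])
    return (
        f"##USER:\n{question}\n\n"
        f"<explore>\n{explore}\n</explore>\n"
        f"<examine>\n{examine}\n</examine>\n"
        f"<response>\n{answer}\n</response>"
    )


if __name__ == "__main__":
    cot = "Let a = 2m and b = 2n. Then a + b = 2(m + n). Check: 2 + 4 = 6, which is even."
    print(to_dualmind("Is the sum of two even numbers even?", cot, "Yes."))
```

Substring matching means "check" also fires on "checking"; a stricter variant could match on word boundaries instead.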
+**Hardware and hyperparameters:** Colab H100, BF16 precision; 512 steps, learning rate 5e-6, SFT via TRL.
+**Next iteration:** Currently training on [Crownelius/Opus-4.6-Reasoning-3300x](https://huggingface.co/datasets/Crownelius/Opus-4.6-Reasoning-3300x) — 2,160 Claude Opus 4.6 reasoning samples with pre-separated `thinking`/`solution` columns, eliminating the need for heuristic splitting.
+## Usage
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+model = AutoModelForCausalLM.from_pretrained(
+    "reaperdoesntknow/DualMind",
+    torch_dtype="auto",
+    device_map="auto"
+)
+tokenizer = AutoTokenizer.from_pretrained("reaperdoesntknow/DualMind")
+# Start the explore block — the model completes the full loop
+prompt = (
+    "##USER:\n"
+    "Prove that the sum of two even numbers is always even.\n\n"
+    "<explore>\n"
+)
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+output = model.generate(
+    **inputs,
+    max_new_tokens=1024,
+    do_sample=True,
+    top_p=0.9,
+    temperature=0.6,
+    repetition_penalty=1.15,
+)
+result = tokenizer.decode(output[0], skip_special_tokens=True)
+print(result)
+```
+### Expected Output Structure
+```
+<explore>
+[The model works through the proof freely — definitions, algebraic manipulation, etc.]
+</explore>
+<examine>
+[The model critiques its own derivation — checks for gaps, verifies steps, catches errors]
+</examine>
+<response>
+[Clean final answer synthesized from the internal dialogue]
+</response>
+```
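Downstream code usually wants only the `<response>` block, or wants to log the three blocks separately. The helper below is a minimal sketch of one way to pull them out of the decoded text. `parse_dualmind` is a hypothetical name, not part of the model's tooling. It assumes the role tags survive decoding as plain text, which the README does not state. If the tags are registered as special tokens, decoding with `skip_special_tokens=True` would strip them.

```python
import re

ROLES = ("explore", "examine", "response")


def parse_dualmind(text: str) -> dict:
    """Extract the three role blocks from generated text.

    A block whose closing tag is missing (for example, generation stopped at
    max_new_tokens) runs to the end of the text. A role that never appears
    maps to None.
    """
    blocks = {}
    for role in ROLES:
        # Lazy match up to the closing tag, or to end of text if it is absent.
        match = re.search(rf"<{role}>(.*?)(?:</{role}>|\Z)", text, re.DOTALL)
        blocks[role] = match.group(1).strip() if match else None
    return blocks


if __name__ == "__main__":
    sample = (
        "##USER:\nProve that the sum of two even numbers is always even.\n\n"
        "<explore>\nLet a = 2m and b = 2n.\n</explore>\n"
        "<examine>\nThe factoring step holds for all integers.\n</examine>\n"
        "<response>\na + b = 2(m + n), so the sum is even."
    )
    for role, body in parse_dualmind(sample).items():
        print(f"[{role}] {body}")
```

Because the decoded output includes the prompt, the search simply picks up the `<explore>` tag that the prompt itself opened.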
+## Why Dual Modality
+Standard CoT prompting produces a single stream of reasoning. The model has one shot to get it right. DualMind gives the model a structural mechanism for self-correction:
+1. **Explore** is free to make mistakes, speculate, and try approaches that might not work
+2. **Examine** reads the explore output adversarially — it's looking for errors, not confirming correctness
+3. **Response** has the benefit of both perspectives
+This mirrors what happens in multi-model collision arrays where different architectures produce genuinely different failure modes, and the collision between them surfaces structure that neither achieves alone. DualMind recreates this dynamic within a single set of weights through role conditioning.
+## Distillation Chain
+```
+Qwen3-1.7B (base)
+  → DiStil-Qwen3-1.7B-uncensored (uncensored SFT)
+    → Disctil-Qwen3-1.7B (DISC refinement)
+      → DualMind (DualMind SFT on Opus 4.6 reasoning data) ← you are here
+```
+## Related Models
+| Model | Description | Downloads |
+|-------|-------------|-----------|
+| [TopologicalQwen](https://huggingface.co/reaperdoesntknow/TopologicalQwen) | TKD + DualMind on physics CoT | 622 |
+| [Disctil-Qwen3-1.7B](https://huggingface.co/reaperdoesntknow/Disctil-Qwen3-1.7B) | Parent model (DISC-refined) | 286 |
+| [Qwen3-1.7B-Thinking-Distil](https://huggingface.co/reaperdoesntknow/Qwen3-1.7B-Thinking-Distil) | TKD with Thinking teacher | 687 |
+**[DualMind Collection](https://huggingface.co/collections/reaperdoesntknow/dualmind)** — Dual-cognition model series
+**[DistilQwen Collection](https://huggingface.co/collections/reaperdoesntknow/distilqwen-69bf40ec669117e3f069ef1c)** — Full proof-weighted distillation series
+Full methodology: [Structure Over Scale (DOI: 10.57967/hf/8165)](https://doi.org/10.57967/hf/8165)
+## Citation
+```bibtex
+@misc{colca2026dualmind,
+  title={DualMind: Dual-Mental-Modality Reasoning via Role-Conditioned Self-Critique},
+  author={Colca, Roy S.},
+  year={2026},
+  publisher={HuggingFace},
+  url={https://huggingface.co/reaperdoesntknow/DualMind},
+  note={Convergent Intelligence LLC: Research Division}
+}
+```
+---
+*Convergent Intelligence LLC: Research Division*
+*"Where classical analysis fails to see, we begin."*