---
language: en
license: apache-2.0
library_name: peft
base_model: Qwen/Qwen2.5-1.5B
tags:
- lora
- peft
- cognitive-architecture
- progressive-learning
- dream-pruning
- svd
- math
- arithmetic
- number-sense
- tool-use
- metacognition
datasets:
- custom
pipeline_tag: text-generation
model-index:
- name: progressive-cognitive-dream-lora-en
  results:
  - task:
      type: text-generation
      name: Cognitive Arithmetic
    metrics:
    - type: exact_accuracy
      value: 69.4
      name: Exact Accuracy (%)
    - type: composite_score
      value: 87.6
      name: Composite Cognitive Score
---

# Progressive Cognitive Architecture — 1.5B Dream LoRA (English)

**🏆 Best overall model (composite 87.6/100)** — Qwen2.5-1.5B fine-tuned with 4-phase progressive training + SVD Dream Pruning.

## ✨ Highlights

| Metric | Score |
|--------|-------|
| **Composite Score** | **87.6** |
| Exact Accuracy | 69.4% ± 6.4 |
| Adversarial Robustness | 84.0% ± 8.0 |
| Delegation Accuracy | 100.0% ± 0.0 |
| Delegation Rate | 100.0% ± 0.0 |
| Magnitude Sense (OoM ± 1) | 100.0% ± 0.0 |
| Catastrophic Errors | **0.0% ± 0.0** |

> Results: mean ± std over 3 seeds (42, 43, 44), 50 samples × 5 dimensions per seed.

## 🔑 Key Findings

- **Outperforms all 3B variants** despite having half the parameters
- **Zero catastrophic errors** — never produces absurd results
- **100% delegation** — always routes complex operations to tools
- Dream pruning acts as **cognitive regularization** for capacity-constrained models

## 🧠 Progressive Cognitive Architecture

A bio-inspired 4-phase training methodology:

| Phase | Name | What happens |
|-------|------|--------------|
| 1 | **Foundation** | Learn exact arithmetic via LoRA fine-tuning |
| 2 | **Consolidation** | SVD Dream Pruning (rank 16 → 8) compresses knowledge into intuition |
| 3 | **Delegation** | Learn complexity-aware routing: compute internally vs. delegate to a tool |
| 4 | **Orchestration** | Full pipeline: intuit → route → tool → validate |

**Guiding Principle:** *Knowledge doesn't disappear — it collapses into attractors. Intuition is the compressed residue of experience.*
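The Phase-4 pipeline above can be sketched as a plain loop. This is a hypothetical illustration only: `intuit`, `route`, the magnitude check, and the routing threshold are all invented here for clarity, not the repository's actual API.

```python
import math

def intuit(a: int, b: int) -> int:
    """Step 1: order-of-magnitude estimate for a * b (the 'intuition')."""
    return int(math.log10(abs(a * b)))

def route(a: int, b: int) -> str:
    """Step 2: complexity-aware routing; easy products in-head, the rest delegated."""
    return "INTERNAL" if abs(a) < 100 and abs(b) < 100 else "DELEGATE"

def orchestrate(a: int, b: int):
    estimate = intuit(a, b)
    decision = route(a, b)
    result = a * b  # Step 3: the delegated tool call (exact arithmetic)
    # Step 4: validation; accept only results within one order of magnitude
    # of the initial estimate.
    valid = abs(int(math.log10(abs(result))) - estimate) <= 1
    return decision, result, valid

print(orchestrate(342, 67))  # ('DELEGATE', 22914, True)
```

The point of the validation step is that even when the tool is trusted, the model's own magnitude estimate acts as a sanity check against absurd results.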

## 🌙 Dream Pruning (SVD Low-Rank Factorization)

Instead of zeroing out small weights (magnitude pruning), Dream Pruning uses an **SVD decomposition** to reduce the effective rank of the LoRA matrices from 16 to 8. This preserves the principal directions (the "logical connections") while discarding noise — analogous to memory consolidation during sleep.

```
W = U·Σ·V^T  →  W' = U[:,:k] · Σ[:k,:k] · V^T[:k,:]   (k = 8)
```
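In code, the consolidation step amounts to a truncated SVD of the merged LoRA update. A minimal NumPy sketch, assuming a merged update `W = B @ A`; the function name and shapes are illustrative, not the released training code:

```python
import numpy as np

def dream_prune(W: np.ndarray, k: int = 8) -> np.ndarray:
    """Truncate W to its top-k singular directions (illustrative sketch)."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    # Keep only the k principal directions; discard the noisy tail.
    return U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]

# A rank-16 LoRA update W = B @ A collapses to an exact rank-8 matrix.
rng = np.random.default_rng(0)
B, A = rng.normal(size=(256, 16)), rng.normal(size=(16, 256))
W_pruned = dream_prune(B @ A, k=8)
print(np.linalg.matrix_rank(W_pruned))  # 8
```

Because the truncation keeps the largest singular values, it is the best rank-8 approximation of the original update in the least-squares sense.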

## 🔧 Training Configuration

| Parameter | Value |
|-----------|-------|
| Base Model | Qwen/Qwen2.5-1.5B |
| LoRA Rank | 16 (→ 8 after SVD) |
| LoRA Alpha | 32 |
| LoRA Targets | q_proj, k_proj, v_proj, o_proj |
| Dropout | 0.05 |
| Training Data | ~6,000 English arithmetic examples |
| Hardware | NVIDIA T4 16GB |
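The adapter settings in the table map onto a PEFT `LoraConfig` roughly as follows. This is a sketch of the Phase-1 configuration only, not the actual training script (optimizer, scheduler, and trainer arguments are omitted):

```python
from peft import LoraConfig

# Phase-1 adapter config mirroring the table above: rank 16, later
# compressed to rank 8 by the SVD Dream Pruning step.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```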

## 🚀 Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-1.5B", device_map="auto", torch_dtype="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5B")

# Note: the adapters live in the lora_adapters/ subfolder
model = PeftModel.from_pretrained(
    base_model,
    "dexmac/progressive-cognitive-dream-lora-en",
    subfolder="lora_adapters",
)

messages = [{"role": "user", "content": "Solve: 342 * 67"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# do_sample=True is required for temperature to take effect
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.1)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

**Expected output pattern:**
```
Step 1 - Intuition: in the order of tens of thousands
Step 2 - Routing: DELEGATE (medium complexity)
Step 3 - Tool: 22914
Step 4 - Validation: result 22914 consistent with estimate → VALID
```

## 📊 Full Comparison

| Model | Composite | Exact | Adversarial | Delegation | Magnitude | Safety |
|---|---|---|---|---|---|---|
| **1.5B Dream (this)** | **87.6** | 69% | 84% | 100% | 100% | 100% |
| 1.5B Flat | 79.2 | 57% | 81% | 79% | 100% | 100% |
| 3B Flat | 78.5 | 60% | 85% | 79% | 84% | 100% |
| 3B Dream | 66.0 | 56% | 34% | 93% | 100% | 59% |

## 🔗 Related Models

- [Flat LoRA (control)](https://huggingface.co/dexmac/progressive-cognitive-baseline-lora-en) — Same data, no phases, no pruning
- [3B Dream](https://huggingface.co/dexmac/progressive-cognitive-qwen3b-dream-lora) — Same architecture on Qwen2.5-3B
- [3B Flat](https://huggingface.co/dexmac/progressive-cognitive-qwen3b-baseline-lora) — 3B control
- [Italian Dream](https://huggingface.co/dexmac/progressive-cognitive-dream-lora) — Italian-language variant
- [Results Dataset](https://huggingface.co/datasets/dexmac/progressive-cognitive-results) — Raw evaluation data
- [GitHub](https://github.com/dexmac221/progressive-cognitive) — Full source code and evaluation framework

## 📝 Citation

```bibtex
@software{progressive_cognitive_2026,
  author  = {Dex Mac},
  title   = {Progressive Cognitive Architecture for LLMs},
  year    = {2026},
  url     = {https://github.com/dexmac221/progressive-cognitive},
  version = {1.0.0}
}
```

## 📄 License

Apache 2.0