dexmac/progressive-cognitive-dream-lora

@@ -5,6 +5,7 @@ library_name: peft
 base_model: Qwen/Qwen2.5-1.5B
 tags:
   - lora
   - cognitive-architecture
   - progressive-learning
   - dream-pruning
@@ -18,66 +19,94 @@ datasets:
 pipeline_tag: text-generation
 ---
-# Progressive Cognitive Architecture — Dream-LoRA (Qwen 2.5 1.5B)
-**The model that develops mathematical intuition through a 4-phase cognitive curriculum + SVD Dream Pruning.**
-## 🧠 What is this?
-This is a LoRA adapter trained with the **Progressive Cognitive Architecture**, a bio-inspired training methodology that teaches LLMs to develop mathematical intuition rather than memorize answers. The training follows 4 cognitive phases:
-1. **Foundations** — Learn exact arithmetic (2,000 examples)
-2. **Consolidation** — SVD Dream Pruning compresses exact circuits into intuition (rank 16→8), then fine-tune on approximation (1,500 examples)
-3. **Delegation** — Learn when to delegate to a calculator tool vs compute internally (1,500 examples)
-4. **Orchestration** — Full pipeline: intuition → routing → tool → validation (1,000 examples)
-## 🔬 Dream Pruning (SVD Low-Rank Factorization)
-Instead of zeroing out small weights (magnitude pruning), Dream Pruning uses **SVD decomposition** to reduce the effective rank of LoRA matrices from 16 to 8, preserving the principal directions (the "logical connections") while discarding noise. Think of it as the model "sleeping" and consolidating its memories.
-## 📊 Results (Ablation Study: 100 tests × 3 seeds)
-| Metric | Dream-LoRA | Flat-LoRA | Base |
-|--------|-----------|-----------|------|
-| Exact Accuracy | 58.6% ± 2.9 | 60.6% ± 3.8 | 18.2% ± 2.9 |
-| Number Sense | 60.0% ± 0.8 | 0.0% | 57.0% ± 1.4 |
-| Metacognition (delegation) | **100.0%** | 0.0% | 84.9% |
-| Sensible Errors | 81.3% | — | — |
-**Key insight**: Flat-LoRA wins on raw accuracy but *destroys* number sense and metacognition. Dream-LoRA preserves both while achieving comparable accuracy.
-## 🚀 Quick Start
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 from peft import PeftModel
-base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B", device_map="auto")
 tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5B")
-# Load LoRA adapter (note: adapters are in lora_adapters/ subfolder)
-model = PeftModel.from_pretrained(base_model, "dexmac/progressive-cognitive-dream-lora", subfolder="lora_adapters")
-# Test it
-inputs = tokenizer("Calcola: 347 + 891 =", return_tensors="pt").to(model.device)
-outputs = model.generate(**inputs, max_new_tokens=20)
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```
-## ⚙️ Training Details
-- **Base model**: Qwen/Qwen2.5-1.5B (1.5B parameters, frozen)
-- **LoRA config**: rank=16, alpha=32, dropout=0.05, targets=q_proj, k_proj, v_proj, o_proj
-- **Dream Pruning**: SVD rank reduction 16→8 via QR+SVD decomposition
-- **Training data**: 6,000 synthetic math examples (Italian prompts)
-- **Hardware**: NVIDIA T4 (16GB VRAM) on Hugging Face Spaces
-- **Training time**: ~45 minutes
-## 📄 Paper & Code
-- **Article**: [What if AI Models Learned Like Humans Do?](https://medium.com/towards-artificial-intelligence/what-if-ai-models-learned-like-humans-do-c69c19f29d0c)
-- **GitHub**: [dexmac221/progressive-cognitive](https://github.com/dexmac221/progressive-cognitive)
-## 📜 License
 Apache 2.0

 base_model: Qwen/Qwen2.5-1.5B
 tags:
   - lora
+  - peft
   - cognitive-architecture
   - progressive-learning
   - dream-pruning
 pipeline_tag: text-generation
 ---
+# Progressive Cognitive Architecture: Dream-LoRA with SVD Pruning (Italian)
+**Main Italian model**: Qwen2.5-1.5B trained with a 4-phase progressive cognitive architecture + **SVD Dream Pruning** (rank 16→8).
+## 📊 Results
+| Metric | Dream-LoRA (this model) | Progressive-LoRA | Flat-LoRA |
+|--------|-------------------------|------------------|-----------|
+| Exact Accuracy | **58.6% ± 2.9** | 37.0% ± 0.5 | 60.6% |
+| Number Sense | **60.0% ± 0.8** | 57.7% ± 0.5 | 0.0% |
+| Metacognition | **100.0%** | 98.5% | 0.0% |
+Switching from magnitude pruning (Progressive-LoRA) to SVD Dream Pruning raised exact accuracy by 21.6 percentage points (37.0% → 58.6%) while preserving number sense and metacognition.
+## 🧠 Progressive Cognitive Architecture
+A bio-inspired 4-phase training methodology:
+| Phase | Name | What happens |
+|-------|------|-------------|
+| 1 | **Foundation** | Learn exact arithmetic via LoRA fine-tuning |
+| 2 | **Consolidation** | SVD Dream Pruning (rank 16→8) compresses knowledge into intuition |
+| 3 | **Delegation** | Learn complexity-aware routing: compute internally vs. delegate to tool |
+| 4 | **Orchestration** | Full pipeline: intuit → route → tool → validate |
+**Guiding Principle:** *Knowledge doesn't disappear — it collapses into attractors. Intuition is the compressed residue of experience.*
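The Phase 4 pipeline in the table above (intuit → route → tool → validate) can be pictured with a toy, model-free sketch. Everything here is a hypothetical stand-in: the function names, the "delegate multiplications or operands longer than two digits" rule, and the 50% tolerance are illustrative assumptions, not the trained model's behavior or the author's code.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def round_to_leading_digit(n: int) -> int:
    """Keep only the leading digit of an integer, e.g. 342 -> 300, 67 -> 70."""
    if n == 0:
        return 0
    return round(n, -(len(str(abs(n))) - 1))

def intuit(a: int, op: str, b: int) -> int:
    """Rough estimate, standing in for the model's 'number sense'."""
    return OPS[op](round_to_leading_digit(a), round_to_leading_digit(b))

def should_delegate(a: int, op: str, b: int, max_digits: int = 2) -> bool:
    """Hypothetical complexity rule: delegate products and long operands."""
    longest = max(len(str(abs(a))), len(str(abs(b))))
    return op == "*" or longest > max_digits

def validate(estimate: int, result: int, tolerance: float = 0.5) -> bool:
    """Accept the result only if it lies near the rough estimate."""
    return abs(result - estimate) <= tolerance * max(abs(result), 1)

def solve(a: int, op: str, b: int) -> dict:
    estimate = intuit(a, op, b)
    route = "tool" if should_delegate(a, op, b) else "internal"
    # Exact arithmetic stands in for both the calculator tool and the
    # model's own internal answer in this toy version.
    result = OPS[op](a, b)
    return {"estimate": estimate, "route": route,
            "result": result, "valid": validate(estimate, result)}

print(solve(342, "*", 67))
print(solve(12, "+", 7))
```

The point of the validation step is that a rough estimate is cheap to produce and catches tool results that are off by an order of magnitude.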
+## 🌙 Dream Pruning (SVD Low-Rank Factorization)
+Instead of zeroing out small weights, Dream Pruning uses **SVD decomposition** to reduce the effective rank of the LoRA matrices from 16 to 8. It preserves the principal directions (the "logical connections") and discards noise, analogous to memory consolidation during sleep.
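As a minimal sketch of the rank reduction described above, the following NumPy code truncates a LoRA update `B @ A` from rank 16 to rank 8 using QR factorizations followed by an SVD of the small core matrix. It is an assumption about how such a step could be implemented, not the repository's code: the function name `svd_rank_reduce`, the matrix shapes, and the choice to split the singular values evenly between the two factors are all illustrative.

```python
import numpy as np

def svd_rank_reduce(B: np.ndarray, A: np.ndarray, new_rank: int):
    """Return (B_new, A_new) so that B_new @ A_new is the best
    rank-`new_rank` approximation of B @ A in the Frobenius norm.

    B has shape (d_out, r) and A has shape (r, d_in).
    """
    # Orthogonalize each factor: B = Q_b R_b and A.T = Q_a R_a,
    # hence B @ A = Q_b (R_b R_a^T) Q_a^T with a small r x r core.
    Q_b, R_b = np.linalg.qr(B)
    Q_a, R_a = np.linalg.qr(A.T)
    U, S, Vt = np.linalg.svd(R_b @ R_a.T)

    # Keep the top singular directions and drop the rest.
    U_k, S_k, Vt_k = U[:, :new_rank], S[:new_rank], Vt[:new_rank, :]
    sqrt_S = np.sqrt(S_k)
    B_new = (Q_b @ U_k) * sqrt_S                 # (d_out, new_rank)
    A_new = (sqrt_S[:, None] * Vt_k) @ Q_a.T     # (new_rank, d_in)
    return B_new, A_new

# Random stand-ins for one adapted projection (shapes are arbitrary).
rng = np.random.default_rng(0)
d_out, d_in, r, k = 64, 48, 16, 8
B = rng.normal(size=(d_out, r))
A = rng.normal(size=(r, d_in))

B8, A8 = svd_rank_reduce(B, A, k)
delta_full = B @ A
delta_low = B8 @ A8
print(B8.shape, A8.shape, np.linalg.matrix_rank(delta_low))
```

Working on the 16×16 core instead of the full weight-sized product keeps the SVD cheap. One detail the card does not state is how the adapter scaling is handled afterwards: PEFT scales the update by `lora_alpha / r` by default, so changing `r` while keeping `lora_alpha` fixed changes that factor unless it is compensated.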
+## 🔧 Configuration
+| Parameter | Value |
+|-----------|-------|
+| Base Model | Qwen/Qwen2.5-1.5B |
+| LoRA Rank | 16 (→ 8 after SVD) |
+| LoRA Alpha | 32 |
+| LoRA Targets | q_proj, k_proj, v_proj, o_proj |
+| Pruning Type | SVD Low-Rank Factorization |
+| Data Language | Italian |
+## 🚀 Quick Start
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 from peft import PeftModel
+base_model = AutoModelForCausalLM.from_pretrained(
+    "Qwen/Qwen2.5-1.5B", device_map="auto", torch_dtype="auto"
+)
 tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5B")
+model = PeftModel.from_pretrained(
+    base_model,
+    "dexmac/progressive-cognitive-dream-lora",
+    subfolder="lora_adapters"
+)
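+# The adapter was trained on Italian prompts; "Risolvi" means "Solve".
+messages = [{"role": "user", "content": "Risolvi: 342 * 67"}]
+text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+inputs = tokenizer(text, return_tensors="pt").to(model.device)
+# temperature only takes effect when sampling is enabled
+outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.1)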
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```
+## 🔗 Related Models
+- [Progressive-LoRA (IT)](https://huggingface.co/dexmac/progressive-cognitive-lora): first prototype, with magnitude pruning
+- [Flat-LoRA (IT)](https://huggingface.co/dexmac/progressive-cognitive-baseline-lora): control without phases
+- [**1.5B Dream (EN)**](https://huggingface.co/dexmac/progressive-cognitive-dream-lora-en): best model (English, composite score 87.6)
+- [GitHub](https://github.com/dexmac221/progressive-cognitive): full source code
+## 📝 Citation
+```bibtex
+@software{progressive_cognitive_2026,
+  author = {Dex Mac},
+  title = {Progressive Cognitive Architecture for LLMs},
+  year = {2026},
+  url = {https://github.com/dexmac221/progressive-cognitive},
+  version = {1.0.0}
+}
+```
+## 📄 License
 Apache 2.0