---
language: en
license: apache-2.0
library_name: peft
base_model: Qwen/Qwen2.5-1.5B
tags:
- lora
- peft
- cognitive-architecture
- progressive-learning
- dream-pruning
- svd
- math
- arithmetic
- number-sense
- tool-use
- metacognition
datasets:
- custom
pipeline_tag: text-generation
model-index:
- name: progressive-cognitive-dream-lora-en
  results:
  - task:
      type: text-generation
      name: Cognitive Arithmetic
    metrics:
    - type: exact_accuracy
      value: 69.4
      name: Exact Accuracy (%)
    - type: composite_score
      value: 87.6
      name: Composite Cognitive Score
---

# Progressive Cognitive Architecture — 1.5B Dream LoRA (English)

**🏆 Best overall model (composite 87.6/100)** — Qwen2.5-1.5B fine-tuned with 4-phase progressive training + SVD Dream Pruning.

## ✨ Highlights

| Metric | Score |
|--------|-------|
| **Composite Score** | **87.6** |
| Exact Accuracy | 69.4% ± 6.4 |
| Adversarial Robustness | 84.0% ± 8.0 |
| Delegation Accuracy | 100.0% ± 0.0 |
| Delegation Rate | 100.0% ± 0.0 |
| Magnitude Sense (OoM ± 1) | 100.0% ± 0.0 |
| Catastrophic Errors | **0.0% ± 0.0** |

> Results: mean ± std over 3 seeds (42, 43, 44), 50 samples × 5 dimensions per seed.

## 🔑 Key Findings

- **Outperforms all 3B variants** despite having half the parameters
- **Zero catastrophic errors** — never produces absurd results
- **100% delegation** — always routes complex operations to tools
- Dream pruning acts as **cognitive regularization** for capacity-constrained models

## 🧠 Progressive Cognitive Architecture

A bio-inspired 4-phase training methodology:

| Phase | Name | What happens |
|-------|------|--------------|
| 1 | **Foundation** | Learn exact arithmetic via LoRA fine-tuning |
| 2 | **Consolidation** | SVD Dream Pruning (rank 16 → 8) compresses knowledge into intuition |
| 3 | **Delegation** | Learn complexity-aware routing: compute internally vs. delegate to a tool |
| 4 | **Orchestration** | Full pipeline: intuit → route → tool → validate |

**Guiding Principle:** *Knowledge doesn't disappear — it collapses into attractors. Intuition is the compressed residue of experience.*
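The Phase-4 pipeline above can be sketched as a plain loop. This is a hypothetical illustration only: `intuit`, `route`, the magnitude check, and the routing threshold are all invented here for clarity, not the repository's actual API.

```python
import math

def intuit(a: int, b: int) -> int:
    """Step 1: order-of-magnitude estimate for a * b (the 'intuition')."""
    return int(math.log10(abs(a * b)))

def route(a: int, b: int) -> str:
    """Step 2: complexity-aware routing; easy products in-head, the rest delegated."""
    return "INTERNAL" if abs(a) < 100 and abs(b) < 100 else "DELEGATE"

def orchestrate(a: int, b: int):
    estimate = intuit(a, b)
    decision = route(a, b)
    result = a * b  # Step 3: the delegated tool call (exact arithmetic)
    # Step 4: validation; accept only results within one order of magnitude
    # of the initial estimate.
    valid = abs(int(math.log10(abs(result))) - estimate) <= 1
    return decision, result, valid

print(orchestrate(342, 67))  # ('DELEGATE', 22914, True)
```

The point of the validation step is that even when the tool is trusted, the model's own magnitude estimate acts as a sanity check against absurd results.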

## 🌙 Dream Pruning (SVD Low-Rank Factorization)

Instead of zeroing out small weights (magnitude pruning), Dream Pruning uses an **SVD decomposition** to reduce the effective rank of the LoRA matrices from 16 to 8. This preserves the principal directions (the "logical connections") while discarding noise — analogous to memory consolidation during sleep.

```
W = U·Σ·V^T  →  W' = U[:,:k] · Σ[:k,:k] · V^T[:k,:]   (k = 8)
```
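In code, the consolidation step amounts to a truncated SVD of the merged LoRA update. A minimal NumPy sketch, assuming a merged update `W = B @ A`; the function name and shapes are illustrative, not the released training code:

```python
import numpy as np

def dream_prune(W: np.ndarray, k: int = 8) -> np.ndarray:
    """Truncate W to its top-k singular directions (illustrative sketch)."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    # Keep only the k principal directions; discard the noisy tail.
    return U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]

# A rank-16 LoRA update W = B @ A collapses to an exact rank-8 matrix.
rng = np.random.default_rng(0)
B, A = rng.normal(size=(256, 16)), rng.normal(size=(16, 256))
W_pruned = dream_prune(B @ A, k=8)
print(np.linalg.matrix_rank(W_pruned))  # 8
```

Because the truncation keeps the largest singular values, it is the best rank-8 approximation of the original update in the least-squares sense.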

## 🔧 Training Configuration

| Parameter | Value |
|-----------|-------|
| Base Model | Qwen/Qwen2.5-1.5B |
| LoRA Rank | 16 (→ 8 after SVD) |
| LoRA Alpha | 32 |
| LoRA Targets | q_proj, k_proj, v_proj, o_proj |
| Dropout | 0.05 |
| Training Data | ~6,000 English arithmetic examples |
| Hardware | NVIDIA T4 16GB |
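The adapter settings in the table map onto a PEFT `LoraConfig` roughly as follows. This is a sketch of the Phase-1 configuration only, not the actual training script (optimizer, scheduler, and trainer arguments are omitted):

```python
from peft import LoraConfig

# Phase-1 adapter config mirroring the table above: rank 16, later
# compressed to rank 8 by the SVD Dream Pruning step.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```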

## 🚀 Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-1.5B", device_map="auto", torch_dtype="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5B")

# Note: the adapters live in the lora_adapters/ subfolder
model = PeftModel.from_pretrained(
    base_model,
    "dexmac/progressive-cognitive-dream-lora-en",
    subfolder="lora_adapters",
)

messages = [{"role": "user", "content": "Solve: 342 * 67"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# do_sample=True is required for temperature to take effect
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.1)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

**Expected output pattern:**
```
Step 1 - Intuition: in the order of tens of thousands
Step 2 - Routing: DELEGATE (medium complexity)
Step 3 - Tool: 22914
Step 4 - Validation: result 22914 consistent with estimate → VALID
```

## 📊 Full Comparison

| Model | Composite | Exact | Adversarial | Delegation | Magnitude | Safety |
|---|---|---|---|---|---|---|
| **1.5B Dream (this)** | **87.6** | 69% | 84% | 100% | 100% | 100% |
| 1.5B Flat | 79.2 | 57% | 81% | 79% | 100% | 100% |
| 3B Flat | 78.5 | 60% | 85% | 79% | 84% | 100% |
| 3B Dream | 66.0 | 56% | 34% | 93% | 100% | 59% |

## 🔗 Related Models

- [Flat LoRA (control)](https://huggingface.co/dexmac/progressive-cognitive-baseline-lora-en) — Same data, no phases, no pruning
- [3B Dream](https://huggingface.co/dexmac/progressive-cognitive-qwen3b-dream-lora) — Same architecture on Qwen2.5-3B
- [3B Flat](https://huggingface.co/dexmac/progressive-cognitive-qwen3b-baseline-lora) — 3B control
- [Italian Dream](https://huggingface.co/dexmac/progressive-cognitive-dream-lora) — Italian-language variant
- [Results Dataset](https://huggingface.co/datasets/dexmac/progressive-cognitive-results) — Raw evaluation data
- [GitHub](https://github.com/dexmac221/progressive-cognitive) — Full source code and evaluation framework

## 📝 Citation

```bibtex
@software{progressive_cognitive_2026,
  author  = {Dex Mac},
  title   = {Progressive Cognitive Architecture for LLMs},
  year    = {2026},
  url     = {https://github.com/dexmac221/progressive-cognitive},
  version = {1.0.0}
}
```

## 📄 License

Apache 2.0