dexmac committed on
Commit dee857b · verified · 1 Parent(s): fd2e95c

Upload README.md with huggingface_hub

Files changed (1): README.md (+65 −36)
README.md CHANGED
@@ -5,6 +5,7 @@ library_name: peft
 base_model: Qwen/Qwen2.5-1.5B
 tags:
 - lora
 - cognitive-architecture
 - progressive-learning
 - dream-pruning
@@ -18,66 +19,94 @@ datasets:
 pipeline_tag: text-generation
 ---

- # Progressive Cognitive Architecture — Dream-LoRA (Qwen 2.5 1.5B)

- **The model that develops mathematical intuition through a 4-phase cognitive curriculum + SVD Dream Pruning.**

- ## 🧠 What is this?

- This is a LoRA adapter trained with the **Progressive Cognitive Architecture**, a bio-inspired training methodology that teaches LLMs to develop mathematical intuition rather than memorize answers. The training follows 4 cognitive phases:

- 1. **Foundations** — Learn exact arithmetic (2,000 examples)
- 2. **Consolidation** — SVD Dream Pruning compresses exact circuits into intuition (rank 16→8), then fine-tune on approximation (1,500 examples)
- 3. **Delegation** — Learn when to delegate to a calculator tool vs. compute internally (1,500 examples)
- 4. **Orchestration** — Full pipeline: intuition → routing → tool → validation (1,000 examples)

- ## 🔬 Dream Pruning (SVD Low-Rank Factorization)

- Instead of zeroing out small weights (magnitude pruning), Dream Pruning uses **SVD decomposition** to reduce the effective rank of LoRA matrices from 16 to 8, preserving the principal directions (the "logical connections") while discarding noise. Think of it as the model "sleeping" and consolidating its memories.

- ## 📊 Results (Ablation Study: 100 tests × 3 seeds)

- | Metric | Dream-LoRA | Flat-LoRA | Base |
- |--------|-----------|-----------|------|
- | Exact Accuracy | 58.6% ± 2.9 | 60.6% ± 3.8 | 18.2% ± 2.9 |
- | Number Sense | 60.0% ± 0.8 | 0.0% | 57.0% ± 1.4 |
- | Metacognition (delegation) | **100.0%** | 0.0% | 84.9% |
- | Sensible Errors | 81.3% | — | — |

- **Key insight**: Flat-LoRA wins on raw accuracy but *destroys* number sense and metacognition. Dream-LoRA preserves both while achieving comparable accuracy.

- ## 🚀 Quick Start

 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 from peft import PeftModel

- base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B", device_map="auto")
 tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5B")

- # Load LoRA adapter (note: adapters are in lora_adapters/ subfolder)
- model = PeftModel.from_pretrained(base_model, "dexmac/progressive-cognitive-dream-lora", subfolder="lora_adapters")

- # Test it
- inputs = tokenizer("Calcola: 347 + 891 =", return_tensors="pt").to(model.device)
- outputs = model.generate(**inputs, max_new_tokens=20)
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```

- ## ⚙️ Training Details

- - **Base model**: Qwen/Qwen2.5-1.5B (1.5B parameters, frozen)
- - **LoRA config**: rank=16, alpha=32, dropout=0.05, targets=q_proj, k_proj, v_proj, o_proj
- - **Dream Pruning**: SVD rank reduction 16→8 via QR+SVD decomposition
- - **Training data**: 6,000 synthetic math examples (Italian prompts)
- - **Hardware**: NVIDIA T4 (16GB VRAM) on Hugging Face Spaces
- - **Training time**: ~45 minutes

- ## 📄 Paper & Code

- - **Article**: [What if AI Models Learned Like Humans Do?](https://medium.com/towards-artificial-intelligence/what-if-ai-models-learned-like-humans-do-c69c19f29d0c)
- - **GitHub**: [dexmac221/progressive-cognitive](https://github.com/dexmac221/progressive-cognitive)

- ## 📜 License

 Apache 2.0
 
 base_model: Qwen/Qwen2.5-1.5B
 tags:
 - lora
+ - peft
 - cognitive-architecture
 - progressive-learning
 - dream-pruning
 
 pipeline_tag: text-generation
 ---

+ # Progressive Cognitive Architecture — Dream-LoRA with SVD Pruning (Italian)

+ **Main Italian model** — Qwen2.5-1.5B trained with the 4-phase progressive cognitive architecture + **SVD Dream Pruning** (rank 16→8).

+ ## 📊 Results

+ | Metric | Dream-LoRA (this model) | Progressive-LoRA | Flat-LoRA |
+ |--------|-------------------------|------------------|-----------|
+ | Exact Accuracy | **58.6% ± 2.9** | 37.0% ± 0.5 | 60.6% |
+ | Number Sense | **60.0% ± 0.8** | 57.7% ± 0.5 | 0.0% |
+ | Metacognition | **100.0%** | 98.5% | 0.0% |

+ Switching from magnitude pruning to SVD Dream Pruning significantly improved exact accuracy (+21.6pp) while preserving number sense and metacognition.
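The +21.6pp figure follows directly from the table: it is the exact-accuracy gap between this model and the magnitude-pruned Progressive-LoRA.

```python
dream_lora = 58.6        # exact accuracy, SVD Dream Pruning (this model)
progressive_lora = 37.0  # exact accuracy, magnitude-pruning prototype
gain_pp = round(dream_lora - progressive_lora, 1)
print(gain_pp)  # -> 21.6
```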
 
 
 
+ ## 🧠 Progressive Cognitive Architecture

+ A bio-inspired 4-phase training methodology:

+ | Phase | Name | What happens |
+ |-------|------|--------------|
+ | 1 | **Foundation** | Learn exact arithmetic via LoRA fine-tuning |
+ | 2 | **Consolidation** | SVD Dream Pruning (rank 16→8) compresses knowledge into intuition |
+ | 3 | **Delegation** | Learn complexity-aware routing: compute internally vs. delegate to a tool |
+ | 4 | **Orchestration** | Full pipeline: intuit → route → tool → validate |

+ **Guiding Principle:** *Knowledge doesn't disappear — it collapses into attractors. Intuition is the compressed residue of experience.*
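The phase schedule can be restated as a driver loop. This is only an illustrative sketch: `train_phase` and `dream_prune_adapter` are hypothetical stand-ins for the real training and pruning code, and the per-phase example counts (2,000 / 1,500 / 1,500 / 1,000) are taken from the English card this commit replaces.

```python
# Hypothetical sketch of the 4-phase curriculum; function names are
# illustrative, example counts come from the card (6,000 total).
PHASES = [
    ("foundation",    2000),  # exact arithmetic
    ("consolidation", 1500),  # approximation, trained after SVD pruning
    ("delegation",    1500),  # compute internally vs. call a calculator tool
    ("orchestration", 1000),  # intuit -> route -> tool -> validate
]

def run_curriculum(train_phase, dream_prune_adapter):
    """Drive the phases in order; prune ("sleep") before consolidation."""
    for name, n_examples in PHASES:
        if name == "consolidation":
            dream_prune_adapter(new_rank=8)  # rank 16 -> 8
        train_phase(name, n_examples)
```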
+ ## 🌙 Dream Pruning (SVD Low-Rank Factorization)
+
+ Instead of zeroing out small weights, Dream Pruning uses **SVD decomposition** to reduce the effective rank of the LoRA matrices from 16 to 8. It preserves the principal directions (the "logical connections") while discarding noise — analogous to memory consolidation during sleep.
+
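The training details mention a QR+SVD route for the rank reduction. Below is a minimal NumPy sketch of that step, under the assumption that it operates on a LoRA pair (B, A) whose product B @ A is the weight update; `dream_prune` is an illustrative name, not the repository's API.

```python
import numpy as np

def dream_prune(B, A, new_rank=8):
    """Truncate the LoRA update B @ A (B: d_out x r, A: r x d_in) to
    rank `new_rank` via QR + SVD, keeping only the principal directions.
    The full d_out x d_in update matrix is never formed."""
    Qb, Rb = np.linalg.qr(B)             # B = Qb @ Rb
    Qa, Ra = np.linalg.qr(A.T)           # A = Ra.T @ Qa.T
    U, S, Vt = np.linalg.svd(Rb @ Ra.T)  # small r x r SVD of the core
    k = new_rank
    sqrt_S = np.sqrt(S[:k])
    B_new = (Qb @ U[:, :k]) * sqrt_S           # (d_out, k)
    A_new = (sqrt_S[:, None] * Vt[:k]) @ Qa.T  # (k, d_in)
    return B_new, A_new
```

Because the SVD is computed on the small r × r core, the cost scales with the LoRA rank rather than the layer dimensions, and B_new @ A_new is the optimal rank-k approximation of the original update.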
+ ## 🔧 Configuration
+
+ | Parameter | Value |
+ |-----------|-------|
+ | Base Model | Qwen/Qwen2.5-1.5B |
+ | LoRA Rank | 16 (→ 8 after SVD) |
+ | LoRA Alpha | 32 |
+ | LoRA Targets | q_proj, k_proj, v_proj, o_proj |
+ | Pruning Type | SVD Low-Rank Factorization |
+ | Data Language | Italian |
+
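The table maps directly onto a peft `LoraConfig`; a sketch, with the dropout value taken from the earlier card's training details and `task_type` an assumption (causal-LM fine-tuning):

```python
from peft import LoraConfig

# Sketch of the card's LoRA setup; dropout comes from the earlier
# training details, task_type is assumed.
lora_config = LoraConfig(
    r=16,  # pre-pruning rank; Dream Pruning later reduces it to 8
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```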
+ ## 🚀 Quick Start

 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 from peft import PeftModel

+ base_model = AutoModelForCausalLM.from_pretrained(
+     "Qwen/Qwen2.5-1.5B", device_map="auto", torch_dtype="auto"
+ )
 tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5B")

+ model = PeftModel.from_pretrained(
+     base_model,
+     "dexmac/progressive-cognitive-dream-lora",
+     subfolder="lora_adapters",
+ )

+ messages = [{"role": "user", "content": "Risolvi: 342 * 67"}]
+ text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+ inputs = tokenizer(text, return_tensors="pt").to(model.device)
+ # do_sample=True so the temperature setting actually takes effect
+ outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.1)
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```
+ ## 🔗 Related Models

+ - [Progressive-LoRA (IT)](https://huggingface.co/dexmac/progressive-cognitive-lora) — First prototype, with magnitude pruning
+ - [Flat-LoRA (IT)](https://huggingface.co/dexmac/progressive-cognitive-baseline-lora) — Control run without the phased curriculum
+ - [**1.5B Dream (EN)**](https://huggingface.co/dexmac/progressive-cognitive-dream-lora-en) — Best model (English, composite 87.6)
+ - [GitHub](https://github.com/dexmac221/progressive-cognitive) — Full source code
+ ## 📝 Citation
+
+ ```bibtex
+ @software{progressive_cognitive_2026,
+   author  = {Dex Mac},
+   title   = {Progressive Cognitive Architecture for LLMs},
+   year    = {2026},
+   url     = {https://github.com/dexmac221/progressive-cognitive},
+   version = {1.0.0}
+ }
+ ```
 
 
 
+ ## 📄 License

 Apache 2.0
+