---
language: en
license: apache-2.0
library_name: peft
base_model: Qwen/Qwen2.5-1.5B
tags:
- lora
- peft
- logic
- math
- reasoning
- monolithic
- cognitive-architecture
datasets:
- custom
pipeline_tag: text-generation
---

# Progressive Cognitive Architecture - Monolithic Math+Logic LoRA (English)

A single-adapter baseline that combines arithmetic and logic capabilities in one Qwen2.5-1.5B LoRA.

## Summary

This repository contains the monolithic comparison model used in the Socratic Routing study. Unlike the routed setup, this model keeps math and logic adaptation in a single adapter rather than distributing it across specialized components.

## Observed Behavior

Across the two completed seeds currently available for the mixed Socratic benchmark, this monolithic model achieved:

- 2-seed overall mean: 56.9%
- 2-seed logic composite mean: 55.8%
- 2-seed math composite mean: 58.1%

This makes it a balanced baseline: clearly stronger than the raw 1.5B base model, but weaker than the specialist-routed setup on the strongest completed routed run.

## Intended Use

- Compact mixed-reasoning baseline
- Comparison point against specialist and routed systems
- Research on tradeoffs between monolithic and distributed adaptation

## Limitations

- Does not match the math specialist on arithmetic-heavy tasks
- Does not match the logic specialist on logic-focused tasks
- Provides a balanced compromise rather than a best-in-class result on either subdomain

## Loading

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the Qwen2.5-1.5B base model and its tokenizer.
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-1.5B", device_map="auto", torch_dtype="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5B")

# Attach the monolithic math+logic LoRA adapter.
model = PeftModel.from_pretrained(
    base_model,
    "dexmac/progressive-cognitive-logic-dream-lora-en",
    subfolder="lora_adapters"
)
```

## Related Repositories

- Logic specialist: https://huggingface.co/dexmac/progressive-cognitive-logic-specialist-en
- Math specialist: https://huggingface.co/dexmac/progressive-cognitive-dream-lora-en
- Router model: https://huggingface.co/dexmac/progressive-cognitive-router-en
- Results dataset: https://huggingface.co/datasets/dexmac/progressive-cognitive-results

## License

Apache 2.0