# Progressive Cognitive Architecture: 3B Flat LoRA (English, Control)
Control model: Qwen2.5-3B fine-tuned on all training data in a single pass (no phases, no pruning). Serves as the 3B baseline for evaluating progressive training.
## Results
| Metric | Score |
|---|---|
| Composite Score | 78.5 |
| Exact Accuracy | 60.4% ± 7.5 |
| Adversarial Robustness | 84.7% ± 1.2 |
| Delegation Accuracy | 100.0% ± 0.0 |
| Delegation Rate | 58.7% ± 4.6 |
| Magnitude Sense (OoM ±1) | 84.0% ± 4.0 |
| Catastrophic Errors | 0.0% ± 0.0 |
Results: mean ± std over 3 seeds (42, 43, 44), 50 samples × 5 dimensions per seed.
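The mean ± std convention used in the table above can be sketched as follows. The per-seed values here are placeholders for illustration, not the actual run outputs:

```python
from statistics import mean, stdev

# Hypothetical per-seed exact-accuracy values (placeholders, NOT real results)
per_seed = {42: 0.55, 43: 0.62, 44: 0.64}

# Report mean ± sample standard deviation across the three seeds
scores = list(per_seed.values())
summary = f"{mean(scores) * 100:.1f}% ± {stdev(scores) * 100:.1f}"
print(summary)
```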
## Comparison: Flat vs Dream at 3B
| Metric | 3B Flat (this) | 3B Dream | Delta (Dream − Flat) |
|---|---|---|---|
| Composite | 78.5 | 66.0 | -12.5 |
| Adversarial | 84.7% | 34.0% | -50.7pp |
| Catastrophic | 0.0% | 41.3% | +41.3pp |
At 3B scale, flat training outperforms progressive Dream training, the inverse of the 1.5B result. This supports the hypothesis that SVD compression (rank 16 → 8) creates adapters too weak relative to the larger base model's weight space.
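The rank 16 → 8 SVD compression referenced above can be illustrated with a truncated SVD of a merged LoRA update matrix. This is an illustrative sketch, not the project's actual compression code; dimensions and the merge step are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, r_new = 64, 16, 8  # hypothetical hidden size, original rank, target rank

# A LoRA update is the low-rank product B @ A (shape d×d, rank ≤ 16)
A = rng.standard_normal((r, d))
B = rng.standard_normal((d, r))
delta_w = B @ A

# Truncated SVD keeps only the top-8 singular directions
U, S, Vt = np.linalg.svd(delta_w, full_matrices=False)
A_new = np.diag(S[:r_new]) @ Vt[:r_new]  # new down-projection, rank 8
B_new = U[:, :r_new]                     # new up-projection
delta_compressed = B_new @ A_new

print("rank before:", np.linalg.matrix_rank(delta_w))
print("rank after: ", np.linalg.matrix_rank(delta_compressed))
```

The truncated factors discard the smallest singular directions, which is where expressive capacity is lost: at 3B scale the remaining rank-8 update may be too small a correction relative to the base weights.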
## Training Configuration
| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen2.5-3B |
| LoRA Rank | 16 |
| LoRA Alpha | 32 |
| LoRA Targets | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Dropout | 0.05 |
| Training Data | ~6,000 English arithmetic examples (all mixed in one pass) |
| Hardware | NVIDIA T4 16GB |
## Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model and attach the LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-3B", device_map="auto", torch_dtype="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-3B")
model = PeftModel.from_pretrained(
    base_model,
    "dexmac/progressive-cognitive-qwen3b-baseline-lora",
)

# Format a single arithmetic prompt with the chat template and generate
messages = [{"role": "user", "content": "Calculate: 342 * 67"}]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, temperature=0.1)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Related Models
- 1.5B Dream LoRA – Best overall model
- 3B Dream LoRA – Progressive training on 3B
- 1.5B Flat LoRA – 1.5B control
- Results Dataset – Raw evaluation data
- GitHub – Full source code
## Citation
```bibtex
@software{progressive_cognitive_2026,
  author  = {Dex Mac},
  title   = {Progressive Cognitive Architecture for LLMs},
  year    = {2026},
  url     = {https://github.com/dexmac221/progressive-cognitive},
  version = {1.0.0}
}
```
## License
Apache 2.0