---
language: en
license: apache-2.0
library_name: peft
base_model: Qwen/Qwen2.5-1.5B
tags:
- lora
- peft
- baseline
- flat-training
- math
- arithmetic
- control-group
datasets:
- custom
pipeline_tag: text-generation
model-index:
- name: progressive-cognitive-baseline-lora-en
results:
- task:
type: text-generation
name: Cognitive Arithmetic
metrics:
- type: exact_accuracy
value: 56.9
name: Exact Accuracy (%)
- type: composite_score
value: 79.2
name: Composite Cognitive Score
---
# Progressive Cognitive Architecture β€” 1.5B Flat LoRA (English, Control)
**Control model** β€” Qwen2.5-1.5B fine-tuned with all training data in a single pass (no phases, no pruning). Serves as the baseline for evaluating progressive training.
## πŸ“Š Results
| Metric | Score |
|--------|-------|
| **Composite Score** | **79.2** |
| Exact Accuracy | 56.9% Β± 6.4 |
| Adversarial Robustness | 81.3% Β± 2.3 |
| Delegation Accuracy | 100.0% Β± 0.0 |
| Delegation Rate | 58.7% Β± 4.6 |
| Magnitude Sense (OoMΒ±1) | 100.0% Β± 0.0 |
| Catastrophic Errors | **0.0% Β± 0.0** |
> Results: mean Β± std over 3 seeds (42, 43, 44), 50 samples Γ— 5 dimensions per seed.
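To make the aggregation above concrete: each reported metric is the mean over the three seeds with its standard deviation. The sketch below uses hypothetical per-seed scores (not the actual per-seed results, which live in the linked results dataset), and assumes the sample standard deviation; whether the card uses sample or population std is not stated.

```python
import statistics

# Hypothetical per-seed exact-accuracy scores (%) for seeds 42, 43, 44
seed_scores = [54.0, 56.0, 58.0]

mean = statistics.mean(seed_scores)   # 56.0
std = statistics.stdev(seed_scores)   # sample std (n-1) over the 3 seeds -> 2.0

print(f"{mean:.1f}% +/- {std:.1f}")   # reported in the table as "mean +/- std"
```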
## βš–οΈ Comparison with Dream LoRA
| Metric | Flat (this) | Dream | Δ (Dream - Flat) |
|--------|-------------|-------|------------------|
| Composite | 79.2 | **87.6** | +8.4 |
| Exact Accuracy | 56.9% | **69.4%** | +12.5pp |
| Delegation Rate | 58.7% | **100.0%** | +41.3pp |
| Number Sense | 6.7% | **60.7%** | +54.0pp |
The Dream model delegates far more often and shows much stronger number sense, suggesting that progressive training plus SVD pruning adds cognitive capabilities that flat training on the same data does not.
## πŸ”§ Training Configuration
| Parameter | Value |
|-----------|-------|
| Base Model | Qwen/Qwen2.5-1.5B |
| LoRA Rank | 16 |
| LoRA Alpha | 32 |
| LoRA Targets | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Dropout | 0.05 |
| Training Data | ~6,000 English arithmetic examples (all mixed in one pass) |
| Hardware | NVIDIA T4 16GB |
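The hyperparameters above map onto a PEFT `LoraConfig` as sketched below. The exact training script is in the linked GitHub repo; the `task_type` and `bias` values here are assumptions (standard defaults for causal-LM LoRA), as only rank, alpha, dropout, and target modules are stated in the table.

```python
from peft import LoraConfig

# LoRA hyperparameters from the training configuration table.
# task_type and bias are assumed defaults, not taken from the card.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
    bias="none",
)
```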
## πŸš€ Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model and attach the LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-1.5B", device_map="auto", torch_dtype="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5B")
model = PeftModel.from_pretrained(
    base_model,
    "dexmac/progressive-cognitive-baseline-lora-en",
)

# Build the prompt with the tokenizer's chat template
messages = [{"role": "user", "content": "Calculate: 342 * 67"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# do_sample=True is required for temperature to take effect
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.1)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## πŸ”— Related Models
- [**1.5B Dream LoRA**](https://huggingface.co/dexmac/progressive-cognitive-dream-lora-en) β€” Progressive training + Dream Pruning (best model)
- [3B Flat](https://huggingface.co/dexmac/progressive-cognitive-qwen3b-baseline-lora) β€” Same approach on Qwen2.5-3B
- [Results Dataset](https://huggingface.co/datasets/dexmac/progressive-cognitive-results) β€” Raw evaluation data
- [GitHub](https://github.com/dexmac221/progressive-cognitive) β€” Full source code
## πŸ“ Citation
```bibtex
@software{progressive_cognitive_2026,
author = {Dex Mac},
title = {Progressive Cognitive Architecture for LLMs},
year = {2026},
url = {https://github.com/dexmac221/progressive-cognitive},
version = {1.0.0}
}
```
## πŸ“„ License
Apache 2.0