---
language: en
license: apache-2.0
library_name: peft
base_model: Qwen/Qwen2.5-3B
tags:
  - lora
  - peft
  - baseline
  - flat-training
  - math
  - arithmetic
  - control-group
datasets:
  - custom
pipeline_tag: text-generation
model-index:
  - name: progressive-cognitive-qwen3b-baseline-lora
    results:
      - task:
          type: text-generation
          name: Cognitive Arithmetic
        metrics:
          - type: exact_accuracy
            value: 60.4
            name: Exact Accuracy (%)
          - type: composite_score
            value: 78.5
            name: Composite Cognitive Score
---

# Progressive Cognitive Architecture: 3B Flat LoRA (English, Control)

Control model: Qwen2.5-3B fine-tuned with all training data in a single pass (no phases, no pruning). Serves as the 3B baseline for evaluating progressive training. A sketch of what this "flat" setup means is shown below.
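
For reference, "flat" training here means merging every example into one shuffled dataset before a single fine-tuning pass, instead of training phase by phase. A minimal sketch using the 🤗 `datasets` library; the `phase_*.jsonl` file names are placeholders, not the actual training files:

```python
from datasets import load_dataset, concatenate_datasets

# Hypothetical per-phase files; the real training data is not published in this card.
phase_files = ["phase1.jsonl", "phase2.jsonl", "phase3.jsonl"]
phases = [load_dataset("json", data_files=f, split="train") for f in phase_files]

# Flat baseline: merge all phases and shuffle, so the model sees everything
# in one pass with no curriculum ordering and no pruning between stages.
flat_dataset = concatenate_datasets(phases).shuffle(seed=42)
```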

## 📊 Results

| Metric | Score |
|---|---|
| Composite Score | 78.5 |
| Exact Accuracy | 60.4% ± 7.5 |
| Adversarial Robustness | 84.7% ± 1.2 |
| Delegation Accuracy | 100.0% ± 0.0 |
| Delegation Rate | 58.7% ± 4.6 |
| Magnitude Sense (OoM ±1) | 84.0% ± 4.0 |
| Catastrophic Errors | 0.0% ± 0.0 |

Results: mean ± std over 3 seeds (42, 43, 44), 50 samples × 5 dimensions per seed.
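
In other words, each reported number is the mean and standard deviation of three per-seed scores. A tiny illustrative sketch (the per-seed values below are made up; whether the reported std is sample or population is not stated in this card):

```python
import numpy as np

# Illustrative per-seed exact-accuracy scores, one per seed 42/43/44.
per_seed = np.array([52.0, 64.2, 65.0])

mean = per_seed.mean()
std = per_seed.std(ddof=1)  # sample std; use ddof=0 if population std was intended
print(f"Exact Accuracy: {mean:.1f}% ± {std:.1f}")
```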

βš–οΈ Comparison: Flat vs Dream at 3B

| Metric | 3B Flat (this model) | 3B Dream | Delta (Dream - Flat) |
|---|---|---|---|
| Composite | 78.5 | 66.0 | -12.5 |
| Adversarial | 84.7% | 34.0% | -50.7 pp |
| Catastrophic | 0.0% | 41.3% | +41.3 pp |

At 3B scale, flat training outperforms progressive Dream training, the inverse of the 1.5B result. This supports the hypothesis that SVD compression (rank 16 → 8) creates adapters that are too weak relative to the larger base model's weight space.
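
To make the rank 16 → 8 point concrete: compressing a LoRA update means taking the truncated SVD of the full update ΔW = B·A and keeping only the top 8 singular directions, which caps how much of the learned update survives. A rough sketch of that operation; the shapes and the routine below are illustrative, not the project's actual pruning code:

```python
import torch

def svd_compress_lora(A: torch.Tensor, B: torch.Tensor, new_rank: int = 8):
    """Compress a LoRA pair (B @ A has rank <= 16) down to `new_rank`."""
    delta_w = B @ A                          # (out_features, in_features) full update
    U, S, Vh = torch.linalg.svd(delta_w, full_matrices=False)
    U, S, Vh = U[:, :new_rank], S[:new_rank], Vh[:new_rank, :]
    # Split the truncated update back into a low-rank pair.
    B_new = U * S.sqrt()                     # (out_features, new_rank)
    A_new = S.sqrt().unsqueeze(1) * Vh       # (new_rank, in_features)
    return A_new, B_new

# Example: a rank-16 adapter on a 2048-dim projection, squeezed to rank 8.
A = torch.randn(16, 2048) * 0.01
B = torch.randn(2048, 16) * 0.01
A8, B8 = svd_compress_lora(A, B, new_rank=8)
```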

## 🔧 Training Configuration

| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen2.5-3B |
| LoRA Rank | 16 |
| LoRA Alpha | 32 |
| LoRA Targets | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Dropout | 0.05 |
| Training Data | ~6,000 English arithmetic examples (all mixed in one pass) |
| Hardware | NVIDIA T4 16GB |
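
The table above maps directly onto a standard PEFT `LoraConfig`. A minimal sketch of how that configuration could be expressed; optimizer settings, learning rate, and epochs are not listed in this card and are therefore omitted:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-3B", torch_dtype="auto")
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only a small fraction of the 3B base weights train
```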

## 🚀 Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the Qwen2.5-3B base model and tokenizer.
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-3B", device_map="auto", torch_dtype="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-3B")

# Attach the LoRA adapter from this repository.
model = PeftModel.from_pretrained(
    base_model,
    "dexmac/progressive-cognitive-qwen3b-baseline-lora"
)

messages = [{"role": "user", "content": "Calculate: 342 * 67"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# do_sample=True is needed for temperature to take effect; drop both for greedy decoding.
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.1)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
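
If adapter overhead matters at inference time, the LoRA weights can also be folded into the base model. This is standard PEFT functionality, not anything specific to this repository, and continues from the snippet above:

```python
# Merge the adapter into the base weights and drop the PEFT wrapper.
merged = model.merge_and_unload()
merged.save_pretrained("qwen2.5-3b-baseline-merged")
tokenizer.save_pretrained("qwen2.5-3b-baseline-merged")
```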

## 🔗 Related Models

πŸ“ Citation

```bibtex
@software{progressive_cognitive_2026,
  author = {Dex Mac},
  title = {Progressive Cognitive Architecture for LLMs},
  year = {2026},
  url = {https://github.com/dexmac221/progressive-cognitive},
  version = {1.0.0}
}
```

## 📄 License

Apache 2.0