---
language:
  - en
license: other
base_model: Qwen/Qwen2.5-1.5B-Instruct
library_name: transformers
pipeline_tag: text-generation
tags:
  - crca
  - causal-reasoning
  - qwen2
  - 1.5b
  - finetuned
---

# CR-CA 1.5B Full Finetune

## Overview

CR-CA (Causal Reasoning and Counterfactual Analysis) is a reasoning-focused stack that targets structured causal analysis, counterfactuals, and multi-step reasoning. This 1.5B model is a CR-CA reasoning-optimized causal language model based on the Qwen2 architecture (Qwen2ForCausalLM).

## Model Details

- Model type: qwen2
- Architecture: `Qwen2ForCausalLM`
- Hidden size: 1536
- Layers: 28
- Attention heads: 12 (KV heads: 2)
- Max position embeddings: 32768
- Vocab size: 151936
- Dtype: float16
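
Two quantities implied by the config values above are the per-head dimension and the grouped-query attention (GQA) ratio. A minimal sanity check, using only the numbers listed on this card:

```python
# Config values as listed in Model Details above.
hidden_size = 1536
num_attention_heads = 12
num_key_value_heads = 2

# Dimension of each attention head.
head_dim = hidden_size // num_attention_heads
# Number of query heads that share each KV head under GQA.
gqa_group_size = num_attention_heads // num_key_value_heads

print(head_dim)        # 128
print(gqa_group_size)  # 6
```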

## Training Summary

This model was produced via full finetuning for CR-CA reasoning. Training metadata is stored in `training_args.bin`.

Key training parameters:

- Per-device batch size: 8
- Gradient accumulation: 16
- Epochs: 2
- Learning rate: 5e-4
- Precision: FP16
- DeepSpeed config: `training/deepspeed_zero2_1_5b.json`
- Scheduler: cosine
- Warmup steps: 100
- Save steps: 200
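
From the parameters above, the effective batch size per optimizer step follows directly (per device; under DeepSpeed ZeRO-2, multiply by the number of GPUs for the global batch size):

```python
# Training parameters as listed above.
per_device_batch_size = 8
gradient_accumulation_steps = 16

# Samples consumed per optimizer step on a single device.
effective_batch_per_device = per_device_batch_size * gradient_accumulation_steps
print(effective_batch_per_device)  # 128
```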

## Training Data

The training data uses a prompt/response JSONL format, one JSON object per line:

```json
{"prompt": "...", "response": "..."}
```

The dataset includes public reasoning data (e.g., GSM8K-style math word problems). This is used to strengthen multi-step reasoning, structured derivations, and final answer formatting.
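
A minimal sketch of reading this prompt/response JSONL with the Python standard library (the file path is illustrative, not a file shipped with this repo):

```python
import json

def load_pairs(path):
    """Yield (prompt, response) tuples from a prompt/response JSONL file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            yield record["prompt"], record["response"]
```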

## Evaluation Report (Real-World Causal Tasks)

Evaluation was run on 2026-02-01 using GPT-4o-mini over six real-world causal tasks. Overall score: 48.3% (the mean of the per-task scores below).

Per-task scores:

- Monetary Policy Counterfactual (US Macro 2025): 55/100
- Tariff Pass-Through and Pricing (Beige Book + Firm Data): 55/100
- Supply Chain Reroute Counterfactual (Port Disruption): 45/100
- Inventory & Stockout Causal Impact (Retail): 25/100
- Inflation Drivers (World Bank CPI Data): 65/100
- Workforce Training Program (Labor Market Causal Impact): 45/100

Key strengths observed:

- Clear task framing and attempt at counterfactual reasoning.
- Some identification of confounders and causal factors.

Key limitations observed:

- Inconsistent causal graphs and directional effects.
- Weak counterfactual grounding and numerical reasoning errors.
- Limited depth and rigor on confounder adjustment strategies.

## Intended Use

This model is intended for causal reasoning, counterfactual analysis, structured CR-CA reasoning prompts, and multi-step reasoning tasks.

## Generation Settings

Default generation parameters are stored in `generation_config.json`:

- `do_sample`: true
- `temperature`: 0.7
- `top_p`: 0.8
- `top_k`: 20
- `repetition_penalty`: 1.1
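
To illustrate how `top_k` and `top_p` interact during sampling, here is a toy pure-Python version of the two filters over a small next-token distribution. This is a sketch for intuition only, not the transformers implementation:

```python
def filter_top_k_top_p(probs, top_k=20, top_p=0.8):
    """Keep the top_k most likely tokens, then the smallest prefix whose
    cumulative probability reaches top_p; renormalize the survivors."""
    # Rank tokens by probability and apply the top-k cutoff.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    # Nucleus (top-p) cutoff: keep the smallest high-probability prefix.
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalize so the surviving tokens form a proper distribution.
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}

# With top_p=0.8, only "a" and "b" survive and are renormalized.
print(filter_top_k_top_p({"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}))
```

With the card's defaults, sampling then draws from this filtered distribution at temperature 0.7, with a 1.1 repetition penalty discouraging repeated tokens.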

## Limitations

- Outputs should be validated for factual correctness.
- The model may hallucinate causal claims without evidence.

## License

This model inherits the licenses of the base model and of the datasets used for training; add an explicit license here if your use case requires one.