Update README.md

README.md CHANGED
(Previous version, now replaced: an auto-generated model card stub with YAML front matter (`license: other`, `base_model:`, `tags:`, `results: []`) and a framework-versions list: PyTorch 2.7.1+cu126, Datasets 4.0.0, Tokenizers 0.22.1.)
---
language:
- en
license: other
base_model: Qwen/Qwen2.5-1.5B-Instruct
library_name: transformers
pipeline_tag: text-generation
tags:
- crca
- causal-reasoning
- qwen2
- 1.5b
- finetuned
---

# CRCA 1.5B Full Finetune

## Overview
CR-CA (Causal Reasoning and Counterfactual Analysis) is a reasoning-focused stack that targets structured causal analysis, counterfactuals, and multi-step reasoning. This 1.5B model is a CR-CA reasoning-optimized causal language model based on the Qwen2 architecture (`Qwen2ForCausalLM`).

## Model Details
- **Model type:** `qwen2`
- **Architecture:** `Qwen2ForCausalLM`
- **Hidden size:** `1536`
- **Layers:** `28`
- **Attention heads:** `12` (KV heads: `2`)
- **Max position embeddings:** `32768`
- **Vocab size:** `151936`
- **Dtype:** `float16`
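
These values can be checked against the shipped `config.json`. A minimal sketch, assuming the `transformers` library is installed (the repo id below is a placeholder, not the actual Hub path):

```python
from transformers import AutoConfig

# Placeholder repo id; substitute the actual local path or Hub id of this model.
config = AutoConfig.from_pretrained("your-org/crca-1.5b-full-finetune")

print(config.model_type)               # "qwen2"
print(config.hidden_size)              # 1536
print(config.num_hidden_layers)        # 28
print(config.num_attention_heads)      # 12
print(config.num_key_value_heads)      # 2
print(config.max_position_embeddings)  # 32768
print(config.vocab_size)               # 151936
```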

## Training Summary
This model was produced via full finetuning for CR-CA reasoning. Training metadata is stored in `training_args.bin`.

Key training parameters:
- **Per-device batch size:** 8
- **Gradient accumulation:** 16
- **Epochs:** 2
- **Learning rate:** 5e-4
- **Precision:** FP16
- **DeepSpeed config:** `training/deepspeed_zero2_1_5b.json`
- **Scheduler:** cosine
- **Warmup steps:** 100
- **Save steps:** 200
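
As a rough illustration, these parameters map onto Hugging Face `TrainingArguments` roughly as follows (a sketch only; the authoritative record is `training_args.bin`, and `output_dir` here is hypothetical):

```python
from transformers import TrainingArguments

# Illustrative reconstruction of the hyperparameters listed above, not the exact recorded arguments.
training_args = TrainingArguments(
    output_dir="crca-1.5b-full-finetune",          # hypothetical output path
    per_device_train_batch_size=8,
    gradient_accumulation_steps=16,
    num_train_epochs=2,
    learning_rate=5e-4,
    fp16=True,
    lr_scheduler_type="cosine",
    warmup_steps=100,
    save_steps=200,
    deepspeed="training/deepspeed_zero2_1_5b.json",
)
```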

## Training Data
The training data uses a prompt/response JSONL format:

```json
{"prompt": "...", "response": "..."}
```

The dataset includes public reasoning data (e.g., GSM8K-style math word problems). This is used to strengthen multi-step reasoning, structured derivations, and final answer formatting.
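
A minimal sketch of loading data in this format with the `datasets` library (the file name `train.jsonl` is a placeholder):

```python
from datasets import load_dataset

# Each line of the JSONL file is an object with "prompt" and "response" keys.
dataset = load_dataset("json", data_files={"train": "train.jsonl"})

example = dataset["train"][0]
full_text = example["prompt"] + "\n" + example["response"]
```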

## Evaluation Report (Real-World Causal Tasks)
Evaluation was run on 2026-02-01 using GPT-4o-mini over 6 real-world causal tasks. Overall score: **48.3%**.

Per-task scores:
- Monetary Policy Counterfactual (US Macro 2025): **55/100**
- Tariff Pass-Through and Pricing (Beige Book + Firm Data): **55/100**
- Supply Chain Reroute Counterfactual (Port Disruption): **45/100**
- Inventory & Stockout Causal Impact (Retail): **25/100**
- Inflation Drivers (World Bank CPI Data): **65/100**
- Workforce Training Program (Labor Market Causal Impact): **45/100**

Key strengths observed:
- Clear task framing and attempts at counterfactual reasoning.
- Some identification of confounders and causal factors.

Key limitations observed:
- Inconsistent causal graphs and directional effects.
- Weak counterfactual grounding and numerical reasoning errors.
- Limited depth and rigor on confounder adjustment strategies.

## Intended Use
The model is intended for causal reasoning, counterfactual analysis, structured CR-CA reasoning prompts, and multi-step reasoning tasks.

## Generation Settings
Default generation parameters are stored in `generation_config.json`:
- `do_sample`: `true`
- `temperature`: `0.7`
- `top_p`: `0.8`
- `top_k`: `20`
- `repetition_penalty`: `1.1`
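
Because these defaults ship with the model in `generation_config.json`, a plain `generate()` call picks them up automatically. A minimal usage sketch (the repo id is a placeholder; the chat-template usage assumes the tokenizer provides one, as Qwen2.5-Instruct tokenizers do):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/crca-1.5b-full-finetune"  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "If tariffs rise by 10%, what is the likely effect on import prices?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Sampling settings (temperature, top_p, top_k, repetition_penalty) come from generation_config.json.
output_ids = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```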

## Limitations
- Outputs should be validated for factual correctness.
- The model may hallucinate causal claims without evidence.

## License
Follow the base model and dataset licenses used for training. Add your explicit license here if required.