---
license: apache-2.0
library_name: transformers
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
pipeline_tag: text-generation
tags:
- causal-inference
- causal-reasoning
- reinforcement-learning
- grpo
---

# Can Post-Training Transform LLMs into Causal Reasoners?
This repository contains the **CauGym-GRPO-14B** model, a causal inference agent developed through targeted post-training of a 14B-parameter LLM.

- **Paper:** [Can Post-Training Transform LLMs into Causal Reasoners?](https://huggingface.co/papers/2602.06337)
- **Repository:** [GitHub - OpenCausaLab/CauGym](https://github.com/OpenCausaLab/CauGym)
## Model Description

The model was fine-tuned using **Group Relative Policy Optimization (GRPO)** on the CauGym dataset, which covers seven core causal inference tasks across interventional and counterfactual domains. The research demonstrates that targeted post-training enables smaller models to perform competitively with, or even surpass, much larger counterparts on complex causal tasks.
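The core idea of GRPO can be sketched in a few lines: for each prompt, a group of responses is sampled, and each response's reward is normalized against the group's mean and standard deviation, so no separate learned value function is needed. A minimal illustrative sketch (not the actual training code; population standard deviation is assumed here):

```python
import statistics


def group_relative_advantages(rewards):
    """GRPO-style advantage: normalize each reward against its sampled group,
    (r - mean(group)) / std(group). A uniform group yields zero advantage."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0.0:
        return [0.0 for _ in rewards]
    return [(r - mean) / std for r in rewards]


# Four sampled answers to one causal question, scored 1 (correct) / 0 (wrong):
print(group_relative_advantages([1.0, 0.0, 1.0, 0.0]))  # → [1.0, -1.0, 1.0, -1.0]
```

Responses scoring above their group's mean receive positive advantage and are reinforced; those below are penalized, which is what drives the policy toward correct causal answers.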

### Key Features

- **Backbone:** DeepSeek-R1-Distill-Qwen-14B.
- **High Performance:** Achieves 93.5% accuracy on the CaLM benchmark, compared to 55.4% by OpenAI o3.
- **Robustness:** Exhibits strong generalization under real-world conditions, such as distribution shifts and noisy data.
- **Internalization:** Capable of independently recognizing and applying causal theorems such as the Backdoor Criterion.
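A minimal usage sketch with the standard `transformers` text-generation pipeline. The repo ID below is an assumption inferred from the model name on this card, so verify it against the actual Hub page; the example question is likewise illustrative. Model loading is kept behind the main guard so the small prompt-building helper can be reused on its own:

```python
def build_messages(question: str) -> list:
    """Wrap a causal-inference question as a chat-style message list."""
    return [{"role": "user", "content": question}]


if __name__ == "__main__":
    from transformers import pipeline  # heavy dependency, imported lazily

    # Assumed repo ID -- check the model's actual Hugging Face page.
    generator = pipeline(
        "text-generation",
        model="OpenCausaLab/CauGym-GRPO-14B",
        torch_dtype="auto",
        device_map="auto",
    )
    messages = build_messages(
        "Given the causal graph X -> Z -> Y with a confounder W of X and Y, "
        "is P(Y | do(X)) identifiable via the backdoor criterion?"
    )
    print(generator(messages, max_new_tokens=1024)[0]["generated_text"])
```

Recent `transformers` versions accept chat-style message lists directly in the text-generation pipeline; `device_map="auto"` additionally requires `accelerate` to be installed.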
## Citation

If you use this model, the CauGym dataset, or the associated research, please cite:
```bibtex
@misc{chen2026posttrainingtransformllmscausal,
      title={Can Post-Training Transform LLMs into Causal Reasoners?},
      author={Junqi Chen and Sirui Chen and Chaochao Lu},
      year={2026},
      eprint={2602.06337},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2602.06337},
}
```