CauGym / README.md
mukie's picture
Update README.md
388aba5 verified
metadata
license: apache-2.0
base_model:
  - deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
pipeline_tag: question-answering
metrics:
  - accuracy

OpenCausaLab/CauGym

CauGym model is a model trained via GRPO (Group Relative Policy Optimization) on VERL framework (https://github.com/verl-project/verl), and it is specialized for causal inference.

Model Details

  • Developed by: OpenCausaLab
  • Model type: LLM.
  • Language(s) (NLP): Englsih.

Model Sources

Evaluation

We have evaluated this model on CALM benchmark and CauGym benchmark, and the evaluation metric is accuracy.

Benchmark ATE CDE ETT NDE NIE PN PS
CALM 0.990 0.994 0.900 0.940 0.930 0.928 0.866
CauGym-rephrased 0.948 0.982 0.856 0.890 0.888 0.778 0.816
CauGym-ommitted 0.935 0.963 0.837 0.934 0.838 0.900 0.907
CauGym-deconfounding 0.976 0.986 0.854 0.572 0.872 0.952 0.848
CauGym-redundant 0.972 0.966 0.918 0.850 0.888 0.934 0.910
CauGym-insufficient 0.884 0.902 0.686 0.696 0.958 0.940 0.954

Citation

@misc{chen2026posttrainingtransformllmscausal,
      title={Can Post-Training Transform LLMs into Causal Reasoners?}, 
      author={Junqi Chen and Sirui Chen and Chaochao Lu},
      year={2026},
      eprint={2602.06337},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2602.06337}, 
}