CauGym / README.md

mukie

Update README.md

388aba5 verified 3 days ago

preview code

raw

history blame contribute delete

1.9 kB

metadata

license: apache-2.0
base_model:
  - deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
pipeline_tag: question-answering
metrics:
  - accuracy

OpenCausaLab/CauGym

CauGym model is a model trained via GRPO (Group Relative Policy Optimization) on VERL framework (https://github.com/verl-project/verl), and it is specialized for causal inference.

Model Details

Developed by: OpenCausaLab
Model type: LLM.
Language(s) (NLP): Englsih.

Model Sources

Repository: https://github.com/OpenCausaLab/CauGym
Paper : https://www.arxiv.org/abs/2602.06337

Evaluation

We have evaluated this model on CALM benchmark and CauGym benchmark, and the evaluation metric is accuracy.

Benchmark	ATE	CDE	ETT	NDE	NIE	PN	PS
CALM	0.990	0.994	0.900	0.940	0.930	0.928	0.866
CauGym-rephrased	0.948	0.982	0.856	0.890	0.888	0.778	0.816
CauGym-ommitted	0.935	0.963	0.837	0.934	0.838	0.900	0.907
CauGym-deconfounding	0.976	0.986	0.854	0.572	0.872	0.952	0.848
CauGym-redundant	0.972	0.966	0.918	0.850	0.888	0.934	0.910
CauGym-insufficient	0.884	0.902	0.686	0.696	0.958	0.940	0.954

Citation

@misc{chen2026posttrainingtransformllmscausal,
      title={Can Post-Training Transform LLMs into Causal Reasoners?}, 
      author={Junqi Chen and Sirui Chen and Chaochao Lu},
      year={2026},
      eprint={2602.06337},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2602.06337}, 
}