Commit a036343 (verified) by mjbuehler · 1 Parent(s): f29cc3c

Update README.md

Updated arXiv reference; datasets

Files changed (1)
  1. README.md +21 -17
README.md CHANGED
@@ -2,17 +2,20 @@
  license: apache-2.0
  base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
  tags:
- - reinforcement-learning
- - grpo
- - peft
- - lora
- - beam-mechanics
- - structural-engineering
- - math
- - reasoning
+ - reinforcement-learning
+ - grpo
+ - peft
+ - lora
+ - beam-mechanics
+ - structural-engineering
+ - math
+ - reasoning
  language:
- - en
+ - en
  pipeline_tag: text-generation
+ datasets:
+ - lamm-mit/BeamRL-TrainData
+ - lamm-mit/BeamRL-EvalData
  ---

  # BeamPERL — DeepSeek-R1-Distill-Qwen-1.5B
@@ -65,15 +68,16 @@ LoRA adapters were trained using GRPO via the [BeamPERL framework](https://githu
  ## Citation

  ```bibtex
- @misc{hage2025beamperl,
- title={BeamPERL: Parameter-Efficient Reinforcement Learning for Verifiable Beam Mechanics Problem-Solving},
- author={Tarjei P. Hage and Markus J. Buehler},
- year={2025},
- archivePrefix={arXiv},
- primaryClass={cs.CL}
- }
+ @misc{hage2026beamperlparameterefficientrlverifiable,
+ title={BeamPERL: Parameter-Efficient RL with Verifiable Rewards Specializes Compact LLMs for Structured Beam Mechanics Reasoning},
+ author={Tarjei Paule Hage and Markus J. Buehler},
+ year={2026},
+ eprint={2603.04124},
+ archivePrefix={arXiv},
+ primaryClass={cs.AI},
+ url={https://arxiv.org/abs/2603.04124},
  ```

  ## Acknowledgements

- Built upon [Tina](https://arxiv.org/abs/2504.15777) and [Open R1](https://github.com/huggingface/open-r1). Dataset generation uses a custom version of [SymBeam](https://github.com/amcc1996/symbeam).
+ Built upon [Tina](https://arxiv.org/abs/2504.15777) and [Open R1](https://github.com/huggingface/open-r1). Dataset generation uses a custom version of [SymBeam](https://github.com/amcc1996/symbeam).