---
base_model:
- meta-llama/Llama-3.1-8B-Instruct
library_name: peft
license: mit
datasets:
- Roihn/Einstein-Puzzles-Data
language:
- en
---
# Einstein-Puzzles
**Communication and Verification in LLM Agents towards Collaboration under Information Asymmetry** ([arXiv](https://arxiv.org/abs/2510.25595))
*Run Peng\*, Ziqiao Ma\*, Amy Pang, Sikai Li, Zhang Xi-Jia, Yingzhuo Yu, Cristian-Paul Bara, Joyce Chai*
## Model Details
We fine-tune all models with LoRA (rank 32) for 1 epoch, using a global batch size of 128 and a learning rate of 2e-4 with a cosine decay schedule. Fine-tuning is run with [OpenRLHF](https://github.com/OpenRLHF/OpenRLHF), with FlashAttention-2 enabled to speed up training. Training takes approximately 30 minutes on 4 A40 GPUs with 48 GB of memory each.
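As a rough illustration of the schedule and batch arithmetic described above, the sketch below computes a cosine-decayed learning rate and the gradient accumulation needed to reach the global batch size. It is not the OpenRLHF implementation: the function names are hypothetical, and the absence of warmup, the decay floor of 0, and the per-GPU micro-batch size are assumptions that the text above does not specify.

```python
import math

PEAK_LR = 2e-4           # learning rate stated above
GLOBAL_BATCH_SIZE = 128  # global batch size stated above
NUM_GPUS = 4             # 4 A40 GPUs, as stated above


def cosine_lr(step, total_steps, peak_lr=PEAK_LR, min_lr=0.0):
    """Cosine decay from peak_lr to min_lr over total_steps.

    Assumption: no warmup and min_lr = 0; neither is stated in the model card.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp so that steps outside [0, total_steps] stay on the schedule's ends.
    progress = min(max(step, 0), total_steps) / total_steps
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def grad_accum_steps(micro_batch_per_gpu,
                     global_batch=GLOBAL_BATCH_SIZE,
                     num_gpus=NUM_GPUS):
    """Accumulation steps needed so that
    micro_batch_per_gpu * num_gpus * steps == global_batch.

    Assumption: micro_batch_per_gpu is a free choice; it is not stated above.
    """
    samples_per_forward = micro_batch_per_gpu * num_gpus
    if global_batch % samples_per_forward != 0:
        raise ValueError("global batch must be divisible by micro batch * GPUs")
    return global_batch // samples_per_forward
```

With 4 GPUs, a global batch of 128 corresponds to 32 samples per GPU per optimizer step, which can be split between micro-batch size and accumulation steps as memory allows.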
This repo provides the fine-tuned model with the full set of capabilities: information providing, information seeking, and chain-of-thought reasoning.
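Since the metadata lists `peft` as the library and `meta-llama/Llama-3.1-8B-Instruct` as the base model, the adapter can presumably be loaded on top of that base model. The following is a minimal sketch under that assumption; `ADAPTER_ID` is a hypothetical placeholder to be replaced with this repo's actual id or a local path, `load_model` is an illustrative helper name, and the bfloat16 dtype is a choice made here rather than something this card specifies.

```python
BASE_MODEL_ID = "meta-llama/Llama-3.1-8B-Instruct"  # from this card's metadata
ADAPTER_ID = "path/to/this-adapter"  # hypothetical placeholder, replace before use


def load_model(adapter_id=ADAPTER_ID, base_model_id=BASE_MODEL_ID):
    """Load the base model and attach the LoRA adapter weights.

    Heavy imports are kept inside the function so that importing this
    snippet does not require torch, transformers, or peft to be installed.
    """
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base_model = AutoModelForCausalLM.from_pretrained(
        base_model_id,
        torch_dtype=torch.bfloat16,  # assumption: bf16 inference
    )
    # Wrap the base model with the fine-tuned LoRA adapter.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling `load_model()` downloads the base model weights, which requires access to the gated Llama 3.1 repository and enough memory for an 8B-parameter model.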
## Citation
```bibtex
@misc{peng2025communicationverificationllmagents,
title={Communication and Verification in LLM Agents towards Collaboration under Information Asymmetry},
author={Run Peng and Ziqiao Ma and Amy Pang and Sikai Li and Zhang Xi-Jia and Yingzhuo Yu and Cristian-Paul Bara and Joyce Chai},
year={2025},
eprint={2510.25595},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2510.25595},
}
```