---
license: llama3.1
---
# Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines

This repository contains the models and datasets used in the paper *"Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines"*.

## Models

The `ckpt` folder contains 16 LoRA adapters that were fine-tuned for this research:

- 6 Basic Executors
- 3 Executor Composers
- 7 Aligners

The base model used for fine-tuning all of these adapters is [LLaMA 3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B).

## Datasets

The datasets used for evaluating all models can be found in the `datasets/raw` folder.

## Usage
Please refer to the [CAEF GitHub repository](https://github.com/NJUDeepEngine/CAEF) for usage details.
## Citation

If you use CAEF in your research, please cite our [paper](https://arxiv.org/abs/2410.07896):

```bibtex
@misc{lai2024executing,
      title={Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines},
      author={Junyu Lai and Jiahe Xu and Yao Yang and Yunpeng Huang and Chun Cao and Jingwei Xu},
      year={2024},
      eprint={2410.07896},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2410.07896},
}
```