---
license: mit
datasets:
- bespokelabs/Bespoke-Stratos-17k
language:
- en
base_model:
- Qwen/Qwen2-1.5B
- EleutherAI/gpt-neo-125m
---

# LLM Modules: Knowledge Transfer from a Large to a Small Model using Enhanced Cross-Attention

This repository contains the source code for a paired LLM model that transfers knowledge from a large pre-trained model (Qwen2-1.5B) to a smaller model (GPT-Neo-125M) using an Enhanced Cross-Attention mechanism.
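
The actual module lives in model.py; purely as an illustration of the idea, here is a minimal, hypothetical sketch of a cross-attention bridge in which the small model's hidden states attend to the large model's hidden states. The class name, dimensions, and head count below are assumptions, not the repository's implementation (GPT-Neo-125M has a hidden size of 768, Qwen2-1.5B a hidden size of 1536):

```python
# Illustrative sketch only -- see model.py for the actual implementation.
import torch
import torch.nn as nn

class CrossAttentionBridge(nn.Module):
    """Hypothetical bridge: small-model queries attend to large-model states."""

    def __init__(self, d_small: int = 768, d_large: int = 1536, n_heads: int = 8):
        super().__init__()
        # Project the large model's hidden states into the small model's width.
        self.proj = nn.Linear(d_large, d_small)
        self.attn = nn.MultiheadAttention(d_small, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_small)

    def forward(self, small_hidden: torch.Tensor, large_hidden: torch.Tensor) -> torch.Tensor:
        # Queries come from the small model; keys and values from the large one.
        kv = self.proj(large_hidden)
        attended, _ = self.attn(small_hidden, kv, kv)
        # A residual connection preserves the small model's own representation.
        return self.norm(small_hidden + attended)
```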

**Link to article (GitHub Pages)**: [How to teach a model to reason without retraining it for less than $10](https://k-kolomeitsev.github.io/LLM-Modules/)

## Repository Contents

- **model.py**: Source code for the paired model, including the training and inference routines.
- **model_checkpoint.pth**: Model weights checkpoint saved after training.
- **compare-responses-from-models.md**: Answers to test questions from different models, collected for the accompanying research.
- **paper_llm_modules.pdf**: The LLM Modules research paper.

## Requirements

The project requires the following:

- Python 3.7 or higher
- [PyTorch](https://pytorch.org/) (version 1.7 or above)
- [Transformers](https://huggingface.co/transformers/)
- [Datasets](https://huggingface.co/docs/datasets/)
- [tqdm](https://github.com/tqdm/tqdm)

You can install the required packages via pip:

```bash
pip install torch transformers datasets tqdm
```

Alternatively, you can create and activate a virtual environment first:

```bash
python -m venv venv
# For Linux/macOS:
source venv/bin/activate
# For Windows:
venv\Scripts\activate
pip install torch transformers datasets tqdm
```

## Running Training

By default, model.py is configured to run the training process. To start training, simply execute:

```bash
python model.py
```

The model is trained with the parameters specified in the script, and the checkpoint is saved as `model_checkpoint.pth`.
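
If you want to inspect the saved weights outside of model.py, a standard PyTorch load works; this is a minimal sketch, and what the checkpoint actually contains (a bare state dict or a larger dictionary with optimizer state, etc.) depends on how model.py saves it:

```python
import torch

# Load the checkpoint onto the CPU so no GPU is required for inspection.
checkpoint = torch.load("model_checkpoint.pth", map_location="cpu")

# Print the top-level keys to see what was saved.
if isinstance(checkpoint, dict):
    print(list(checkpoint.keys())[:10])
```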

## Running Inference (Interactive Chat)

To run inference, disable the training code and enable the interactive chat mode. In model.py, comment out the training function call and uncomment the `interactive_chat()` call, so the main section looks like this:

```python
if __name__ == "__main__":
    # main()  # Comment out this line to disable training
    interactive_chat()  # Uncomment this line to run inference
```

Then run:

```bash
python model.py
```

An interactive session will start in the console, allowing you to enter queries and view the model's generated responses.

## Additional Notes

- Ensure you have sufficient computational resources for training the model.
- For reproducibility, consider setting a fixed seed for all random operations (see the sketch below).
- You can adjust model parameters and training settings directly in model.py.
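
A minimal seeding sketch; `seed_everything` is a hypothetical helper, not a function from this repository, and the exact sources of randomness depend on model.py:

```python
import random

import torch

def seed_everything(seed: int = 42) -> None:
    """Hypothetical helper: pin the common sources of randomness."""
    random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # Safe no-op when CUDA is unavailable.

seed_everything(42)
```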

## Citation

```bibtex
@misc{Kolomeitsev2025LLMModules,
  title         = {LLM Modules: Knowledge Transfer from a Large to a Small Model using Enhanced Cross-Attention},
  author        = {Konstantin Kolomeitsev},
  year          = {2025},
  eprint        = {2502.08213},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CL},
  url           = {https://arxiv.org/abs/2502.08213}
}
```

## Contact

If you have any questions, please open an issue or contact me at [uol92kot@gmail.com](mailto:uol92kot@gmail.com).