File size: 3,835 Bytes

03afb93

Tiny Llama Project Guide: Running TinyLlama-1.1B-Chat-v1.0 Locally

This document provides a step-by-step guide to run the TinyLlama-1.1B-Chat-v1.0 model locally on a laptop with 16GB RAM, i5 processor, and Windows OS. The guide includes setting up the environment, downloading the model, fine-tuning, and running a Flask-based chat UI.

---

System Requirements
Operating System: Windows
RAM: 16GB
Processor: Intel i5 or equivalent
Python Version: 3.10.9
- IDE: Visual Studio Code (VS Code)
- Internet: Required for downloading model and libraries

---

Step-by-Step Setup

1. Install Python 3.10.9
- Download and install Python 3.10.9 from https://www.python.org/downloads/release/python-3109/.
- Ensure Python and pip are added to your system PATH.

2. Set Up a Virtual Environment
- Open VS Code terminal in your project directory (e.g., C:\path\to\TinyLlama-1.1B).
- Run:
```
python -m venv venv
.\venv\Scripts\activate
```

3. Install Required Libraries
- In the activated virtual environment, run:
```
pip install transformers torch huggingface_hub datasets peft trl accelerate flask matplotlib
```
- This installs libraries for model handling, fine-tuning, Flask app, and plotting.

4. Download the TinyLlama Model
- Create a file `download_model.py` with the following code:
```python
from huggingface_hub import login, snapshot_download
login(token="YOUR_ACCESS_TOKEN_HERE")
snapshot_download(repo_id="TinyLlama/TinyLlama-1.1B-Chat-v1.0", local_dir="./tinyllama_model")
```
- Replace `YOUR_ACCESS_TOKEN_HERE` with your Hugging Face access token (get it from https://huggingface.co/settings/tokens).
- Run: `python download_model.py`
- Model weights will be saved in the `tinyllama_model` folder.

5. Run Inference with Flask UI
- Create a `finetune` folder in your project directory.
- Copy `app.py` and `templates/index.html` from the repository to the `finetune` folder.
- Run: `python app.py`
- Open http://127.0.0.1:5000 in your browser to access the chat UI.
- Enter prompts to interact with the model.

6. Fine-Tune the Model (Optional)
- In the `finetune` folder, ensure `dataset.json` and `finetune.py` are present.
- Run: `python finetune.py`
- Fine-tuned weights will be saved in `finetune/finetuned_weights`.
- Update `app.py` to point to `./finetuned_weights` for inference with the fine-tuned model.
- Check `loss_plot.png` for training loss visualization.

7. View Training Metrics
- After fine-tuning, check the console for final train loss and learning rate.
- Open `loss_plot.png` in the `finetune` folder for a graphical view of training loss.

---

Project Structure
- `tinyllama_model/`: Model weights downloaded from Hugging Face.
- `finetune/`: Contains fine-tuning scripts and fine-tuned weights.
- `dataset.json`: Small dataset for fine-tuning.
- `finetune.py`: Fine-tuning script with LoRA.
- `app.py`: Flask app for inference.
- `templates/index.html`: Chat UI.
- `loss_plot.png`: Training loss plot.
- `requirements.txt`: List of required libraries.
- `document.txt`: This guide.
- `README.md`: Project overview.

---

Attribution
- **Model**: TinyLlama-1.1B-Chat-v1.0
- **Source**: https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0
- **Organization**: TinyLlama
- **License**: Check the model's Hugging Face page for licensing details.

---

Notes
- Model weights are not included in this repository to respect licensing terms.
- Download the model directly from Hugging Face using your access token.
- Ensure sufficient disk space (~2-3GB) for model weights and fine-tuned weights.
- For support, refer to the TinyLlama Hugging Face page or community forums.