Tiny Llama Project Guide: Running TinyLlama-1.1B-Chat-v1.0 Locally

This document is a step-by-step guide to running the TinyLlama-1.1B-Chat-v1.0 model locally on a laptop with 16 GB of RAM, an Intel i5 processor, and Windows. It covers setting up the environment, downloading the model, fine-tuning, and running a Flask-based chat UI.
---
System Requirements
- Operating System: Windows
- RAM: 16 GB
- Processor: Intel i5 or equivalent
- Python Version: 3.10.9
- IDE: Visual Studio Code (VS Code)
- Internet: required to download the model and libraries
---
Step-by-Step Setup
1. Install Python 3.10.9
- Download and install Python 3.10.9 from https://www.python.org/downloads/release/python-3109/.
- Ensure Python and pip are added to your system PATH.
2. Set Up a Virtual Environment
- Open a VS Code terminal in your project directory (e.g., C:\path\to\TinyLlama-1.1B).
- Run:
```
python -m venv venv
.\venv\Scripts\activate
```
3. Install Required Libraries
- In the activated virtual environment, run:
```
pip install transformers torch huggingface_hub datasets peft trl accelerate flask matplotlib
```
- This installs the libraries needed for model handling, fine-tuning, the Flask app, and plotting.
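To confirm the installation succeeded before moving on, a small standard-library check can verify that every package is importable (the names below are the import names for the packages in the `pip install` line above):

```python
import importlib.util

def missing_packages(names):
    """Return the packages from `names` that are not importable."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# Import names for the packages installed in step 3.
required = ["transformers", "torch", "huggingface_hub", "datasets",
            "peft", "trl", "accelerate", "flask", "matplotlib"]

if __name__ == "__main__":
    missing = missing_packages(required)
    print("All packages found" if not missing else f"Missing: {missing}")
```

If anything is listed as missing, re-run the `pip install` command inside the activated virtual environment.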
4. Download the TinyLlama Model
- Create a file `download_model.py` with the following code:
```python
from huggingface_hub import login, snapshot_download

# Authenticate with your Hugging Face token; keep real tokens out of
# version control (e.g., read them from an environment variable).
login(token="YOUR_ACCESS_TOKEN_HERE")

# Download all model files into ./tinyllama_model
snapshot_download(
    repo_id="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    local_dir="./tinyllama_model",
)
```
- Replace `YOUR_ACCESS_TOKEN_HERE` with your Hugging Face access token (create one at https://huggingface.co/settings/tokens).
- Run: `python download_model.py`
- The model weights will be saved in the `tinyllama_model` folder.
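Before running inference, you can sanity-check that the snapshot folder is populated. The expected file names below are an assumption based on typical Hugging Face model snapshots; adjust them to match what `snapshot_download` actually produced:

```python
from pathlib import Path

def missing_model_files(model_dir, expected=("config.json", "tokenizer_config.json")):
    """Return the expected snapshot files that are absent from model_dir."""
    d = Path(model_dir)
    return [name for name in expected if not (d / name).is_file()]

if __name__ == "__main__":
    missing = missing_model_files("./tinyllama_model")
    print("Snapshot looks complete" if not missing else f"Missing files: {missing}")
```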
5. Run Inference with Flask UI
- Create a `finetune` folder in your project directory.
- Copy `app.py` and `templates/index.html` from the repository into the `finetune` folder.
- Run: `python app.py`
- Open http://127.0.0.1:5000 in your browser to access the chat UI.
- Enter prompts to interact with the model.
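The repository's `app.py` is not reproduced here, but a minimal Flask app of roughly this shape would serve a chat endpoint. `generate_reply` is a hypothetical stand-in for the transformers inference call the real `app.py` makes against `./tinyllama_model`:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def generate_reply(prompt: str) -> str:
    # Hypothetical stand-in: the real app would tokenize the prompt,
    # call model.generate(), and decode the output here.
    return f"(model reply to: {prompt})"

@app.route("/chat", methods=["POST"])
def chat():
    # Accept a JSON body like {"prompt": "..."} and return the reply.
    prompt = request.get_json().get("prompt", "")
    return jsonify({"reply": generate_reply(prompt)})

if __name__ == "__main__":
    app.run(host="127.0.0.1", port=5000)
```

The actual endpoint names and template rendering in the repository's `app.py` may differ; this sketch only illustrates the request/response flow behind the chat UI.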
6. Fine-Tune the Model (Optional)
- In the `finetune` folder, make sure `dataset.json` and `finetune.py` are present.
- Run: `python finetune.py`
- The fine-tuned weights will be saved in `finetune/finetuned_weights`.
- Update `app.py` to point to `./finetuned_weights` to run inference with the fine-tuned model.
- Check `loss_plot.png` for a training loss visualization.
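`finetune.py` uses LoRA via the `peft` library. The core idea can be sketched in plain Python: instead of updating a full weight matrix W, LoRA trains two small matrices A (r x k) and B (d x r) and uses W + (alpha / r) * B @ A as the adapted weight. The matrix sizes below are purely illustrative:

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_merged_weight(W, A, B, alpha, r):
    """Return W + (alpha / r) * (B @ A), the LoRA-adapted weight."""
    delta = matmul(B, A)
    scale = alpha / r
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# Toy example: 2x2 weight with a rank-1 update.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]   # d x r = 2x1, trainable
A = [[0.5, 0.5]]     # r x k = 1x2, trainable
merged = lora_merged_weight(W, A, B, alpha=2, r=1)  # -> [[2.0, 1.0], [2.0, 3.0]]
```

Because only A and B are trained (a few thousand parameters here versus the model's 1.1B), LoRA keeps fine-tuning feasible on a 16 GB CPU-only laptop.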
7. View Training Metrics
- After fine-tuning, check the console for the final training loss and learning rate.
- Open `loss_plot.png` in the `finetune` folder for a graphical view of the training loss.
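`loss_plot.png` is produced by `finetune.py`; a plot of that kind can be generated with matplotlib roughly as follows (the loss values in the example are made up for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; works without a display
import matplotlib.pyplot as plt

def save_loss_plot(losses, path="loss_plot.png"):
    """Plot training loss per step and save it as a PNG."""
    steps = range(1, len(losses) + 1)
    plt.figure()
    plt.plot(steps, losses, marker="o")
    plt.xlabel("Step")
    plt.ylabel("Training loss")
    plt.title("Fine-tuning loss")
    plt.savefig(path)
    plt.close()

if __name__ == "__main__":
    save_loss_plot([2.1, 1.6, 1.3, 1.1, 1.0])  # illustrative values
```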
---
Project Structure
- `tinyllama_model/`: Model weights downloaded from Hugging Face.
- `finetune/`: Fine-tuning scripts and fine-tuned weights.
- `dataset.json`: Small dataset used for fine-tuning.
- `finetune.py`: Fine-tuning script using LoRA.
- `app.py`: Flask app for inference.
- `templates/index.html`: Chat UI.
- `loss_plot.png`: Training loss plot.
- `requirements.txt`: List of required libraries.
- `document.txt`: This guide.
- `README.md`: Project overview.
---
Attribution
- **Model**: TinyLlama-1.1B-Chat-v1.0
- **Source**: https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0
- **Organization**: TinyLlama
- **License**: See the model's Hugging Face page for licensing details.
---
Notes
- Model weights are not included in this repository in order to respect licensing terms.
- Download the model directly from Hugging Face using your access token.
- Ensure you have sufficient disk space (~2-3 GB) for the model weights and fine-tuned weights.
- For support, refer to the TinyLlama Hugging Face page or community forums.