# TinyLlama Project Guide: Running TinyLlama-1.1B-Chat-v1.0 Locally

This document provides a step-by-step guide to running the TinyLlama-1.1B-Chat-v1.0 model locally on a laptop with 16GB RAM, an Intel i5 processor, and Windows. It covers setting up the environment, downloading the model, fine-tuning, and running a Flask-based chat UI.

---

## System Requirements

- Operating System: Windows
- RAM: 16GB
- Processor: Intel i5 or equivalent
- Python Version: 3.10.9
- IDE: Visual Studio Code (VS Code)
- Internet: Required for downloading the model and libraries

---

## Step-by-Step Setup

### 1. Install Python 3.10.9

- Download and install Python 3.10.9 from https://www.python.org/downloads/release/python-3109/.
- Ensure Python and pip are added to your system PATH.

### 2. Set Up a Virtual Environment

- Open a VS Code terminal in your project directory (e.g., C:\path\to\TinyLlama-1.1B).
- Run:

```
python -m venv venv
.\venv\Scripts\activate
```

### 3. Install Required Libraries

- In the activated virtual environment, run:

```
pip install transformers torch huggingface_hub datasets peft trl accelerate flask matplotlib
```

- This installs the libraries for model handling, fine-tuning, the Flask app, and plotting.

### 4. Download the TinyLlama Model

- Create a file `download_model.py` with the following code:

```python
from huggingface_hub import login, snapshot_download

# Authenticate with your Hugging Face access token
login(token="YOUR_ACCESS_TOKEN_HERE")

# Download the model weights into ./tinyllama_model
snapshot_download(
    repo_id="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    local_dir="./tinyllama_model",
)
```

- Replace `YOUR_ACCESS_TOKEN_HERE` with your Hugging Face access token (get it from https://huggingface.co/settings/tokens).
- Run: `python download_model.py`
- The model weights will be saved in the `tinyllama_model` folder.

### 5. Run Inference with the Flask UI

- Create a `finetune` folder in your project directory.
- Copy `app.py` and `templates/index.html` from the repository into the `finetune` folder.
- Run: `python app.py`
- Open http://127.0.0.1:5000 in your browser to access the chat UI.
- Enter prompts to interact with the model.

### 6. Fine-Tune the Model (Optional)

- In the `finetune` folder, ensure `dataset.json` and `finetune.py` are present.
- Run: `python finetune.py`
- Fine-tuned weights will be saved in `finetune/finetuned_weights`.
- Update `app.py` to point to `./finetuned_weights` for inference with the fine-tuned model.
- Check `loss_plot.png` for a visualization of the training loss.

### 7. View Training Metrics

- After fine-tuning, check the console for the final training loss and learning rate.
- Open `loss_plot.png` in the `finetune` folder for a graphical view of the training loss.

---

## Project Structure

- `tinyllama_model/`: Model weights downloaded from Hugging Face.
- `finetune/`: Contains the fine-tuning scripts and fine-tuned weights.
  - `dataset.json`: Small dataset for fine-tuning.
  - `finetune.py`: Fine-tuning script using LoRA.
  - `app.py`: Flask app for inference.
  - `templates/index.html`: Chat UI.
  - `loss_plot.png`: Training loss plot.
- `requirements.txt`: List of required libraries.
- `document.txt`: This guide.
- `README.md`: Project overview.

---

## Attribution

- **Model**: TinyLlama-1.1B-Chat-v1.0
- **Source**: https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0
- **Organization**: TinyLlama
- **License**: Check the model's Hugging Face page for licensing details.

---

## Notes

- Model weights are not included in this repository to respect licensing terms.
- Download the model directly from Hugging Face using your access token.
- Ensure sufficient disk space (~2-3GB) for the model weights and fine-tuned weights.
- For support, refer to the TinyLlama Hugging Face page or community forums.
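---

Prompt formatting matters for chat-tuned models: the chat UI in step 5 forwards each prompt to the model, and TinyLlama-1.1B-Chat-v1.0 expects a Zephyr-style chat template. In practice `app.py` would likely call the tokenizer's `apply_chat_template` method, but the format can be sketched in plain Python (the `build_prompt` helper below is illustrative, not part of the repository):

```python
# Sketch of the Zephyr-style chat template used by TinyLlama-1.1B-Chat-v1.0.
# Normally you would let the tokenizer apply this via apply_chat_template;
# hand-building the string here just makes the format visible.

def build_prompt(system: str, user: str) -> str:
    """Format one system/user turn for TinyLlama-1.1B-Chat-v1.0."""
    return (
        f"<|system|>\n{system}</s>\n"
        f"<|user|>\n{user}</s>\n"
        f"<|assistant|>\n"
    )

prompt = build_prompt("You are a friendly chatbot.", "What is TinyLlama?")
print(prompt)
```

The trailing `<|assistant|>\n` is what cues the model to generate its reply; generation should stop at the next `</s>` token.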
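---

The contents of `dataset.json` are not shown in this guide. A common shape for a small fine-tuning set is a JSON list of prompt/response pairs; the sketch below loads and sanity-checks such a file under that assumption (the key names `prompt` and `response` are hypothetical; match them to the actual file and to what `finetune.py` expects):

```python
import json
from pathlib import Path

# Assumed dataset.json shape: [{"prompt": "...", "response": "..."}, ...]
# The real file in the repository may use different keys.

def load_dataset(path: str) -> list[dict]:
    """Load a small fine-tuning dataset and verify each record's keys."""
    records = json.loads(Path(path).read_text(encoding="utf-8"))
    for i, rec in enumerate(records):
        missing = {"prompt", "response"} - rec.keys()
        if missing:
            raise ValueError(f"record {i} is missing keys: {missing}")
    return records
```

Running a check like this before `python finetune.py` catches malformed records early, instead of partway through a training run.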