Tiny-GPT2 Text Generation Project Documentation
=============================================
This project enables students to run, fine-tune, and experiment with the `sshleifer/tiny-gpt2`
model locally on a laptop with 8GB or 16GB RAM, using CPU (GPU optional). The goal is to provide
hands-on experience with AI model workflows, including downloading, fine-tuning, and deploying a
text generation model via a web interface. This document covers all steps to set up and run the
project, with credits to the original model and organization.
---
1. Project Overview
The project uses the `sshleifer/tiny-gpt2` model, a lightweight version of GPT-2, for text generation.
It includes scripts to:
- Download model weights from Hugging Face.
- Test the model with a sample prompt.
- Fine-tune the model on a custom dataset.
- Deploy a web app to generate text interactively.
The setup is optimized for low-memory systems (8GB RAM) and defaults to CPU execution, but includes
instructions for GPU users.
---
2. Prerequisites
- Hardware: Laptop with at least 8GB RAM (16GB recommended). GPU (e.g., NVIDIA GTX) is optional;
scripts default to CPU.
- Operating System: Windows, macOS, or Linux.
- Software:
- Python 3.10.9 (recommended) or 3.9.10. Download from https://www.python.org/downloads/.
- Visual Studio Code (VS Code) for development (optional but recommended). Download from
https://code.visualstudio.com/.
- Hugging Face Account: Required to download model weights.
---
3. Step-by-Step Setup Instructions
3.1. Obtain a Hugging Face Token
1. Go to https://huggingface.co/ and sign up or log in.
2. Navigate to https://huggingface.co/settings/tokens.
3. Click "New token", select "Read" or "Write" access, and copy the token
(e.g., hf_XXXXXXXXXXXXXXXXXXXXXXXXXX).
4. Store the token securely; you’ll use it in the download script.
3.2. Install Python
1. Download Python 3.10.9 from https://www.python.org/downloads/release/python-3109/.
2. Install Python, ensuring "Add Python to PATH" is checked.
3. Verify installation in a terminal:
```
python --version
```
Expected output: Python 3.10.9
3.3. Set Up a Virtual Environment
1. Open a terminal in your project folder (e.g., C:\Users\YourName\Documents\project).
2. Create a virtual environment:
```
python -m venv venv
```
3. Activate the virtual environment:
- Windows: `venv\Scripts\activate`
- macOS/Linux: `source venv/bin/activate`
4. Confirm activation (you’ll see `(venv)` in the terminal prompt).
3.4. Install Dependencies
1. In the activated virtual environment, create a file named `requirements.txt` with the following
content:
```
torch==2.3.0
transformers==4.38.2
huggingface_hub==0.22.2
datasets==2.21.0
numpy==1.26.4
matplotlib==3.8.3
flask==3.0.3
```
2. Install the libraries:
```
pip install -r requirements.txt
```
3. For GPU users (optional):
- Uninstall CPU PyTorch: `pip uninstall torch -y`
- Install GPU PyTorch: `pip install torch==2.3.0+cu121 --index-url https://download.pytorch.org/whl/cu121`
- Verify CUDA: `python -c "import torch; print(torch.cuda.is_available())"` (should print `True`).
Note: Scripts default to CPU, so GPU users don’t need to change this unless desired.
3.5. Download Model Weights
1. Create a folder named `dalle` (or any name) for the project.
2. Copy the `download_model.py` script from the repository (or create it):
```
from transformers import AutoModelForCausalLM, AutoTokenizer
from huggingface_hub import login
import os
hf_token = "YOUR_HUGGINGFACE_TOKEN" # Replace with your token
login(token=hf_token)
model_name = "sshleifer/tiny-gpt2"
save_directory = "./tiny-gpt2-model"
os.makedirs(save_directory, exist_ok=True)
model = AutoModelForCausalLM.from_pretrained(model_name, cache_dir=save_directory)
tokenizer = AutoTokenizer.from_pretrained(model_name, cache_dir=save_directory)
print(f"Model and tokenizer downloaded to {save_directory}")
```
3. Replace `YOUR_HUGGINGFACE_TOKEN` with your Hugging Face token.
4. Run the script:
```
python download_model.py
```
5. Verify the model files in
`tiny-gpt2-model/models--sshleifer--tiny-gpt2/snapshots/5f91d94bd9cd7190a9f3216ff93cd1dd95f2c7be`
(contains `config.json`, `pytorch_model.bin`, `vocab.json`, `merges.txt`).
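If you want to confirm the download programmatically rather than by eye, a small standard-library
helper like the one below can walk the cache directory and report which of the expected files it
finds. This script is not part of the repository; the file name and function name are illustrative.
```
# verify_download.py -- optional helper (not part of the repository) that
# walks the download cache and reports which expected model files were found.
import os

EXPECTED = ("config.json", "pytorch_model.bin", "vocab.json", "merges.txt")

def find_model_files(root, expected=EXPECTED):
    """Return the sorted subset of `expected` found anywhere under `root`."""
    found = set()
    for _dirpath, _dirnames, filenames in os.walk(root):
        found.update(name for name in filenames if name in expected)
    return sorted(found)

if __name__ == "__main__":
    print(find_model_files("./tiny-gpt2-model"))
```
If the printed list is missing entries, re-run `download_model.py` and check your token.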
3.6. Test the Model
1. Copy the `test_model.py` script from the repository to the `dalle` folder.
2. Run the script:
```
python test_model.py
```
3. Expected output: generated text beginning with "Once upon a time". The continuation may be
incoherent or repetitive because of the model's very small size.
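The repository's `test_model.py` is short; if you prefer to write your own, a minimal version along
these lines works. The generation parameters (`max_length`, `top_k`, sampling) are illustrative
assumptions, not the repository's exact settings.
```
# test_model.py -- a minimal sketch; the repository's version may differ.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "sshleifer/tiny-gpt2"
cache_dir = "./tiny-gpt2-model"  # folder populated by download_model.py

tokenizer = AutoTokenizer.from_pretrained(model_name, cache_dir=cache_dir)
model = AutoModelForCausalLM.from_pretrained(model_name, cache_dir=cache_dir)

prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=50,  # illustrative; tune freely
    do_sample=True,
    top_k=50,
    pad_token_id=tokenizer.eos_token_id,
)
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(result)
```
Because `generate` includes the prompt tokens in its output, the decoded text always starts with
your prompt.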
3.7. Fine-Tune the Model
1. Create a `fine_tune` folder inside `dalle`:
```
mkdir fine_tune
cd fine_tune
```
2. Create a dataset file `sample_data.txt` (or use your own text). Example content:
```
Once upon a time, there was a brave knight who explored a magical forest.
The forest was filled with mystical creatures and ancient ruins.
The knight discovered a hidden treasure guarded by a wise dragon.
With courage and wisdom, the knight befriended the dragon and shared the treasure with the village.
```
3. Copy the `fine_tune_model.py` script from the repository to `fine_tune`.
4. Run the script:
```
python fine_tune_model.py
```
5. The script fine-tunes the model, saves it to `fine_tuned_model`, and generates a `loss_plot.png`
showing training loss.
6. Verify `fine_tuned_model` contains model files and check `loss_plot.png`.
3.8. Run the Web App
1. In the `fine_tune` folder, copy `app.py` and create a `templates` folder with `index.html` from the
repository.
2. Run the web app:
```
python app.py
```
3. Open a browser and go to `http://127.0.0.1:5000`.
4. Enter a prompt (e.g., "Once upon a time") and click "Generate Text" to see the output from the
fine-tuned model.
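As a rough guide to how the web app is wired, here is a minimal Flask sketch. The repository's
`app.py` presumably renders `templates/index.html` directly; the JSON `/generate` route, the
fallback to the base model when `fine_tuned_model` is absent, and the generation parameters are
all simplifications of my own.
```
# app.py -- a minimal sketch of the web app; the repository's version may
# differ in route names and template handling.
import os
os.environ.setdefault("CUDA_VISIBLE_DEVICES", "")  # CPU by default, as in the project

from flask import Flask, jsonify, render_template, request
from transformers import AutoModelForCausalLM, AutoTokenizer

# Prefer the fine-tuned weights; fall back to the base model if absent.
model_dir = ("fine_tuned_model" if os.path.isdir("fine_tuned_model")
             else "sshleifer/tiny-gpt2")
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir)

app = Flask(__name__)

@app.route("/")
def index():
    return render_template("index.html")

@app.route("/generate", methods=["POST"])
def generate():
    prompt = request.form.get("prompt", "Once upon a time")
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_length=60,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )
    return jsonify(text=tokenizer.decode(outputs[0], skip_special_tokens=True))

# Start the server with:  flask --app app run
```
With the server running, you can also exercise the route from a terminal:
`curl -X POST -d "prompt=Once upon a time" http://127.0.0.1:5000/generate`.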
---
4. Notes for Students
- Model Limitations: `tiny-gpt2` is a small model, so generated text may not be highly coherent. For
better results, consider larger models like `gpt2` (requires more memory or GPU).
- Memory Management: On 8GB RAM systems, close other applications to free memory. The scripts use a
small batch size to minimize memory usage.
- GPU Support: Scripts default to CPU for compatibility. To use an NVIDIA GPU:
- Install `torch==2.3.0+cu121` (see step 3.4).
- Remove `os.environ["CUDA_VISIBLE_DEVICES"] = ""` from `fine_tune_model.py` and `app.py`.
- Change `use_cpu=True` to `use_cpu=False` in `fine_tune_model.py`.
- Experimentation: Try different prompts, datasets, or fine-tuning parameters (e.g., `num_train_epochs`,
`learning_rate`) to explore AI model behavior.
---
5. Troubleshooting
- Library Conflicts: Use the exact versions in `requirements.txt` to avoid issues.
- File Not Found: Ensure model files are in the correct path
(`tiny-gpt2-model/models--sshleifer--tiny-gpt2/snapshots/5f91d94bd9cd7190a9f3216ff93cd1dd95f2c7be`).
- Memory Errors: Reduce `max_length` in `fine_tune_model.py` (e.g., from 128 to 64) for 8GB RAM systems.
- Hugging Face Token Issues: Verify your token has "Read" or "Write" access at
https://huggingface.co/settings/tokens.
---
6. Credits and Attribution
- Original Model: `sshleifer/tiny-gpt2`, a tiny GPT-2 variant created by Sam Shleifer.
Available at https://huggingface.co/sshleifer/tiny-gpt2.
- Organization: Hugging Face, Inc. (https://huggingface.co/) provides the model weights, `transformers`
library, and `huggingface_hub` for model access.
- Project Creator: Remiai3 (GitHub/Hugging Face username). This project was developed to facilitate AI
learning and experimentation for students.
- AI Assistance: Grok 3, created by xAI (https://x.ai/), assisted in generating and debugging the code,
ensuring compatibility for low-resource systems.
---
7. Next Steps for Students
- Experiment with different datasets in `sample_data.txt` to fine-tune the model for specific tasks
(e.g., storytelling, dialogue).
- Modify `fine_tune_model.py` parameters (e.g., `learning_rate`, `num_train_epochs`) to study their
impact.
- Enhance `index.html` or `app.py` to add features like multiple prompt inputs or generation options.
- Explore larger models on Hugging Face (e.g., `gpt2-medium`) if you have a GPU or more RAM.
For questions or issues, contact Remiai3 via Hugging Face or check the repository for updates.