Tiny-GPT2 Text Generation Project Documentation
===============================================

This project enables students to run, fine-tune, and experiment with the `sshleifer/tiny-gpt2` 
model locally on a laptop with 8GB or 16GB RAM, using CPU (GPU optional). The goal is to provide 
hands-on experience with AI model workflows, including downloading, fine-tuning, and deploying a 
text generation model via a web interface. This document covers all steps to set up and run the 
project, with credits to the original model and organization.

---

1. Project Overview
The project uses the `sshleifer/tiny-gpt2` model, a lightweight version of GPT-2, for text generation. 
It includes scripts to:
- Download model weights from Hugging Face.
- Test the model with a sample prompt.
- Fine-tune the model on a custom dataset.
- Deploy a web app to generate text interactively.
The setup is optimized for low-memory systems (8GB RAM) and defaults to CPU execution, but includes 
instructions for GPU users.

---

2. Prerequisites
- Hardware: Laptop with at least 8GB RAM (16GB recommended). GPU (e.g., NVIDIA GTX) is optional; 
  scripts default to CPU.
- Operating System: Windows, macOS, or Linux.
- Software:
  - Python 3.10.9 (recommended) or 3.9.10. Download from https://www.python.org/downloads/.
  - Visual Studio Code (VS Code) for development (optional but recommended). Download from 
    https://code.visualstudio.com/.
- Hugging Face Account: Required to download model weights.

---

3. Step-by-Step Setup Instructions

3.1. Obtain a Hugging Face Token
1. Go to https://huggingface.co/ and sign up or log in.
2. Navigate to https://huggingface.co/settings/tokens.
3. Click "New token", select "Read" or "Write" access, and copy the token 
   (e.g., hf_XXXXXXXXXXXXXXXXXXXXXXXXXX).
4. Store the token securely; you’ll use it in the download script.

3.2. Install Python
1. Download Python 3.10.9 from https://www.python.org/downloads/release/python-3109/.
2. Install Python, ensuring "Add Python to PATH" is checked.
3. Verify installation in a terminal:
   ```
   python --version
   ```
   Expected output: Python 3.10.9

3.3. Set Up a Virtual Environment
1. Open a terminal in your project folder (e.g., C:\Users\YourName\Documents\project).
2. Create a virtual environment:
   ```
   python -m venv venv
   ```
3. Activate the virtual environment:
   - Windows: `venv\Scripts\activate`
   - macOS/Linux: `source venv/bin/activate`
4. Confirm activation (you’ll see `(venv)` in the terminal prompt).

3.4. Install Dependencies
1. In the activated virtual environment, create a file named `requirements.txt` with the following 
   content:
   ```
   torch==2.3.0
   transformers==4.38.2
   huggingface_hub==0.22.2
   datasets==2.21.0
   numpy==1.26.4
   matplotlib==3.8.3
   flask==3.0.3
   ```
2. Install the libraries:
   ```
   pip install -r requirements.txt
   ```
3. For GPU users (optional):
   - Uninstall CPU PyTorch: `pip uninstall torch -y`
   - Install CUDA-enabled PyTorch: `pip install torch==2.3.0 --index-url https://download.pytorch.org/whl/cu121`
   - Verify CUDA: `python -c "import torch; print(torch.cuda.is_available())"` (should print `True`).
   Note: Scripts default to CPU, so GPU users don’t need to change this unless desired.

3.5. Download Model Weights
1. Create a folder named `dalle` (or any name) for the project.
2. Copy the `download_model.py` script from the repository (or create it):
   ```
   from transformers import AutoModelForCausalLM, AutoTokenizer
   from huggingface_hub import login
   import os

   hf_token = "YOUR_HUGGINGFACE_TOKEN"  # Replace with your token
   login(token=hf_token)

   model_name = "sshleifer/tiny-gpt2"
   save_directory = "./tiny-gpt2-model"
   os.makedirs(save_directory, exist_ok=True)

   model = AutoModelForCausalLM.from_pretrained(model_name, cache_dir=save_directory)
   tokenizer = AutoTokenizer.from_pretrained(model_name, cache_dir=save_directory)
   print(f"Model and tokenizer downloaded to {save_directory}")
   ```
3. Replace `YOUR_HUGGINGFACE_TOKEN` with your Hugging Face token.
4. Run the script:
   ```
   python download_model.py
   ```
5. Verify that the model files (`config.json`, `pytorch_model.bin`, `vocab.json`, `merges.txt`) are in
   `tiny-gpt2-model/models--sshleifer--tiny-gpt2/snapshots/5f91d94bd9cd7190a9f3216ff93cd1dd95f2c7be`.
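Rather than hunting for the snapshot path by hand, you can also verify the download by loading the model back from the cache directory. This is a minimal sketch (not part of the repository's scripts), assuming the `tiny-gpt2-model` cache directory created by `download_model.py`:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load from the local cache created by download_model.py; once the files
# are present, no network access is needed.
model = AutoModelForCausalLM.from_pretrained(
    "sshleifer/tiny-gpt2", cache_dir="./tiny-gpt2-model"
)
tokenizer = AutoTokenizer.from_pretrained(
    "sshleifer/tiny-gpt2", cache_dir="./tiny-gpt2-model"
)

# A quick sanity check: the model should report a nonzero parameter count.
num_params = sum(p.numel() for p in model.parameters())
print(f"Loaded tiny-gpt2 with {num_params:,} parameters")
```

If this runs without errors, the weights and tokenizer files are in place.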

3.6. Test the Model
1. Copy the `test_model.py` script from the repository to the `dalle` folder.
2. Run the script:
   ```
   python test_model.py
   ```
3. Expected output: generated text beginning with "Once upon a time". The continuation may be only 
   semi-coherent because of the model’s very small size.
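The repository's `test_model.py` is not reproduced here; if you prefer to write your own, a minimal version might look like the following. This is a sketch, not the repository's exact code, and the prompt and generation parameters are illustrative assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "sshleifer/tiny-gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()  # inference mode; no gradients needed

prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_length=40,
        do_sample=True,                        # sample instead of greedy decoding
        top_k=50,
        pad_token_id=tokenizer.eos_token_id,   # silences the padding warning
    )

text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(text)
```

The decoded output always starts with the prompt itself, followed by the model's continuation.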

3.7. Fine-Tune the Model
1. Create a `fine_tune` folder inside `dalle`:
   ```
   mkdir fine_tune
   cd fine_tune
   ```
2. Create a dataset file `sample_data.txt` (or use your own text). Example content:
   ```
   Once upon a time, there was a brave knight who explored a magical forest.
   The forest was filled with mystical creatures and ancient ruins.
   The knight discovered a hidden treasure guarded by a wise dragon.
   With courage and wisdom, the knight befriended the dragon and shared the treasure with the village.
   ```
3. Copy the `fine_tune_model.py` script from the repository to `fine_tune`.
4. Run the script:
   ```
   python fine_tune_model.py
   ```
5. The script fine-tunes the model, saves it to `fine_tuned_model`, and generates a `loss_plot.png` 
   showing training loss.
6. Verify `fine_tuned_model` contains model files and check `loss_plot.png`.
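The repository's `fine_tune_model.py` handles dataset loading, training, saving, and plotting; the core training step can be sketched roughly as follows. Assumptions in this sketch: a plain AdamW loop standing in for whatever training setup the script actually uses, and an in-memory list standing in for `sample_data.txt`:

```python
import torch
from torch.optim import AdamW
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "sshleifer/tiny-gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)
model.train()

# Stand-in for sample_data.txt: one training example per line.
lines = [
    "Once upon a time, there was a brave knight.",
    "The forest was filled with mystical creatures.",
]

optimizer = AdamW(model.parameters(), lr=5e-5)
losses = []

for epoch in range(2):                       # num_train_epochs
    for line in lines:                       # batch size 1 to keep memory low
        batch = tokenizer(line, return_tensors="pt",
                          truncation=True, max_length=64)
        # For causal LM training, the labels are the input ids themselves;
        # transformers shifts them internally to compute next-token loss.
        outputs = model(**batch, labels=batch["input_ids"])
        outputs.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        losses.append(outputs.loss.item())

model.save_pretrained("fine_tuned_model")
tokenizer.save_pretrained("fine_tuned_model")
print(f"Recorded {len(losses)} loss values; final loss {losses[-1]:.3f}")
```

The `losses` list is what a plot like `loss_plot.png` would be drawn from (e.g., with `matplotlib.pyplot.plot(losses)`).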

3.8. Run the Web App
1. In the `fine_tune` folder, copy `app.py` and create a `templates` folder with `index.html` from the 
   repository.
2. Run the web app:
   ```
   python app.py
   ```
3. Open a browser and go to `http://127.0.0.1:5000`.
4. Enter a prompt (e.g., "Once upon a time") and click "Generate Text" to see the output from the 
   fine-tuned model.
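The repository's `app.py` renders `templates/index.html`; as a self-contained illustration of the same idea, here is a minimal single-file sketch using an inline template (the route layout and fallback behavior are assumptions, not the repository's exact code):

```python
from flask import Flask, render_template_string, request
from transformers import AutoModelForCausalLM, AutoTokenizer

app = Flask(__name__)

# Prefer the fine-tuned weights; fall back to the base model if absent.
try:
    tokenizer = AutoTokenizer.from_pretrained("fine_tuned_model")
    model = AutoModelForCausalLM.from_pretrained("fine_tuned_model")
except OSError:
    tokenizer = AutoTokenizer.from_pretrained("sshleifer/tiny-gpt2")
    model = AutoModelForCausalLM.from_pretrained("sshleifer/tiny-gpt2")

PAGE = """
<form method="post">
  <input name="prompt" value="{{ prompt }}">
  <button type="submit">Generate Text</button>
</form>
<p>{{ output }}</p>
"""

@app.route("/", methods=["GET", "POST"])
def index():
    prompt = request.form.get("prompt", "Once upon a time")
    output = ""
    if request.method == "POST":
        inputs = tokenizer(prompt, return_tensors="pt")
        ids = model.generate(**inputs, max_length=40,
                             pad_token_id=tokenizer.eos_token_id)
        output = tokenizer.decode(ids[0], skip_special_tokens=True)
    return render_template_string(PAGE, prompt=prompt, output=output)

# To serve locally: app.run(debug=True)  ->  http://127.0.0.1:5000
```

The repository version keeps the HTML in `templates/index.html` and renders it with `render_template`, which is the better structure once the page grows beyond a single form.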

---

4. Notes for Students
- Model Limitations: `tiny-gpt2` is a small model, so generated text may not be highly coherent. For 
  better results, consider larger models like `gpt2` (requires more memory or GPU).
- Memory Management: On 8GB RAM systems, close other applications to free memory. The scripts use a 
  small batch size to minimize memory usage.
- GPU Support: Scripts default to CPU for compatibility. To use an NVIDIA GPU:
  - Install `torch==2.3.0+cu121` (see step 3.4).
  - Remove `os.environ["CUDA_VISIBLE_DEVICES"] = ""` from `fine_tune_model.py` and `app.py`.
  - Change `use_cpu=True` to `use_cpu=False` in `fine_tune_model.py`.
- Experimentation: Try different prompts, datasets, or fine-tuning parameters (e.g., `num_train_epochs`,
  `learning_rate`) to explore AI model behavior.
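The CPU/GPU toggle described above boils down to choosing a torch device once and moving the model and inputs to it. A minimal pattern (not the scripts' exact code):

```python
import torch

# Pick the GPU when CUDA is available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")

# A model and its inputs then follow the same pattern:
#   model.to(device)
#   inputs = {k: v.to(device) for k, v in batch.items()}
```

Setting `CUDA_VISIBLE_DEVICES` to an empty string, as the scripts do, forces `torch.cuda.is_available()` to return `False`, which is why removing that line is part of enabling the GPU.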

---

5. Troubleshooting
- Library Conflicts: Use the exact versions in `requirements.txt` to avoid issues.
- File Not Found: Ensure model files are in the correct path 
  (`tiny-gpt2-model/models--sshleifer--tiny-gpt2/snapshots/5f91d94bd9cd7190a9f3216ff93cd1dd95f2c7be`).
- Memory Errors: Reduce `max_length` in `fine_tune_model.py` (e.g., from 128 to 64) for 8GB RAM systems.
- Hugging Face Token Issues: Verify your token has "Read" or "Write" access at
  https://huggingface.co/settings/tokens.

---

6. Credits and Attribution
- Original Model: `sshleifer/tiny-gpt2`, a tiny GPT-2 variant created by Sam Shleifer for fast testing 
  and experimentation. Available at https://huggingface.co/sshleifer/tiny-gpt2.
- Organization: Hugging Face, Inc. (https://huggingface.co/) provides the model weights, `transformers`
  library, and `huggingface_hub` for model access.
- Project Creator: Remiai3 (GitHub/Hugging Face username). This project was developed to facilitate AI 
  learning and experimentation for students.
- AI Assistance: Grok 3, created by xAI (https://x.ai/), assisted in generating and debugging the code,
  ensuring compatibility for low-resource systems.

---

7. Next Steps for Students
- Experiment with different datasets in `sample_data.txt` to fine-tune the model for specific tasks 
  (e.g., storytelling, dialogue).
- Modify `fine_tune_model.py` parameters (e.g., `learning_rate`, `num_train_epochs`) to study their 
  impact.
- Enhance `index.html` or `app.py` to add features like multiple prompt inputs or generation options.
- Explore larger models on Hugging Face (e.g., `gpt2-medium`) if you have a GPU or more RAM.

For questions or issues, contact Remiai3 via Hugging Face or check the repository for updates.