---
title: FirstLLM
emoji: πŸ’»
colorFrom: red
colorTo: green
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: mit
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Sentence Completion with GPT

A Gradio web application for sentence completion using a custom GPT model architecture. This app can use either a trained model checkpoint or pretrained GPT-2 weights.

## Features

- **Sentence Completion**: Generate text completions for any given prompt
- **Customizable Generation**: Control generation parameters (temperature, top-k, max tokens)
- **Model Flexibility**: Supports both saved trained models and pretrained GPT-2
- **Easy Deployment**: Ready for deployment on Hugging Face Spaces

## Model Architecture

This app uses a custom GPT implementation based on the GPT-2 architecture:
- **Parameters**: ~124M (GPT-2 base model)
- **Vocab Size**: 50,257 tokens
- **Block Size**: 1024 tokens (maximum sequence length)
- **Architecture**: 12 layers, 12 attention heads, 768-dimensional embeddings
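
The numbers above can be sketched as a config object. This is an illustrative mirror of the fields the checkpoint stores (`block_size`, `vocab_size`, `n_layer`, `n_head`, `n_embd`); the actual class lives in `model.py`, and the parameter count below is a back-of-the-envelope estimate, not an exact tally:

```python
from dataclasses import dataclass

@dataclass
class GPTConfig:
    block_size: int = 1024   # maximum sequence length
    vocab_size: int = 50257  # BPE vocabulary size
    n_layer: int = 12        # transformer blocks
    n_head: int = 12         # attention heads per block
    n_embd: int = 768        # embedding dimension

cfg = GPTConfig()
# Rough parameter count: token + position embeddings, plus the attention
# and MLP weight matrices of each layer (~12 * n_embd^2 per layer).
embed_params = cfg.vocab_size * cfg.n_embd + cfg.block_size * cfg.n_embd
per_layer = 12 * cfg.n_embd ** 2
total = embed_params + cfg.n_layer * per_layer
print(f"~{total / 1e6:.0f}M parameters")  # ~124M parameters
```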

## Environment Setup

### Prerequisites

- Python 3.8 or higher
- pip (Python package manager)
- (Optional) CUDA-enabled GPU for faster inference

### Step 1: Clone or Download the Repository

```bash
git clone <repository-url>
cd first_llm_124
```

Or download and extract the project files to a directory.

### Step 2: Create a Virtual Environment (Recommended)

Using a virtual environment helps avoid conflicts with other projects:

**On Windows:**
```bash
python -m venv venv
venv\Scripts\activate
```

**On macOS/Linux:**
```bash
python3 -m venv venv
source venv/bin/activate
```

### Step 3: Install Dependencies

Install all required packages from the requirements file:

```bash
pip install -r requirements.txt
```

Or install packages individually:
```bash
pip install "gradio>=4.0.0"
pip install "torch>=2.0.0"
pip install "transformers>=4.30.0"
pip install "tiktoken>=0.5.0"
pip install "huggingface_hub>=0.34.0"
```

(The quotes keep the shell from interpreting `>=` as output redirection.)

### Step 4: Verify Installation

Verify that all packages are installed correctly:

```bash
python -c "import torch; import gradio; import transformers; import tiktoken; print('All packages installed successfully!')"
```

### Step 5: Prepare Model Directory (Optional)

If you have a trained model, create a `model` directory and place your checkpoint there:

```bash
mkdir model
# Place your model.pth file in the model/ directory
```

## Installation

1. Follow the [Environment Setup](#environment-setup) steps above
2. Ensure all dependencies are installed
3. (Optional) Place your trained model checkpoint in the `model/` directory

## Usage

### Running Locally

```bash
python app.py
```

The app will start a local server. Open the provided URL in your browser.

### Model Loading

The app automatically tries to load models in this order:
1. Saved checkpoint file (checks for: `./model/model.pth`, `model.pt`, `checkpoint.pth`, `checkpoint.pt`, `gpt_model.pth`)
2. Pretrained GPT-2 from Hugging Face (fallback)
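
The fallback order can be sketched as a simple path scan. File names follow the README's list (assuming the alternates also live under `model/`); the function name is illustrative, and the real logic lives in `inference.py`:

```python
import os

# Candidate checkpoint paths, tried in order.
CHECKPOINT_CANDIDATES = [
    "./model/model.pth",
    "./model/model.pt",
    "./model/checkpoint.pth",
    "./model/checkpoint.pt",
    "./model/gpt_model.pth",
]

def pick_checkpoint(candidates=CHECKPOINT_CANDIDATES):
    """Return the first checkpoint path that exists, or None.

    None signals the caller to fall back to pretrained GPT-2
    from Hugging Face instead.
    """
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None
```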

### Saving a Trained Model

If you have a trained model, you can save it using:

```python
import torch
import os

# Create model directory if it doesn't exist
os.makedirs('model', exist_ok=True)

# After training your model, save the checkpoint
checkpoint = {
    'model_state_dict': model.state_dict(),
    'config': {
        'block_size': model.config.block_size,
        'vocab_size': model.config.vocab_size,
        'n_layer': model.config.n_layer,
        'n_head': model.config.n_head,
        'n_embd': model.config.n_embd,
    }
}
torch.save(checkpoint, './model/model.pth')
print("Model saved successfully to ./model/model.pth!")
```

### Loading a Saved Model

Place your saved model checkpoint (`.pth` or `.pt` file) in the `model/` directory. The app will automatically detect and load it from `./model/model.pth`.

## Parameters

- **Max Tokens**: Maximum number of tokens to generate (10-200)
- **Top-K**: Sample from the top K most likely tokens (1-100). Lower values make the output more focused.
- **Temperature**: Controls the randomness of the output (0.1-2.0). Lower values make the output more deterministic, higher values more creative.
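
To make the two sampling knobs concrete, here is a minimal, torch-free sketch of top-k sampling with temperature over raw logits. The app implements the same idea with tensors in `inference.py`; the function name here is illustrative:

```python
import math
import random

def sample_top_k(logits, k=50, temperature=1.0, rng=random):
    """Sample one token id: keep the k highest logits, soften by temperature."""
    # Keep only the k most likely tokens (lower k -> more focused output).
    top = sorted(enumerate(logits), key=lambda pair: pair[1], reverse=True)[:k]
    # Divide by temperature: <1 sharpens the distribution, >1 flattens it.
    scaled = [score / temperature for _, score in top]
    # Numerically stable softmax over the surviving logits.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    token_ids = [token_id for token_id, _ in top]
    return rng.choices(token_ids, weights=weights, k=1)[0]
```

With `k=1` this degenerates to greedy decoding; raising `temperature` spreads probability mass across the surviving tokens.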

## Project Structure

```
.
β”œβ”€β”€ app.py              # Gradio interface (main entry point)
β”œβ”€β”€ model.py            # GPT model architecture
β”œβ”€β”€ inference.py        # Model loading and text generation utilities
β”œβ”€β”€ requirements.txt    # Python dependencies
β”œβ”€β”€ README.md          # This file
β”œβ”€β”€ llm_trainer.ipynb  # Jupyter notebook for training
β”œβ”€β”€ input.txt          # Training data (optional)
β”œβ”€β”€ model/             # (Optional) Directory for saved model checkpoints
β”‚   └── model.pth      # Saved model checkpoint
└── venv/              # Virtual environment (created during setup)
```

## Deployment to Hugging Face Spaces

1. Create a new Space on [Hugging Face Spaces](https://huggingface.co/spaces)
2. Upload all files from this project (except `venv/` and `__pycache__/`)
3. Set the Space SDK to **Gradio**
4. Add your model checkpoint file in the `model/` directory (if using a trained model)
5. The Space will automatically install dependencies and launch the app

### For Hugging Face Spaces

The app will automatically:
- Use CPU or GPU if available
- Load pretrained GPT-2 if no checkpoint is found
- Handle model loading errors gracefully

## Model Training

To train your own model, use the `llm_trainer.ipynb` notebook. After training, save the checkpoint in the format shown in [Saving a Trained Model](#saving-a-trained-model) above, then place the resulting `model.pth` in the `model/` directory for automatic loading.

## Troubleshooting

### Common Issues

1. **Import Errors**: 
   - Ensure all dependencies are installed: `pip install -r requirements.txt`
   - Make sure your virtual environment is activated

2. **Model Not Found**: 
   - Check that the model checkpoint is in the correct directory: `./model/model.pth`
   - Verify the file exists: `ls model/model.pth` (Linux/Mac) or `dir model\model.pth` (Windows)

3. **CUDA Out of Memory**: 
   - The app will automatically fall back to CPU if GPU memory is insufficient
   - Reduce the **Max Tokens** value in the interface

4. **Module Not Found**: 
   - Reinstall dependencies: `pip install -r requirements.txt --upgrade`
   - Check Python version: `python --version` (should be 3.8+)

5. **Port Already in Use**: 
   - Change the port in `app.py`: `demo.launch(server_port=7861)`
   - Or stop the process using the port

## License

This project uses the GPT-2 architecture and can load pretrained GPT-2 weights from Hugging Face, which are subject to OpenAI's GPT-2 license.

## Notes

- The model uses tiktoken's `gpt2` encoding
- Generation uses top-k sampling with temperature control
- Maximum sequence length is 1024 tokens