---
title: FirstLLM
emoji: 💻
colorFrom: red
colorTo: green
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: mit
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Sentence Completion with GPT

A Gradio web application for sentence completion using a custom GPT model architecture. This app can use either a trained model checkpoint or pretrained GPT-2 weights.

## Features

- **Sentence Completion**: Generate text completions for any given prompt
- **Customizable Generation**: Control generation parameters (temperature, top-k, max tokens)
- **Model Flexibility**: Supports both saved trained models and pretrained GPT-2
- **Easy Deployment**: Ready for deployment on Hugging Face Spaces

## Model Architecture

This app uses a custom GPT implementation based on the GPT-2 architecture:

- **Parameters**: ~124M (for the gpt2 base model)
- **Vocab Size**: 50,257 tokens
- **Block Size**: 1024 tokens (max sequence length)
- **Architecture**: 12 layers, 12 attention heads, 768-dimensional embeddings

## Environment Setup

### Prerequisites

- Python 3.8 or higher
- pip (Python package manager)
- (Optional) CUDA-enabled GPU for faster inference

### Step 1: Clone or Download the Repository

```bash
git clone <repository-url>
cd first_llm_124
```

Or download and extract the project files to a directory.
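For reference, the architecture numbers listed above can be expressed as a small configuration object. This is only an illustrative sketch: the `GPTConfig` name, its field names, and the `approx_param_count` helper are assumptions for this README, not necessarily the names used in `model.py`.

```python
from dataclasses import dataclass

@dataclass
class GPTConfig:
    # Values match the gpt2 base model described above.
    block_size: int = 1024   # max sequence length
    vocab_size: int = 50257  # BPE vocabulary size
    n_layer: int = 12        # transformer blocks
    n_head: int = 12         # attention heads per block
    n_embd: int = 768        # embedding dimension

def approx_param_count(cfg: GPTConfig) -> int:
    """Rough parameter count: token + position embeddings, plus
    roughly 12 * n_embd^2 weights per transformer block (attention + MLP)."""
    embeddings = (cfg.vocab_size + cfg.block_size) * cfg.n_embd
    blocks = 12 * cfg.n_layer * cfg.n_embd ** 2
    return embeddings + blocks

print(approx_param_count(GPTConfig()))  # 124318464, i.e. ~124M as stated above
```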
### Step 2: Create a Virtual Environment (Recommended)

Using a virtual environment helps avoid conflicts with other projects:

**On Windows:**

```bash
python -m venv venv
venv\Scripts\activate
```

**On macOS/Linux:**

```bash
python3 -m venv venv
source venv/bin/activate
```

### Step 3: Install Dependencies

Install all required packages from the requirements file:

```bash
pip install -r requirements.txt
```

Or install packages individually (quote the version specifiers so the shell does not interpret `>=` as redirection):

```bash
pip install "gradio>=4.0.0"
pip install "torch>=2.0.0"
pip install "transformers>=4.30.0"
pip install "tiktoken>=0.5.0"
pip install "huggingface_hub>=0.34.0"
```

### Step 4: Verify Installation

Verify that all packages are installed correctly:

```bash
python -c "import torch; import gradio; import transformers; import tiktoken; print('All packages installed successfully!')"
```

### Step 5: Prepare Model Directory (Optional)

If you have a trained model, create a `model` directory and place your checkpoint there:

```bash
mkdir model
# Place your model.pth file in the model/ directory
```

## Installation

1. Follow the [Environment Setup](#environment-setup) steps above
2. Ensure all dependencies are installed
3. (Optional) Place your trained model checkpoint in the `model/` directory

## Usage

### Running Locally

```bash
python app.py
```

The app will start a local server. Open the provided URL in your browser.

### Model Loading

The app automatically tries to load models in this order:

1. Saved checkpoint file (checks for: `./model/model.pth`, `model.pt`, `checkpoint.pth`, `checkpoint.pt`, `gpt_model.pth`)
2.
Pretrained GPT-2 from Hugging Face (fallback)

### Saving a Trained Model

If you have a trained model, you can save it using:

```python
import torch
import os

# Create model directory if it doesn't exist
os.makedirs('model', exist_ok=True)

# After training your model, save the checkpoint
checkpoint = {
    'model_state_dict': model.state_dict(),
    'config': {
        'block_size': model.config.block_size,
        'vocab_size': model.config.vocab_size,
        'n_layer': model.config.n_layer,
        'n_head': model.config.n_head,
        'n_embd': model.config.n_embd,
    }
}
torch.save(checkpoint, './model/model.pth')
print("Model saved successfully to ./model/model.pth!")
```

### Loading a Saved Model

Place your saved model checkpoint (`.pth` or `.pt` file) in the `model/` directory. The app will automatically detect and load it from `./model/model.pth`.

## Parameters

- **Max Tokens**: Maximum number of tokens to generate (10-200)
- **Top-K**: Sample from the top K most likely tokens (1-100). Lower values make the output more focused.
- **Temperature**: Controls the randomness of the output (0.1-2.0). Lower values make the output more deterministic, higher values more creative.

## Project Structure

```
.
├── app.py              # Gradio interface (main entry point)
├── model.py            # GPT model architecture
├── inference.py        # Model loading and text generation utilities
├── requirements.txt    # Python dependencies
├── README.md           # This file
├── llm_trainer.ipynb   # Jupyter notebook for training
├── input.txt           # Training data (optional)
├── model/              # (Optional) Directory for saved model checkpoints
│   └── model.pth       # Saved model checkpoint
└── venv/               # Virtual environment (created during setup)
```

## Deployment to Hugging Face Spaces

1. Create a new Space on [Hugging Face Spaces](https://huggingface.co/spaces)
2. Upload all files from this project (except `venv/` and `__pycache__/`)
3. Set the Space SDK to **Gradio**
4. Add your model checkpoint file in the `model/` directory (if using a trained model)
5.
The Space will automatically install dependencies and launch the app

### For Hugging Face Spaces

The app will automatically:

- Use a GPU if one is available, otherwise run on CPU
- Load pretrained GPT-2 if no checkpoint is found
- Handle model loading errors gracefully

## Model Training

To train your own model, use the `llm_trainer.ipynb` notebook. After training, save the model:

```python
import torch
import os

# Create model directory if it doesn't exist
os.makedirs('model', exist_ok=True)

# Save model checkpoint
checkpoint = {
    'model_state_dict': model.state_dict(),
    'config': {
        'block_size': model.config.block_size,
        'vocab_size': model.config.vocab_size,
        'n_layer': model.config.n_layer,
        'n_head': model.config.n_head,
        'n_embd': model.config.n_embd,
    }
}
torch.save(checkpoint, './model/model.pth')
print("Model saved successfully!")
```

Then place `model.pth` in the `model/` directory for automatic loading.

## Troubleshooting

### Common Issues

1. **Import Errors**:
   - Ensure all dependencies are installed: `pip install -r requirements.txt`
   - Make sure your virtual environment is activated

2. **Model Not Found**:
   - Check that the model checkpoint is in the correct directory: `./model/model.pth`
   - Verify the file exists: `ls model/model.pth` (Linux/macOS) or `dir model\model.pth` (Windows)

3. **CUDA Out of Memory**:
   - The app will automatically fall back to CPU if GPU memory is insufficient
   - Reduce the Max Tokens parameter in the interface

4. **Module Not Found**:
   - Reinstall dependencies: `pip install -r requirements.txt --upgrade`
   - Check your Python version: `python --version` (should be 3.8+)

5. **Port Already in Use**:
   - Change the port in `app.py`: `demo.launch(server_port=7861)`
   - Or stop the process currently using the port

## License

This project uses the GPT-2 architecture and can load pretrained GPT-2 weights from Hugging Face, which are subject to OpenAI's GPT-2 license.
## Notes

- The model uses tiktoken's `gpt2` encoding
- Generation uses top-k sampling with temperature control
- Maximum sequence length is 1024 tokens
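The top-k sampling with temperature mentioned above works by keeping only the k highest-scoring logits, dividing them by the temperature, and sampling from the resulting softmax. A minimal pure-Python sketch of the idea (the actual app samples with PyTorch; the function and argument names here are illustrative):

```python
import math
import random

def sample_top_k(logits, k=50, temperature=1.0, rng=random):
    """Sample a token index from the k most likely logits,
    after temperature scaling (lower temperature = more deterministic)."""
    # Keep the k highest (index, logit) pairs.
    top = sorted(enumerate(logits), key=lambda p: p[1], reverse=True)[:k]
    # Temperature-scaled, numerically stable softmax over the survivors.
    m = max(l / temperature for _, l in top)
    weights = [(i, math.exp(l / temperature - m)) for i, l in top]
    total = sum(w for _, w in weights)
    # Draw one index from the resulting categorical distribution.
    r = rng.random() * total
    for i, w in weights:
        r -= w
        if r <= 0:
            return i
    return weights[-1][0]  # guard against floating-point drift
```

With `k=1` this reduces to greedy decoding, and very low temperatures concentrate nearly all probability on the top token, matching the Parameters section's description of "more focused" and "more deterministic" output.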