---
title: FirstLLM
emoji: π»
colorFrom: red
colorTo: green
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: mit
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# Sentence Completion with GPT

A Gradio web application for sentence completion using a custom GPT model architecture. The app can use either a trained model checkpoint or pretrained GPT-2 weights.
## Features

- **Sentence Completion**: Generate text completions for any given prompt
- **Customizable Generation**: Control generation parameters (temperature, top-k, max tokens)
- **Model Flexibility**: Supports both saved trained models and pretrained GPT-2
- **Easy Deployment**: Ready for deployment on Hugging Face Spaces
## Model Architecture

This app uses a custom GPT implementation based on the GPT-2 architecture:

- **Parameters**: ~124M (for the `gpt2` base model)
- **Vocab Size**: 50,257 tokens
- **Block Size**: 1024 tokens (max sequence length)
- **Architecture**: 12 layers, 12 attention heads, 768 embedding dimension
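As a sanity check, the ~124M figure can be reproduced from the hyperparameters above, assuming the standard GPT-2 layout (tied input/output embeddings, learned positional embeddings, biased linear layers):

```python
# Back-of-the-envelope parameter count for the GPT-2 "base" configuration.
vocab_size, block_size = 50_257, 1_024
n_layer, n_head, n_embd = 12, 12, 768

tok_emb = vocab_size * n_embd           # token embeddings (tied with the output head)
pos_emb = block_size * n_embd           # learned positional embeddings
per_layer = (
    n_embd * 3 * n_embd + 3 * n_embd    # attention QKV projection (+ bias)
    + n_embd * n_embd + n_embd          # attention output projection (+ bias)
    + n_embd * 4 * n_embd + 4 * n_embd  # MLP up-projection (+ bias)
    + 4 * n_embd * n_embd + n_embd      # MLP down-projection (+ bias)
    + 4 * n_embd                        # two LayerNorms (weight + bias each)
)
final_ln = 2 * n_embd                   # final LayerNorm before the output head
total = tok_emb + pos_emb + n_layer * per_layer + final_ln
print(f"{total / 1e6:.1f}M parameters")  # ≈ 124.4M
```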
## Environment Setup

### Prerequisites

- Python 3.8 or higher
- pip (Python package manager)
- (Optional) CUDA-enabled GPU for faster inference
### Step 1: Clone or Download the Repository

```bash
git clone <repository-url>
cd first_llm_124
```

Or download and extract the project files to a directory.
### Step 2: Create a Virtual Environment (Recommended)

Using a virtual environment helps avoid conflicts with other projects.

On Windows:

```bash
python -m venv venv
venv\Scripts\activate
```

On macOS/Linux:

```bash
python3 -m venv venv
source venv/bin/activate
```
### Step 3: Install Dependencies

Install all required packages from the requirements file:

```bash
pip install -r requirements.txt
```

Or install packages individually (quote the specifiers so the shell does not treat `>=` as a redirection):

```bash
pip install "gradio>=4.0.0"
pip install "torch>=2.0.0"
pip install "transformers>=4.30.0"
pip install "tiktoken>=0.5.0"
pip install "huggingface_hub>=0.34.0"
```
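The individual installs above correspond to a `requirements.txt` along these lines:

```text
gradio>=4.0.0
torch>=2.0.0
transformers>=4.30.0
tiktoken>=0.5.0
huggingface_hub>=0.34.0
```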
### Step 4: Verify Installation

Verify that all packages are installed correctly:

```bash
python -c "import torch; import gradio; import transformers; import tiktoken; print('All packages installed successfully!')"
```
### Step 5: Prepare the Model Directory (Optional)

If you have a trained model, create a `model/` directory and place your checkpoint there:

```bash
mkdir model
# Place your model.pth file in the model/ directory
```
## Installation

- Follow the Environment Setup steps above
- Ensure all dependencies are installed
- (Optional) Place your trained model checkpoint in the `model/` directory
## Usage

### Running Locally

```bash
python app.py
```

The app will start a local server. Open the provided URL in your browser.
### Model Loading

The app automatically tries to load models in this order:

1. A saved checkpoint file, checking for `./model/model.pth`, `model.pt`, `checkpoint.pth`, `checkpoint.pt`, and `gpt_model.pth`
2. Pretrained GPT-2 from Hugging Face (fallback)
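A minimal sketch of this discovery order (the helper name `find_checkpoint` is an assumption for illustration; the filenames are the ones listed above):

```python
import os

# Candidate checkpoint filenames, tried in order inside the model directory.
CANDIDATES = ["model.pth", "model.pt", "checkpoint.pth",
              "checkpoint.pt", "gpt_model.pth"]

def find_checkpoint(model_dir="model"):
    """Return the first existing checkpoint path, or None."""
    for name in CANDIDATES:
        path = os.path.join(model_dir, name)
        if os.path.isfile(path):
            return path
    return None  # caller falls back to pretrained GPT-2
```

When no candidate exists, the function returns `None` and the app falls back to the pretrained GPT-2 weights.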
### Saving a Trained Model

If you have a trained model, you can save it using:

```python
import torch
import os

# Create the model directory if it doesn't exist
os.makedirs('model', exist_ok=True)

# After training your model, save the checkpoint
checkpoint = {
    'model_state_dict': model.state_dict(),
    'config': {
        'block_size': model.config.block_size,
        'vocab_size': model.config.vocab_size,
        'n_layer': model.config.n_layer,
        'n_head': model.config.n_head,
        'n_embd': model.config.n_embd,
    }
}
torch.save(checkpoint, './model/model.pth')
print("Model saved successfully to ./model/model.pth!")
```
### Loading a Saved Model

Place your saved model checkpoint (`.pth` or `.pt` file) in the `model/` directory. The app will automatically detect and load it from `./model/model.pth`.
## Parameters

- **Max Tokens**: Maximum number of tokens to generate (10-200)
- **Top-K**: Sample from the top K most likely tokens (1-100). Lower values make the output more focused.
- **Temperature**: Controls the randomness of the output (0.1-2.0). Lower values make the output more deterministic, higher values more creative.
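To illustrate how these two knobs interact, here is a dependency-free sketch of top-k sampling with temperature (the app itself does this with torch tensors; the logit values below are made up):

```python
import math
import random

def sample_top_k(logits, top_k=50, temperature=1.0):
    """Pick a token id from `logits` via top-k sampling with temperature."""
    # Keep only the k highest-scoring token ids.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Temperature scaling: <1.0 sharpens the distribution, >1.0 flattens it.
    scaled = [logits[i] / temperature for i in top]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    return random.choices(top, weights=weights, k=1)[0]

logits = [2.0, 0.5, 1.5, -1.0]        # fake scores for a 4-token vocabulary
print(sample_top_k(logits, top_k=1))  # always 0: top_k=1 is greedy decoding
```

With `top_k=1` the result is deterministic; with a low temperature the highest-scoring token still wins almost every draw, and raising the temperature spreads probability across the kept tokens.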
Project Structure
.
βββ app.py # Gradio interface (main entry point)
βββ model.py # GPT model architecture
βββ inference.py # Model loading and text generation utilities
βββ requirements.txt # Python dependencies
βββ README.md # This file
βββ llm_trainer.ipynb # Jupyter notebook for training
βββ input.txt # Training data (optional)
βββ model/ # (Optional) Directory for saved model checkpoints
β βββ model.pth # Saved model checkpoint
βββ venv/ # Virtual environment (created during setup)
## Deployment to Hugging Face Spaces

- Create a new Space on Hugging Face Spaces
- Upload all files from this project (except `venv/` and `__pycache__/`)
- Set the Space SDK to Gradio
- Add your model checkpoint file in the `model/` directory (if using a trained model)
- The Space will automatically install dependencies and launch the app
### For Hugging Face Spaces

The app will automatically:

- Use the GPU if one is available, otherwise the CPU
- Load pretrained GPT-2 if no checkpoint is found
- Handle model loading errors gracefully
## Model Training

To train your own model, use the `llm_trainer.ipynb` notebook. After training, save the checkpoint using the snippet shown in "Saving a Trained Model" above, then place `model.pth` in the `model/` directory for automatic loading.
## Troubleshooting

### Common Issues

**Import Errors:**
- Ensure all dependencies are installed: `pip install -r requirements.txt`
- Make sure your virtual environment is activated

**Model Not Found:**
- Check that the model checkpoint is in the correct directory: `./model/model.pth`
- Verify the file exists: `ls model/model.pth` (Linux/macOS) or `dir model\model.pth` (Windows)

**CUDA Out of Memory:**
- The app will automatically fall back to CPU if GPU memory is insufficient
- Reduce the Max Tokens parameter in the interface

**Module Not Found:**
- Reinstall dependencies: `pip install -r requirements.txt --upgrade`
- Check your Python version: `python --version` (should be 3.8+)

**Port Already in Use:**
- Change the port in `app.py`: `demo.launch(server_port=7861)`
- Or stop the process using the port
## License

This project uses the GPT-2 architecture and can load pretrained GPT-2 weights from Hugging Face, which are subject to OpenAI's GPT-2 license.
## Notes

- The model uses tiktoken's `gpt2` encoding
- Generation uses top-k sampling with temperature control
- Maximum sequence length is 1024 tokens