---
title: FirstLLM
emoji: 💻
colorFrom: red
colorTo: green
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: mit
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Sentence Completion with GPT

A Gradio web application for sentence completion using a custom GPT model architecture. This app can use either a trained model checkpoint or pretrained GPT-2 weights.

## Features

- **Sentence Completion**: Generate text completions for any given prompt
- **Customizable Generation**: Control generation parameters (temperature, top-k, max tokens)
- **Model Flexibility**: Supports both saved trained models and pretrained GPT-2
- **Easy Deployment**: Ready for deployment on Hugging Face Spaces

## Model Architecture

This app uses a custom GPT implementation based on the GPT-2 architecture:

- **Parameters**: ~124M (for the gpt2 base model)
- **Vocab Size**: 50,257 tokens
- **Block Size**: 1024 tokens (max sequence length)
- **Architecture**: 12 layers, 12 attention heads, 768-dimensional embeddings

## Environment Setup

### Prerequisites

- Python 3.8 or higher
- pip (Python package manager)
- (Optional) CUDA-enabled GPU for faster inference

### Step 1: Clone or Download the Repository

```bash
git clone <repository-url>
cd first_llm_124
```

Or download and extract the project files to a directory.
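For reference, the architecture numbers listed above can be expressed as a small configuration object. This is only an illustrative sketch: the `GPTConfig` name, its field names, and the `approx_param_count` helper are assumptions for this README, not necessarily the names used in `model.py`.

```python
from dataclasses import dataclass

@dataclass
class GPTConfig:
    # Values match the gpt2 base model described above.
    block_size: int = 1024   # max sequence length
    vocab_size: int = 50257  # BPE vocabulary size
    n_layer: int = 12        # transformer blocks
    n_head: int = 12         # attention heads per block
    n_embd: int = 768        # embedding dimension

def approx_param_count(cfg: GPTConfig) -> int:
    """Rough parameter count: token + position embeddings, plus
    roughly 12 * n_embd^2 weights per transformer block (attention + MLP)."""
    embeddings = (cfg.vocab_size + cfg.block_size) * cfg.n_embd
    blocks = 12 * cfg.n_layer * cfg.n_embd ** 2
    return embeddings + blocks

print(approx_param_count(GPTConfig()))  # 124318464, i.e. ~124M as stated above
```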
### Step 2: Create a Virtual Environment (Recommended)

Using a virtual environment helps avoid conflicts with other projects:

**On Windows:**

```bash
python -m venv venv
venv\Scripts\activate
```

**On macOS/Linux:**

```bash
python3 -m venv venv
source venv/bin/activate
```

### Step 3: Install Dependencies

Install all required packages from the requirements file:

```bash
pip install -r requirements.txt
```

Or install packages individually (quote the version specifiers so the shell does not interpret `>=` as redirection):

```bash
pip install "gradio>=4.0.0"
pip install "torch>=2.0.0"
pip install "transformers>=4.30.0"
pip install "tiktoken>=0.5.0"
pip install "huggingface_hub>=0.34.0"
```

### Step 4: Verify Installation

Verify that all packages are installed correctly:

```bash
python -c "import torch; import gradio; import transformers; import tiktoken; print('All packages installed successfully!')"
```

### Step 5: Prepare Model Directory (Optional)

If you have a trained model, create a `model` directory and place your checkpoint there:

```bash
mkdir model
# Place your model.pth file in the model/ directory
```

## Installation

1. Follow the [Environment Setup](#environment-setup) steps above
2. Ensure all dependencies are installed
3. (Optional) Place your trained model checkpoint in the `model/` directory

## Usage

### Running Locally

```bash
python app.py
```

The app will start a local server. Open the provided URL in your browser.

### Model Loading

The app automatically tries to load models in this order:

1. Saved checkpoint file (checks for: `./model/model.pth`, `model.pt`, `checkpoint.pth`, `checkpoint.pt`, `gpt_model.pth`)
2.
Pretrained GPT-2 from Hugging Face (fallback)

### Saving a Trained Model

If you have a trained model, you can save it using:

```python
import torch
import os

# Create model directory if it doesn't exist
os.makedirs('model', exist_ok=True)

# After training your model, save the checkpoint
checkpoint = {
    'model_state_dict': model.state_dict(),
    'config': {
        'block_size': model.config.block_size,
        'vocab_size': model.config.vocab_size,
        'n_layer': model.config.n_layer,
        'n_head': model.config.n_head,
        'n_embd': model.config.n_embd,
    }
}
torch.save(checkpoint, './model/model.pth')
print("Model saved successfully to ./model/model.pth!")
```

### Loading a Saved Model

Place your saved model checkpoint (`.pth` or `.pt` file) in the `model/` directory. The app will automatically detect and load it from `./model/model.pth`.

## Parameters

- **Max Tokens**: Maximum number of tokens to generate (10-200)
- **Top-K**: Sample from the top K most likely tokens (1-100). Lower values make the output more focused.
- **Temperature**: Controls the randomness of the output (0.1-2.0). Lower values make the output more deterministic, higher values more creative.

## Project Structure

```
.
├── app.py              # Gradio interface (main entry point)
├── model.py            # GPT model architecture
├── inference.py        # Model loading and text generation utilities
├── requirements.txt    # Python dependencies
├── README.md           # This file
├── llm_trainer.ipynb   # Jupyter notebook for training
├── input.txt           # Training data (optional)
├── model/              # (Optional) Directory for saved model checkpoints
│   └── model.pth       # Saved model checkpoint
└── venv/               # Virtual environment (created during setup)
```

## Deployment to Hugging Face Spaces

1. Create a new Space on [Hugging Face Spaces](https://huggingface.co/spaces)
2. Upload all files from this project (except `venv/` and `__pycache__/`)
3. Set the Space SDK to **Gradio**
4. Add your model checkpoint file in the `model/` directory (if using a trained model)
5.
The Space will automatically install dependencies and launch the app

### For Hugging Face Spaces

The app will automatically:

- Use a GPU if one is available, otherwise run on CPU
- Load pretrained GPT-2 if no checkpoint is found
- Handle model loading errors gracefully

## Model Training

To train your own model, use the `llm_trainer.ipynb` notebook. After training, save the model:

```python
import torch
import os

# Create model directory if it doesn't exist
os.makedirs('model', exist_ok=True)

# Save model checkpoint
checkpoint = {
    'model_state_dict': model.state_dict(),
    'config': {
        'block_size': model.config.block_size,
        'vocab_size': model.config.vocab_size,
        'n_layer': model.config.n_layer,
        'n_head': model.config.n_head,
        'n_embd': model.config.n_embd,
    }
}
torch.save(checkpoint, './model/model.pth')
print("Model saved successfully!")
```

Then place `model.pth` in the `model/` directory for automatic loading.

## Troubleshooting

### Common Issues

1. **Import Errors**:
   - Ensure all dependencies are installed: `pip install -r requirements.txt`
   - Make sure your virtual environment is activated

2. **Model Not Found**:
   - Check that the model checkpoint is in the correct directory: `./model/model.pth`
   - Verify the file exists: `ls model/model.pth` (Linux/macOS) or `dir model\model.pth` (Windows)

3. **CUDA Out of Memory**:
   - The app will automatically fall back to CPU if GPU memory is insufficient
   - Reduce the Max Tokens parameter in the interface

4. **Module Not Found**:
   - Reinstall dependencies: `pip install -r requirements.txt --upgrade`
   - Check your Python version: `python --version` (should be 3.8+)

5. **Port Already in Use**:
   - Change the port in `app.py`: `demo.launch(server_port=7861)`
   - Or stop the process currently using the port

## License

This project uses the GPT-2 architecture and can load pretrained GPT-2 weights from Hugging Face, which are subject to OpenAI's GPT-2 license.
## Notes

- The model uses tiktoken's `gpt2` encoding
- Generation uses top-k sampling with temperature control
- Maximum sequence length is 1024 tokens
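The top-k sampling with temperature mentioned above works by keeping only the k highest-scoring logits, dividing them by the temperature, and sampling from the resulting softmax. A minimal pure-Python sketch of the idea (the actual app samples with PyTorch; the function and argument names here are illustrative):

```python
import math
import random

def sample_top_k(logits, k=50, temperature=1.0, rng=random):
    """Sample a token index from the k most likely logits,
    after temperature scaling (lower temperature = more deterministic)."""
    # Keep the k highest (index, logit) pairs.
    top = sorted(enumerate(logits), key=lambda p: p[1], reverse=True)[:k]
    # Temperature-scaled, numerically stable softmax over the survivors.
    m = max(l / temperature for _, l in top)
    weights = [(i, math.exp(l / temperature - m)) for i, l in top]
    total = sum(w for _, w in weights)
    # Draw one index from the resulting categorical distribution.
    r = rng.random() * total
    for i, w in weights:
        r -= w
        if r <= 0:
            return i
    return weights[-1][0]  # guard against floating-point drift
```

With `k=1` this reduces to greedy decoding, and very low temperatures concentrate nearly all probability on the top token, matching the Parameters section's description of "more focused" and "more deterministic" output.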