---
title: FirstLLM
emoji: 💻
colorFrom: red
colorTo: green
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: mit
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# Sentence Completion with GPT

A Gradio web application for sentence completion using a custom GPT model architecture. The app can use either a trained model checkpoint or pretrained GPT-2 weights.

## Features

- **Sentence Completion**: Generate text completions for any given prompt
- **Customizable Generation**: Control generation parameters (temperature, top-k, max tokens)
- **Model Flexibility**: Supports both saved trained models and pretrained GPT-2
- **Easy Deployment**: Ready for deployment on Hugging Face Spaces
## Model Architecture

This app uses a custom GPT implementation based on the GPT-2 architecture:

- **Parameters**: ~124M (for the GPT-2 base model)
- **Vocab Size**: 50,257 tokens
- **Block Size**: 1024 tokens (max sequence length)
- **Architecture**: 12 layers, 12 attention heads, 768 embedding dimension
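These hyperparameters are typically collected in a small config object; a minimal sketch of that idea (the class and field names here are illustrative assumptions, not necessarily what `model.py` uses):

```python
from dataclasses import dataclass

@dataclass
class GPTConfig:
    # GPT-2 base hyperparameters (~124M parameters)
    block_size: int = 1024    # max sequence length
    vocab_size: int = 50257   # GPT-2 BPE vocabulary size
    n_layer: int = 12         # transformer blocks
    n_head: int = 12          # attention heads per block
    n_embd: int = 768         # embedding dimension
```

With these defaults, each attention head works in a 768 / 12 = 64-dimensional subspace.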
## Environment Setup

### Prerequisites

- Python 3.8 or higher
- pip (Python package manager)
- (Optional) CUDA-enabled GPU for faster inference

### Step 1: Clone or Download the Repository

```bash
git clone <repository-url>
cd first_llm_124
```

Or download and extract the project files to a directory.
### Step 2: Create a Virtual Environment (Recommended)

Using a virtual environment helps avoid conflicts with other projects:

**On Windows:**

```bash
python -m venv venv
venv\Scripts\activate
```

**On macOS/Linux:**

```bash
python3 -m venv venv
source venv/bin/activate
```
### Step 3: Install Dependencies

Install all required packages from the requirements file:

```bash
pip install -r requirements.txt
```

Or install packages individually (quote the version specifiers so the shell does not treat `>=` as a redirect):

```bash
pip install "gradio>=4.0.0"
pip install "torch>=2.0.0"
pip install "transformers>=4.30.0"
pip install "tiktoken>=0.5.0"
pip install "huggingface_hub>=0.34.0"
```
### Step 4: Verify Installation

Verify that all packages are installed correctly:

```bash
python -c "import torch; import gradio; import transformers; import tiktoken; print('All packages installed successfully!')"
```
### Step 5: Prepare Model Directory (Optional)

If you have a trained model, create a `model` directory and place your checkpoint there:

```bash
mkdir model
# Place your model.pth file in the model/ directory
```
## Installation

1. Follow the [Environment Setup](#environment-setup) steps above
2. Ensure all dependencies are installed
3. (Optional) Place your trained model checkpoint in the `model/` directory
## Usage

### Running Locally

```bash
python app.py
```

The app will start a local server. Open the provided URL in your browser.
### Model Loading

The app automatically tries to load models in this order:

1. A saved checkpoint in the `model/` directory (checks for `model.pth`, `model.pt`, `checkpoint.pth`, `checkpoint.pt`, and `gpt_model.pth`)
2. Pretrained GPT-2 weights from Hugging Face (fallback)
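The discovery step can be sketched as a simple priority scan (the function name and `model/` default here are illustrative assumptions, not necessarily the exact code in `inference.py`):

```python
from pathlib import Path

# Checkpoint filenames the app looks for, in priority order.
CANDIDATES = ["model.pth", "model.pt", "checkpoint.pth",
              "checkpoint.pt", "gpt_model.pth"]

def find_checkpoint(model_dir="model"):
    """Return the first existing checkpoint path, or None to trigger
    the pretrained GPT-2 fallback."""
    for name in CANDIDATES:
        path = Path(model_dir) / name
        if path.is_file():
            return path
    return None
```

Returning `None` rather than raising keeps the fallback path simple: the caller only has to check one value before deciding to download GPT-2 weights.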
### Saving a Trained Model

If you have a trained model, you can save it using:

```python
import torch
import os

# Create the model directory if it doesn't exist
os.makedirs('model', exist_ok=True)

# After training your model, save the checkpoint
checkpoint = {
    'model_state_dict': model.state_dict(),
    'config': {
        'block_size': model.config.block_size,
        'vocab_size': model.config.vocab_size,
        'n_layer': model.config.n_layer,
        'n_head': model.config.n_head,
        'n_embd': model.config.n_embd,
    }
}
torch.save(checkpoint, './model/model.pth')
print("Model saved successfully to ./model/model.pth!")
```
### Loading a Saved Model

Place your saved model checkpoint (`.pth` or `.pt` file) in the `model/` directory. The app will automatically detect and load it from `./model/model.pth`.
## Parameters

- **Max Tokens**: Maximum number of tokens to generate (10-200)
- **Top-K**: Sample from the top K most likely tokens (1-100). Lower values make the output more focused.
- **Temperature**: Controls the randomness of the output (0.1-2.0). Lower values make the output more deterministic; higher values make it more creative.
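How top-k and temperature interact can be sketched in plain Python (this illustrates the sampling idea, not the exact implementation in `inference.py`):

```python
import math
import random

def sample_next_token(logits, top_k=50, temperature=1.0):
    """Pick a next-token id: keep the top_k highest logits,
    scale by temperature, softmax, then sample."""
    scaled = [l / temperature for l in logits]
    # Indices of the top_k largest scaled logits.
    top = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    peak = max(scaled[i] for i in top)
    # Numerically stable softmax numerators over the kept tokens only.
    weights = [math.exp(scaled[i] - peak) for i in top]
    return random.choices(top, weights=weights, k=1)[0]
```

With `top_k=1` this reduces to greedy decoding; raising `temperature` flattens the distribution over the kept tokens, so less likely candidates get sampled more often.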
## Project Structure

```
.
├── app.py               # Gradio interface (main entry point)
├── model.py             # GPT model architecture
├── inference.py         # Model loading and text generation utilities
├── requirements.txt     # Python dependencies
├── README.md            # This file
├── llm_trainer.ipynb    # Jupyter notebook for training
├── input.txt            # Training data (optional)
├── model/               # (Optional) Directory for saved model checkpoints
│   └── model.pth        # Saved model checkpoint
└── venv/                # Virtual environment (created during setup)
```
## Deployment to Hugging Face Spaces

1. Create a new Space on [Hugging Face Spaces](https://huggingface.co/spaces)
2. Upload all files from this project (except `venv/` and `__pycache__/`)
3. Set the Space SDK to **Gradio**
4. Add your model checkpoint file in the `model/` directory (if using a trained model)
5. The Space will automatically install dependencies and launch the app
### For Hugging Face Spaces

The app will automatically:

- Use a GPU if one is available, otherwise fall back to CPU
- Load pretrained GPT-2 if no checkpoint is found
- Handle model loading errors gracefully
## Model Training

To train your own model, use the `llm_trainer.ipynb` notebook. After training, save a checkpoint using the snippet from [Saving a Trained Model](#saving-a-trained-model), then place the resulting `model.pth` in the `model/` directory for automatic loading.
## Troubleshooting

### Common Issues

1. **Import Errors**:
   - Ensure all dependencies are installed: `pip install -r requirements.txt`
   - Make sure your virtual environment is activated
2. **Model Not Found**:
   - Check that the model checkpoint is in the correct directory: `./model/model.pth`
   - Verify the file exists: `ls model/model.pth` (Linux/macOS) or `dir model\model.pth` (Windows)
3. **CUDA Out of Memory**:
   - The app will automatically fall back to CPU if GPU memory is insufficient
   - Reduce the Max Tokens parameter in the interface
4. **Module Not Found**:
   - Reinstall dependencies: `pip install -r requirements.txt --upgrade`
   - Check your Python version: `python --version` (should be 3.8+)
5. **Port Already in Use**:
   - Change the port in `app.py`: `demo.launch(server_port=7861)`
   - Or stop the process using the port
## License

This project uses the GPT-2 architecture and can load pretrained GPT-2 weights from Hugging Face, which are subject to OpenAI's GPT-2 license.
## Notes

- The model uses tiktoken's `gpt2` encoding
- Generation uses top-k sampling with temperature control
- Maximum sequence length is 1024 tokens
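Because the maximum sequence length is 1024 tokens, a generation loop must crop its context before each forward pass once the prompt plus generated tokens exceed the block size; a minimal sketch of that idea (the helper name is an assumption for illustration):

```python
def crop_context(token_ids, block_size=1024):
    # Keep only the most recent block_size tokens as model context;
    # shorter sequences pass through unchanged.
    return token_ids[-block_size:]
```

This is why very long prompts lose their earliest tokens: only the last 1024 token ids ever reach the model.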