---
title: FirstLLM
emoji: 💻
colorFrom: red
colorTo: green
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: mit
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Sentence Completion with GPT

A Gradio web application for sentence completion using a custom GPT model architecture. This app can use either a trained model checkpoint or pretrained GPT-2 weights.

## Features

- **Sentence Completion**: Generate text completions for any given prompt
- **Customizable Generation**: Control generation parameters (temperature, top-k, max tokens)
- **Model Flexibility**: Supports both saved trained models and pretrained GPT-2
- **Easy Deployment**: Ready for deployment on Hugging Face Spaces

## Model Architecture

This app uses a custom GPT implementation based on the GPT-2 architecture:

- **Parameters**: ~124M (for the `gpt2` base model)
- **Vocab Size**: 50,257 tokens
- **Block Size**: 1024 tokens (max sequence length)
- **Architecture**: 12 layers, 12 attention heads, 768 embedding dimension
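
As a sanity check, the ~124M figure follows from the numbers above. A minimal sketch of the arithmetic, assuming GPT-2's standard layer shapes and tied input/output embeddings:

```python
# Parameter count for the config above (12 layers, 768-dim embeddings,
# 50,257-token vocab, 1024-token block). Assumes tied embeddings
# (lm_head shares weights with wte) and standard GPT-2 layer shapes.
n_layer, n_embd = 12, 768
vocab_size, block_size = 50257, 1024

wte = vocab_size * n_embd                   # token embedding table (tied with lm_head)
wpe = block_size * n_embd                   # learned positional embeddings
per_block = (
    2 * (2 * n_embd)                        # two LayerNorms (weight + bias each)
    + (n_embd * 3 * n_embd + 3 * n_embd)    # fused QKV projection
    + (n_embd * n_embd + n_embd)            # attention output projection
    + (n_embd * 4 * n_embd + 4 * n_embd)    # MLP up-projection
    + (4 * n_embd * n_embd + n_embd)        # MLP down-projection
)
total = wte + wpe + n_layer * per_block + 2 * n_embd  # + final LayerNorm
print(f"{total:,}")  # 124,439,808 ≈ 124M
```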

## Environment Setup

### Prerequisites

- Python 3.8 or higher
- pip (the Python package manager)
- (Optional) A CUDA-enabled GPU for faster inference
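
The Python floor can be checked up front; a minimal sketch using only the standard library:

```python
# Fail fast if the interpreter is older than the Python 3.8 minimum above.
import sys

if sys.version_info < (3, 8):
    raise SystemExit(f"Python 3.8+ required, found {sys.version.split()[0]}")
print("Python version OK:", sys.version.split()[0])
```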

### Step 1: Clone or Download the Repository

```bash
git clone <repository-url>
cd first_llm_124
```

Or download and extract the project files to a directory.

### Step 2: Create a Virtual Environment (Recommended)

Using a virtual environment helps avoid conflicts with other projects.

**On Windows:**

```bash
python -m venv venv
venv\Scripts\activate
```

**On macOS/Linux:**

```bash
python3 -m venv venv
source venv/bin/activate
```

### Step 3: Install Dependencies

Install all required packages from the requirements file:

```bash
pip install -r requirements.txt
```

Or install the packages individually (quote the version specifiers so the shell does not treat `>` as a redirect):

```bash
pip install "gradio>=4.0.0"
pip install "torch>=2.0.0"
pip install "transformers>=4.30.0"
pip install "tiktoken>=0.5.0"
pip install "huggingface_hub>=0.34.0"
```

### Step 4: Verify Installation

Verify that all packages are installed correctly:

```bash
python -c "import torch; import gradio; import transformers; import tiktoken; print('All packages installed successfully!')"
```

### Step 5: Prepare the Model Directory (Optional)

If you have a trained model, create a `model/` directory and place your checkpoint there:

```bash
mkdir model
# Place your model.pth file in the model/ directory
```

## Installation

1. Follow the Environment Setup steps above
2. Ensure all dependencies are installed
3. (Optional) Place your trained model checkpoint in the `model/` directory

## Usage

### Running Locally

```bash
python app.py
```

The app starts a local server. Open the URL printed to the terminal in your browser.

### Model Loading

The app tries to load a model in this order:

1. A saved checkpoint file (it checks, in order: `./model/model.pth`, `model.pt`, `checkpoint.pth`, `checkpoint.pt`, `gpt_model.pth`)
2. Pretrained GPT-2 weights from Hugging Face (fallback)
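
That fallback order can be sketched as a small path-resolution helper. This is an illustration, not the actual `inference.py` code, and it assumes all candidate files live under `./model/`:

```python
import os

# Candidate checkpoint paths, checked in the order listed above.
CANDIDATES = [
    "./model/model.pth",
    "./model/model.pt",
    "./model/checkpoint.pth",
    "./model/checkpoint.pt",
    "./model/gpt_model.pth",
]

def find_checkpoint(candidates=CANDIDATES):
    """Return the first existing checkpoint path, or None to signal
    that the app should fall back to pretrained GPT-2 weights."""
    for path in candidates:
        if os.path.exists(path):
            return path
    return None
```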

### Saving a Trained Model

If you have a trained model, you can save it using:

```python
import os

import torch

# Create the model directory if it doesn't exist
os.makedirs('model', exist_ok=True)

# After training your model, save the weights and config together
checkpoint = {
    'model_state_dict': model.state_dict(),
    'config': {
        'block_size': model.config.block_size,
        'vocab_size': model.config.vocab_size,
        'n_layer': model.config.n_layer,
        'n_head': model.config.n_head,
        'n_embd': model.config.n_embd,
    }
}
torch.save(checkpoint, './model/model.pth')
print("Model saved successfully to ./model/model.pth!")
```

### Loading a Saved Model

Place your saved checkpoint (a `.pth` or `.pt` file) in the `model/` directory. The app automatically detects and loads it (e.g. from `./model/model.pth`).
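
A loader matching that checkpoint layout might validate the saved config before rebuilding the model. A sketch (the `GPT`/`GPTConfig` names in the comment are stand-ins for whatever `model.py` actually defines):

```python
# Validate the 'config' dict saved alongside the weights (same layout as
# the save snippet above) before reconstructing the model from it.
REQUIRED_KEYS = {"block_size", "vocab_size", "n_layer", "n_head", "n_embd"}

def config_from_checkpoint(checkpoint: dict) -> dict:
    cfg = checkpoint.get("config", {})
    missing = REQUIRED_KEYS - cfg.keys()
    if missing:
        raise KeyError(f"checkpoint config is missing keys: {sorted(missing)}")
    return cfg

# In the app this would follow torch.load, roughly:
#   checkpoint = torch.load("./model/model.pth", map_location="cpu")
#   model = GPT(GPTConfig(**config_from_checkpoint(checkpoint)))
#   model.load_state_dict(checkpoint["model_state_dict"])
```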

## Parameters

- **Max Tokens**: Maximum number of tokens to generate (10-200)
- **Top-K**: Sample from the K most likely tokens (1-100). Lower values make the output more focused.
- **Temperature**: Controls the randomness of the output (0.1-2.0). Lower values make the output more deterministic; higher values, more creative.
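
The interaction of the last two knobs can be sketched in plain Python: temperature rescales the logits before softmax, then top-k discards everything outside the K most likely tokens. This is a simplified stand-in for the real generation loop, not the app's code:

```python
import math
import random

def sample_token(logits, k=50, temperature=1.0, rng=random):
    """Sample one index from a list of raw logits using
    temperature scaling followed by top-k filtering."""
    k = min(k, len(logits))
    scaled = [x / temperature for x in logits]        # temperature first
    cutoff = sorted(scaled, reverse=True)[k - 1]      # k-th largest logit
    kept = [x if x >= cutoff else float("-inf") for x in scaled]
    m = max(kept)
    weights = [math.exp(x - m) for x in kept]         # stable softmax numerators
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]
```

With `k=1` this reduces to greedy decoding; a low temperature concentrates probability on the top tokens even before the cutoff is applied.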

## Project Structure

```
.
├── app.py              # Gradio interface (main entry point)
├── model.py            # GPT model architecture
├── inference.py        # Model loading and text generation utilities
├── requirements.txt    # Python dependencies
├── README.md           # This file
├── llm_trainer.ipynb   # Jupyter notebook for training
├── input.txt           # Training data (optional)
├── model/              # (Optional) Directory for saved model checkpoints
│   └── model.pth       # Saved model checkpoint
└── venv/               # Virtual environment (created during setup)
```

## Deployment to Hugging Face Spaces

1. Create a new Space on Hugging Face Spaces
2. Upload all files from this project (except `venv/` and `__pycache__/`)
3. Set the Space SDK to Gradio
4. Add your model checkpoint file in the `model/` directory (if using a trained model)
5. The Space will automatically install dependencies and launch the app

### Runtime Behavior on Spaces

On a Space, the app automatically:

- Uses a GPU if one is available, otherwise runs on CPU
- Loads pretrained GPT-2 if no checkpoint is found
- Handles model-loading errors gracefully

## Model Training

To train your own model, use the `llm_trainer.ipynb` notebook. After training, save the checkpoint with the snippet from the "Saving a Trained Model" section above, then place `model.pth` in the `model/` directory for automatic loading.

## Troubleshooting

### Common Issues

1. **Import errors**
   - Ensure all dependencies are installed: `pip install -r requirements.txt`
   - Make sure your virtual environment is activated
2. **Model not found**
   - Check that the checkpoint is in the correct location: `./model/model.pth`
   - Verify the file exists: `ls model/model.pth` (Linux/macOS) or `dir model\model.pth` (Windows)
3. **CUDA out of memory**
   - The app automatically falls back to CPU if GPU memory is insufficient
   - Reduce the Max Tokens parameter in the interface
4. **Module not found**
   - Reinstall dependencies: `pip install -r requirements.txt --upgrade`
   - Check your Python version: `python --version` (should be 3.8+)
5. **Port already in use**
   - Change the port in `app.py`: `demo.launch(server_port=7861)`
   - Or stop the process currently using the port
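
For the port conflict, an alternative to hard-coding 7861 is asking the OS for a free ephemeral port. A standard-library sketch (note the small race window between picking the port and Gradio binding it):

```python
import socket

def free_port() -> int:
    """Bind to port 0 so the OS picks a currently free port, then return it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]

# In app.py:
#   demo.launch(server_port=free_port())
```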

## License

This project is released under the MIT license. It uses the GPT-2 architecture and can load pretrained GPT-2 weights from Hugging Face; those weights were released by OpenAI under a modified MIT license.

## Notes

- Tokenization uses tiktoken's `gpt2` encoding
- Generation uses top-k sampling with temperature control
- The maximum sequence length is 1024 tokens
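
One practical consequence of the last note: before each generation step the context must be cropped to the block size, since the positional embedding table only covers 1024 positions. A minimal sketch over plain token-ID lists:

```python
BLOCK_SIZE = 1024  # maximum sequence length, as noted above

def crop_context(token_ids, block_size=BLOCK_SIZE):
    """Keep only the most recent block_size token IDs as model input."""
    return token_ids[-block_size:]
```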