---
title: FirstLLM
emoji: π»
colorFrom: red
colorTo: green
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: mit
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# Sentence Completion with GPT

A Gradio web application for sentence completion using a custom GPT model architecture. The app can use either a trained model checkpoint or pretrained GPT-2 weights.
## Features

- **Sentence Completion**: Generate text completions for any given prompt
- **Customizable Generation**: Control generation parameters (temperature, top-k, max tokens)
- **Model Flexibility**: Supports both saved trained models and pretrained GPT-2
- **Easy Deployment**: Ready for deployment on Hugging Face Spaces
## Model Architecture

This app uses a custom GPT implementation based on the GPT-2 architecture:

- **Parameters**: ~124M (for the `gpt2` base model)
- **Vocab Size**: 50,257 tokens
- **Block Size**: 1024 tokens (max sequence length)
- **Architecture**: 12 layers, 12 attention heads, 768 embedding dimension
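As a sanity check, the ~124M figure can be reproduced from the hyperparameters above, assuming the standard GPT-2 layout (tied input/output embeddings, learned positional embeddings, biased linear layers):

```python
# Back-of-the-envelope parameter count for the GPT-2 "base" configuration.
vocab_size, block_size = 50_257, 1_024
n_layer, n_head, n_embd = 12, 12, 768

tok_emb = vocab_size * n_embd           # token embeddings (tied with the output head)
pos_emb = block_size * n_embd           # learned positional embeddings
per_layer = (
    n_embd * 3 * n_embd + 3 * n_embd    # attention QKV projection (+ bias)
    + n_embd * n_embd + n_embd          # attention output projection (+ bias)
    + n_embd * 4 * n_embd + 4 * n_embd  # MLP up-projection (+ bias)
    + 4 * n_embd * n_embd + n_embd      # MLP down-projection (+ bias)
    + 4 * n_embd                        # two LayerNorms (weight + bias each)
)
final_ln = 2 * n_embd                   # final LayerNorm before the output head
total = tok_emb + pos_emb + n_layer * per_layer + final_ln
print(f"{total / 1e6:.1f}M parameters")  # ≈ 124.4M
```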
## Environment Setup

### Prerequisites

- Python 3.8 or higher
- pip (Python package manager)
- (Optional) CUDA-enabled GPU for faster inference
### Step 1: Clone or Download the Repository

```bash
git clone <repository-url>
cd first_llm_124
```

Or download and extract the project files to a directory.
### Step 2: Create a Virtual Environment (Recommended)

Using a virtual environment helps avoid conflicts with other projects.

On Windows:

```bash
python -m venv venv
venv\Scripts\activate
```

On macOS/Linux:

```bash
python3 -m venv venv
source venv/bin/activate
```
### Step 3: Install Dependencies

Install all required packages from the requirements file:

```bash
pip install -r requirements.txt
```

Or install packages individually (quote the specifiers so the shell does not treat `>=` as a redirection):

```bash
pip install "gradio>=4.0.0"
pip install "torch>=2.0.0"
pip install "transformers>=4.30.0"
pip install "tiktoken>=0.5.0"
pip install "huggingface_hub>=0.34.0"
```
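The individual installs above correspond to a `requirements.txt` along these lines:

```text
gradio>=4.0.0
torch>=2.0.0
transformers>=4.30.0
tiktoken>=0.5.0
huggingface_hub>=0.34.0
```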
### Step 4: Verify Installation

Verify that all packages are installed correctly:

```bash
python -c "import torch; import gradio; import transformers; import tiktoken; print('All packages installed successfully!')"
```
### Step 5: Prepare the Model Directory (Optional)

If you have a trained model, create a `model/` directory and place your checkpoint there:

```bash
mkdir model
# Place your model.pth file in the model/ directory
```
## Installation

- Follow the Environment Setup steps above
- Ensure all dependencies are installed
- (Optional) Place your trained model checkpoint in the `model/` directory
## Usage

### Running Locally

```bash
python app.py
```

The app will start a local server. Open the provided URL in your browser.
### Model Loading

The app automatically tries to load models in this order:

1. A saved checkpoint file, checking for `./model/model.pth`, `model.pt`, `checkpoint.pth`, `checkpoint.pt`, and `gpt_model.pth`
2. Pretrained GPT-2 from Hugging Face (fallback)
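A minimal sketch of this discovery order (the helper name `find_checkpoint` is an assumption for illustration; the filenames are the ones listed above):

```python
import os

# Candidate checkpoint filenames, tried in order inside the model directory.
CANDIDATES = ["model.pth", "model.pt", "checkpoint.pth",
              "checkpoint.pt", "gpt_model.pth"]

def find_checkpoint(model_dir="model"):
    """Return the first existing checkpoint path, or None."""
    for name in CANDIDATES:
        path = os.path.join(model_dir, name)
        if os.path.isfile(path):
            return path
    return None  # caller falls back to pretrained GPT-2
```

When no candidate exists, the function returns `None` and the app falls back to the pretrained GPT-2 weights.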
### Saving a Trained Model

If you have a trained model, you can save it using:

```python
import torch
import os

# Create the model directory if it doesn't exist
os.makedirs('model', exist_ok=True)

# After training your model, save the checkpoint
checkpoint = {
    'model_state_dict': model.state_dict(),
    'config': {
        'block_size': model.config.block_size,
        'vocab_size': model.config.vocab_size,
        'n_layer': model.config.n_layer,
        'n_head': model.config.n_head,
        'n_embd': model.config.n_embd,
    }
}
torch.save(checkpoint, './model/model.pth')
print("Model saved successfully to ./model/model.pth!")
```
### Loading a Saved Model

Place your saved model checkpoint (`.pth` or `.pt` file) in the `model/` directory. The app will automatically detect and load it from `./model/model.pth`.
## Parameters

- **Max Tokens**: Maximum number of tokens to generate (10-200)
- **Top-K**: Sample from the top K most likely tokens (1-100). Lower values make the output more focused.
- **Temperature**: Controls the randomness of the output (0.1-2.0). Lower values make the output more deterministic, higher values more creative.
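To illustrate how these two knobs interact, here is a dependency-free sketch of top-k sampling with temperature (the app itself does this with torch tensors; the logit values below are made up):

```python
import math
import random

def sample_top_k(logits, top_k=50, temperature=1.0):
    """Pick a token id from `logits` via top-k sampling with temperature."""
    # Keep only the k highest-scoring token ids.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Temperature scaling: <1.0 sharpens the distribution, >1.0 flattens it.
    scaled = [logits[i] / temperature for i in top]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    return random.choices(top, weights=weights, k=1)[0]

logits = [2.0, 0.5, 1.5, -1.0]        # fake scores for a 4-token vocabulary
print(sample_top_k(logits, top_k=1))  # always 0: top_k=1 is greedy decoding
```

With `top_k=1` the result is deterministic; with a low temperature the highest-scoring token still wins almost every draw, and raising the temperature spreads probability across the kept tokens.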
Project Structure
.
βββ app.py # Gradio interface (main entry point)
βββ model.py # GPT model architecture
βββ inference.py # Model loading and text generation utilities
βββ requirements.txt # Python dependencies
βββ README.md # This file
βββ llm_trainer.ipynb # Jupyter notebook for training
βββ input.txt # Training data (optional)
βββ model/ # (Optional) Directory for saved model checkpoints
β βββ model.pth # Saved model checkpoint
βββ venv/ # Virtual environment (created during setup)
## Deployment to Hugging Face Spaces

- Create a new Space on Hugging Face Spaces
- Upload all files from this project (except `venv/` and `__pycache__/`)
- Set the Space SDK to Gradio
- Add your model checkpoint file in the `model/` directory (if using a trained model)
- The Space will automatically install dependencies and launch the app
### For Hugging Face Spaces

The app will automatically:

- Use the GPU if one is available, otherwise the CPU
- Load pretrained GPT-2 if no checkpoint is found
- Handle model loading errors gracefully
## Model Training

To train your own model, use the `llm_trainer.ipynb` notebook. After training, save the checkpoint using the snippet shown in "Saving a Trained Model" above, then place `model.pth` in the `model/` directory for automatic loading.
## Troubleshooting

### Common Issues

**Import Errors:**
- Ensure all dependencies are installed: `pip install -r requirements.txt`
- Make sure your virtual environment is activated

**Model Not Found:**
- Check that the model checkpoint is in the correct directory: `./model/model.pth`
- Verify the file exists: `ls model/model.pth` (Linux/macOS) or `dir model\model.pth` (Windows)

**CUDA Out of Memory:**
- The app will automatically fall back to CPU if GPU memory is insufficient
- Reduce the Max Tokens parameter in the interface

**Module Not Found:**
- Reinstall dependencies: `pip install -r requirements.txt --upgrade`
- Check your Python version: `python --version` (should be 3.8+)

**Port Already in Use:**
- Change the port in `app.py`: `demo.launch(server_port=7861)`
- Or stop the process using the port
## License

This project uses the GPT-2 architecture and can load pretrained GPT-2 weights from Hugging Face, which are subject to OpenAI's GPT-2 license.
## Notes

- The model uses tiktoken's `gpt2` encoding
- Generation uses top-k sampling with temperature control
- Maximum sequence length is 1024 tokens