---
title: FirstLLM
emoji: 💻
colorFrom: red
colorTo: green
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: mit
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# Sentence Completion with GPT

A Gradio web application for sentence completion using a custom GPT model architecture. The app can use either a trained model checkpoint or pretrained GPT-2 weights.

## Features

- **Sentence Completion**: Generate text completions for any given prompt
- **Customizable Generation**: Control generation parameters (temperature, top-k, max tokens)
- **Model Flexibility**: Supports both saved trained models and pretrained GPT-2
- **Easy Deployment**: Ready for deployment on Hugging Face Spaces
## Model Architecture

This app uses a custom GPT implementation based on the GPT-2 architecture:

- **Parameters**: ~124M (for the GPT-2 base model)
- **Vocab Size**: 50,257 tokens
- **Block Size**: 1024 tokens (max sequence length)
- **Architecture**: 12 layers, 12 attention heads, 768 embedding dimension
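These hyperparameters are typically collected in a small config object; a minimal sketch of that idea (the class and field names here are illustrative assumptions, not necessarily what `model.py` uses):

```python
from dataclasses import dataclass

@dataclass
class GPTConfig:
    # GPT-2 base hyperparameters (~124M parameters)
    block_size: int = 1024    # max sequence length
    vocab_size: int = 50257   # GPT-2 BPE vocabulary size
    n_layer: int = 12         # transformer blocks
    n_head: int = 12          # attention heads per block
    n_embd: int = 768         # embedding dimension
```

With these defaults, each attention head works in a 768 / 12 = 64-dimensional subspace.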
## Environment Setup

### Prerequisites

- Python 3.8 or higher
- pip (Python package manager)
- (Optional) CUDA-enabled GPU for faster inference

### Step 1: Clone or Download the Repository

```bash
git clone <repository-url>
cd first_llm_124
```

Or download and extract the project files to a directory.
### Step 2: Create a Virtual Environment (Recommended)

Using a virtual environment helps avoid conflicts with other projects:

**On Windows:**

```bash
python -m venv venv
venv\Scripts\activate
```

**On macOS/Linux:**

```bash
python3 -m venv venv
source venv/bin/activate
```
### Step 3: Install Dependencies

Install all required packages from the requirements file:

```bash
pip install -r requirements.txt
```

Or install packages individually (quote the version specifiers so the shell does not treat `>=` as a redirect):

```bash
pip install "gradio>=4.0.0"
pip install "torch>=2.0.0"
pip install "transformers>=4.30.0"
pip install "tiktoken>=0.5.0"
pip install "huggingface_hub>=0.34.0"
```
### Step 4: Verify Installation

Verify that all packages are installed correctly:

```bash
python -c "import torch; import gradio; import transformers; import tiktoken; print('All packages installed successfully!')"
```
### Step 5: Prepare Model Directory (Optional)

If you have a trained model, create a `model` directory and place your checkpoint there:

```bash
mkdir model
# Place your model.pth file in the model/ directory
```
## Installation

1. Follow the [Environment Setup](#environment-setup) steps above
2. Ensure all dependencies are installed
3. (Optional) Place your trained model checkpoint in the `model/` directory
## Usage

### Running Locally

```bash
python app.py
```

The app will start a local server. Open the provided URL in your browser.
### Model Loading

The app automatically tries to load models in this order:

1. A saved checkpoint in the `model/` directory (checks for `model.pth`, `model.pt`, `checkpoint.pth`, `checkpoint.pt`, and `gpt_model.pth`)
2. Pretrained GPT-2 weights from Hugging Face (fallback)
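The discovery step can be sketched as a simple priority scan (the function name and `model/` default here are illustrative assumptions, not necessarily the exact code in `inference.py`):

```python
from pathlib import Path

# Checkpoint filenames the app looks for, in priority order.
CANDIDATES = ["model.pth", "model.pt", "checkpoint.pth",
              "checkpoint.pt", "gpt_model.pth"]

def find_checkpoint(model_dir="model"):
    """Return the first existing checkpoint path, or None to trigger
    the pretrained GPT-2 fallback."""
    for name in CANDIDATES:
        path = Path(model_dir) / name
        if path.is_file():
            return path
    return None
```

Returning `None` rather than raising keeps the fallback path simple: the caller only has to check one value before deciding to download GPT-2 weights.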
### Saving a Trained Model

If you have a trained model, you can save it using:

```python
import torch
import os

# Create the model directory if it doesn't exist
os.makedirs('model', exist_ok=True)

# After training your model, save the checkpoint
checkpoint = {
    'model_state_dict': model.state_dict(),
    'config': {
        'block_size': model.config.block_size,
        'vocab_size': model.config.vocab_size,
        'n_layer': model.config.n_layer,
        'n_head': model.config.n_head,
        'n_embd': model.config.n_embd,
    }
}
torch.save(checkpoint, './model/model.pth')
print("Model saved successfully to ./model/model.pth!")
```
### Loading a Saved Model

Place your saved model checkpoint (`.pth` or `.pt` file) in the `model/` directory. The app will automatically detect and load it from `./model/model.pth`.
## Parameters

- **Max Tokens**: Maximum number of tokens to generate (10-200)
- **Top-K**: Sample from the top K most likely tokens (1-100). Lower values make the output more focused.
- **Temperature**: Controls the randomness of the output (0.1-2.0). Lower values make the output more deterministic; higher values make it more creative.
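How top-k and temperature interact can be sketched in plain Python (this illustrates the sampling idea, not the exact implementation in `inference.py`):

```python
import math
import random

def sample_next_token(logits, top_k=50, temperature=1.0):
    """Pick a next-token id: keep the top_k highest logits,
    scale by temperature, softmax, then sample."""
    scaled = [l / temperature for l in logits]
    # Indices of the top_k largest scaled logits.
    top = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    peak = max(scaled[i] for i in top)
    # Numerically stable softmax numerators over the kept tokens only.
    weights = [math.exp(scaled[i] - peak) for i in top]
    return random.choices(top, weights=weights, k=1)[0]
```

With `top_k=1` this reduces to greedy decoding; raising `temperature` flattens the distribution over the kept tokens, so less likely candidates get sampled more often.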
## Project Structure

```
.
├── app.py               # Gradio interface (main entry point)
├── model.py             # GPT model architecture
├── inference.py         # Model loading and text generation utilities
├── requirements.txt     # Python dependencies
├── README.md            # This file
├── llm_trainer.ipynb    # Jupyter notebook for training
├── input.txt            # Training data (optional)
├── model/               # (Optional) Directory for saved model checkpoints
│   └── model.pth        # Saved model checkpoint
└── venv/                # Virtual environment (created during setup)
```
## Deployment to Hugging Face Spaces

1. Create a new Space on [Hugging Face Spaces](https://huggingface.co/spaces)
2. Upload all files from this project (except `venv/` and `__pycache__/`)
3. Set the Space SDK to **Gradio**
4. Add your model checkpoint file in the `model/` directory (if using a trained model)
5. The Space will automatically install dependencies and launch the app
### For Hugging Face Spaces

The app will automatically:

- Use a GPU if one is available, otherwise fall back to CPU
- Load pretrained GPT-2 if no checkpoint is found
- Handle model loading errors gracefully
## Model Training

To train your own model, use the `llm_trainer.ipynb` notebook. After training, save a checkpoint using the snippet from [Saving a Trained Model](#saving-a-trained-model), then place the resulting `model.pth` in the `model/` directory for automatic loading.
## Troubleshooting

### Common Issues

1. **Import Errors**:
   - Ensure all dependencies are installed: `pip install -r requirements.txt`
   - Make sure your virtual environment is activated
2. **Model Not Found**:
   - Check that the model checkpoint is in the correct directory: `./model/model.pth`
   - Verify the file exists: `ls model/model.pth` (Linux/macOS) or `dir model\model.pth` (Windows)
3. **CUDA Out of Memory**:
   - The app will automatically fall back to CPU if GPU memory is insufficient
   - Reduce the Max Tokens parameter in the interface
4. **Module Not Found**:
   - Reinstall dependencies: `pip install -r requirements.txt --upgrade`
   - Check your Python version: `python --version` (should be 3.8+)
5. **Port Already in Use**:
   - Change the port in `app.py`: `demo.launch(server_port=7861)`
   - Or stop the process using the port
## License

This project uses the GPT-2 architecture and can load pretrained GPT-2 weights from Hugging Face, which are subject to OpenAI's GPT-2 license.
## Notes

- The model uses tiktoken's `gpt2` encoding
- Generation uses top-k sampling with temperature control
- Maximum sequence length is 1024 tokens
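Because the maximum sequence length is 1024 tokens, a generation loop must crop its context before each forward pass once the prompt plus generated tokens exceed the block size; a minimal sketch of that idea (the helper name is an assumption for illustration):

```python
def crop_context(token_ids, block_size=1024):
    # Keep only the most recent block_size tokens as model context;
    # shorter sequences pass through unchanged.
    return token_ids[-block_size:]
```

This is why very long prompts lose their earliest tokens: only the last 1024 token ids ever reach the model.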