---
language: en
license: mit
tags:
- gpt
- transformer
- text-generation
- miniGPT
model-index:
- name: MiniGPT
  results: []
---

# MiniGPT — Lightweight Transformer for Text Generation

**MiniGPT** is a minimal yet capable GPT-style language model built from scratch in PyTorch. It is designed for educational clarity, easy customization, and efficient real-time text generation. This project demonstrates the full training and inference pipeline of a decoder-only transformer, including streaming output and modern sampling strategies.

> Hosted with ❤️ by [@Austin207](https://huggingface.co/Austin207)

---

## Model Description

MiniGPT is a small, word-level transformer with the following architecture:

* 4 transformer layers
* 4 attention heads
* Embedding dimension: 128
* FFN hidden size: 512
* Max sequence length: 128
* Word-level tokenizer (trained with Hugging Face `tokenizers`)

Despite its size, it supports advanced generation strategies (sketched in the appendix below), including:

* Repetition penalty
* Temperature sampling
* Top-K & Top-P (nucleus) sampling
* Real-time streaming output

---

## Usage

Install dependencies:

```bash
pip install torch tokenizers
```

Load the model and tokenizer:

```python
from miniGPT import MiniGPT
from inference import generate_stream
from tokenizers import Tokenizer
import torch

# Load the trained word-level tokenizer
tokenizer = Tokenizer.from_file("wordlevel.json")

# Build the model with the same hyperparameters used in training
model = MiniGPT(
    vocab_size=tokenizer.get_vocab_size(),
    embed_dim=128,
    num_heads=4,
    ff_dim=512,
    num_layers=4,
    max_seq_len=128
)

# Restore weights (map to CPU so the checkpoint also loads on machines without a GPU)
checkpoint = torch.load("model_checkpoint_step20000.pt", map_location="cpu")
model.load_state_dict(checkpoint["model_state_dict"])
model.eval()

# Generate text with streaming output
prompt = "Beneath the ancient ruins"
generate_stream(
    model,
    tokenizer,
    prompt,
    max_new_tokens=60,
    temperature=1.0,
    top_k=50,
    top_p=0.9
)
```

---

## Training

Train from scratch on any plain-text dataset:

```bash
python training.py
```

Training includes:

* Checkpointing
* Sample generation previews
* Word-level tokenization with `tokenizers` (a hedged sketch appears in the appendix below)
* Custom datasets via `alphabetical_dataset.txt` or your own

---

## Files in This Repository

| File | Purpose |
| -------------------------- | ---------------------------- |
| `miniGPT.py` | Core transformer model |
| `transformer.py` | Transformer block logic |
| `multiheadattention.py` | Multi-head attention module |
| `Tokenizer.py` | Tokenizer loader |
| `training.py` | Training loop |
| `inference.py` | CLI and streaming generation |
| `dataprocess.py` | Text preprocessing tools |
| `wordlevel.json` | Trained word-level tokenizer |
| `alphabetical_dataset.txt` | Sample dataset |
| `requirements.txt` | Required dependencies |

---

## Model Card

| Property | Value |
| ------------ | --------------------------------- |
| Model Type | Decoder-only GPT |
| Size | Small (~4.6M params) |
| Trained On | Word-level dataset (custom) |
| Intended Use | Text generation, educational demo |
| License | MIT |

---

## Intended Use and Limitations

This model is intended for educational, experimental, and research purposes. It is not suitable for commercial or production use out of the box. Expect limitations in coherence, factuality, and long-context reasoning.

---

## Contributions

We welcome improvements, bug fixes, and new features!

```bash
# Fork, clone, and create a branch
git clone https://github.com/austin207/Transformer-Virtue-v2.git
cd Transformer-Virtue-v2
git checkout -b feature/your-feature
```

Then open a pull request!
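
---

## Appendix: Implementation Sketches

### Sampling utilities

The helper below is a minimal sketch of how the four generation strategies listed above (repetition penalty, temperature, Top-K, Top-P) can combine at a single decoding step. It is illustrative only: the function name, argument names, and defaults are assumptions, and the actual logic shipped in `inference.py` may differ.

```python
import torch
import torch.nn.functional as F

def sample_next_token(logits, generated_ids,
                      temperature=1.0, top_k=50, top_p=0.9,
                      repetition_penalty=1.2):
    """Pick the next token id from a 1-D logits tensor of shape [vocab_size]."""
    logits = logits.clone()

    # Repetition penalty: dampen the logits of tokens already generated.
    for token_id in set(generated_ids):
        if logits[token_id] > 0:
            logits[token_id] /= repetition_penalty
        else:
            logits[token_id] *= repetition_penalty

    # Temperature: <1 sharpens the distribution, >1 flattens it.
    logits = logits / max(temperature, 1e-8)

    # Top-K: keep only the k highest-scoring tokens.
    if top_k > 0:
        top_k = min(top_k, logits.size(-1))
        kth_best = torch.topk(logits, top_k).values[-1]
        logits[logits < kth_best] = float("-inf")

    # Top-P (nucleus): keep the smallest prefix of sorted tokens whose
    # cumulative probability exceeds top_p.
    if top_p < 1.0:
        sorted_logits, sorted_idx = torch.sort(logits, descending=True)
        cum_probs = torch.cumsum(F.softmax(sorted_logits, dim=-1), dim=-1)
        drop = cum_probs > top_p
        drop[1:] = drop[:-1].clone()  # shift so the threshold-crossing token is kept
        drop[0] = False               # always keep the single best token
        logits[sorted_idx[drop]] = float("-inf")

    probs = F.softmax(logits, dim=-1)
    return torch.multinomial(probs, num_samples=1).item()
```

A generation loop would call this once per step, append the returned id to `generated_ids`, and feed the extended sequence back through the model.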
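
### Training a word-level tokenizer

For reference, this is roughly how a word-level tokenizer like `wordlevel.json` can be produced with Hugging Face `tokenizers`. The special tokens and the use of `alphabetical_dataset.txt` here are assumptions; `dataprocess.py` and `training.py` may do this differently.

```python
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

# Word-level model: each pre-tokenized word maps to exactly one token.
tokenizer = Tokenizer(models.WordLevel(unk_token="[UNK]"))
tokenizer.pre_tokenizer = pre_tokenizers.Whitespace()

# The special-token set is an assumption; adjust to match the training pipeline.
trainer = trainers.WordLevelTrainer(special_tokens=["[UNK]", "[PAD]"])
tokenizer.train(files=["alphabetical_dataset.txt"], trainer=trainer)

tokenizer.save("wordlevel.json")
```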

---

## License

This project is licensed under the [MIT License](https://github.com/austin207/Transformer-Virtue-v2/blob/main/LICENSE).

---

## Explore More

* Based on the GPT architecture from OpenAI
* Inspired by [karpathy/nanoGPT](https://github.com/karpathy/nanoGPT)
* Compatible with Hugging Face tools and the tokenizer ecosystem