---
license: apache-2.0
library_name: pytorch
tags:
- language-model
- causal-lm
- gpt
- from-scratch
- educational
pipeline_tag: text-generation
framework: pytorch
---

# SimBot GPT (Level 1)

SimBot GPT is a **from-scratch GPT-style language model** implemented in **PyTorch**.
This project is focused on **learning LLM internals**, not on instruction tuning or production use.

---

## Model Overview

- **Architecture:** Decoder-only Transformer (GPT-like)
- **Training Objective:** Causal Language Modeling
- **Dataset:** Domain-specific text (Simdega / regional data)
- **Purpose:** Educational (understanding how LLMs work internally)

⚠️ This is a **base language model**, not instruction-tuned and not grounded with RAG.

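
To make the training objective listed above concrete: in causal language modeling, the target at each position is simply the next token in the sequence. The sketch below builds such input/target windows in plain Python. The function name `next_token_pairs` and the sliding-window scheme are illustrative assumptions, not taken from this repository's training code.

```python
from typing import List, Sequence, Tuple


def next_token_pairs(
    token_ids: Sequence[int], block_size: int
) -> List[Tuple[List[int], List[int]]]:
    """Build (input, target) windows where each target is the input shifted by one token."""
    pairs = []
    # Stop early enough that every target window still has block_size tokens.
    for start in range(len(token_ids) - block_size):
        x = list(token_ids[start : start + block_size])
        y = list(token_ids[start + 1 : start + block_size + 1])
        pairs.append((x, y))
    return pairs


if __name__ == "__main__":
    # Toy token IDs; a real run would use IDs produced by the tokenizer.
    for x, y in next_token_pairs([10, 11, 12, 13, 14], block_size=3):
        print(x, "->", y)
```

During training, the model sees `x` and is scored on how well it predicts `y` at every position, typically with a cross-entropy loss.
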

---

## Repository Contents

- `simbot.safetensors`: model weights (safe & HF-recommended format)
- `tokenizer.json`: BPE tokenizer
- `config.json`: model hyperparameters
- `model/simbot.py`: model architecture (PyTorch)

---

## Requirements (Inference Only)

The following packages are **required to load and run the model**:

```txt
torch==2.9.1
tokenizers==0.22.1
safetensors
```

---

## Usage Example

```python
import json

from safetensors.torch import load_file
from tokenizers import Tokenizer

from model.simbot import SIMGPT

# Load tokenizer
tokenizer = Tokenizer.from_file("tokenizer.json")

# Load config
with open("config.json") as f:
    cfg = json.load(f)

# Build model from the hyperparameters stored in config.json
model = SIMGPT(
    vocab_size=cfg["vocab_size"],
    block_size=cfg["block_size"],
    n_layers=cfg["n_layers"],
    n_heads=cfg["n_heads"],
    d_model=cfg["d_model"],
)

# Load weights
state_dict = load_file("simbot.safetensors")
model.load_state_dict(state_dict)
model.eval()  # switch to inference mode (disables dropout)
```
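
Because the constructor call above reads five keys from `config.json`, a missing or mistyped key only surfaces as a `KeyError` at build time. A small check like the following can fail earlier with a clearer message. The helper name `load_config` is hypothetical, and the `d_model % n_heads` check assumes the usual multi-head attention layout where the model width is split evenly across heads; whether `SIMGPT` enforces this is not stated here.

```python
import json

# Keys read by the usage example in this README.
REQUIRED_KEYS = ("vocab_size", "block_size", "n_layers", "n_heads", "d_model")


def load_config(path: str) -> dict:
    """Load config.json and verify the hyperparameters needed to build the model."""
    with open(path) as f:
        cfg = json.load(f)

    missing = [key for key in REQUIRED_KEYS if key not in cfg]
    if missing:
        raise KeyError(f"config is missing keys: {missing}")

    for key in REQUIRED_KEYS:
        if not isinstance(cfg[key], int) or cfg[key] <= 0:
            raise ValueError(f"{key} must be a positive integer, got {cfg[key]!r}")

    # Assumption: attention heads split d_model evenly (standard multi-head attention).
    if cfg["d_model"] % cfg["n_heads"] != 0:
        raise ValueError("d_model must be divisible by n_heads")

    return cfg
```

With this helper, `cfg = load_config("config.json")` could stand in for the plain `json.load` call in the usage example.
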

## Prompting the Model

This model is a custom PyTorch implementation and does not support the Hugging Face inference widget.
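
Since this is a base model, prompting it means feeding in token IDs and repeatedly appending the model's prediction for the next token. The loop below sketches greedy decoding in plain Python. `greedy_generate` and `next_logits_fn` are hypothetical names: `next_logits_fn` stands in for a wrapper around the model's forward pass that returns one score per vocabulary entry, since the forward signature of `SIMGPT` is not documented here. The decoding strategy used by `inference.py` may differ.

```python
from typing import Callable, List, Sequence


def greedy_generate(
    next_logits_fn: Callable[[Sequence[int]], Sequence[float]],
    prompt_ids: Sequence[int],
    max_new_tokens: int,
    block_size: int,
) -> List[int]:
    """Extend prompt_ids one token at a time, always picking the highest-scoring token."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        # The model can only attend to block_size tokens, so keep the most recent ones.
        context = ids[-block_size:]
        # Hypothetical wrapper: returns one score per vocabulary entry for the next token.
        logits = next_logits_fn(context)
        next_id = max(range(len(logits)), key=logits.__getitem__)
        ids.append(next_id)
    return ids
```

With the real model, the prompt IDs would come from `tokenizer.encode(prompt).ids`, and the generated IDs can be turned back into text with `tokenizer.decode(ids)`.
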

### Interactive Usage (Recommended)

```bash
python inference.py
```