Game of Thrones SLM (4.5M Params)

A custom Small Language Model (SLM) trained from scratch on A Song of Ice and Fire. It uses a custom GPT-style architecture with 4.5 million parameters.

Model Details

  • Architecture: Custom PyTorch Transformer
  • Vocab Size: 8,000 (Byte-Level BPE)
  • Context Window: 256 tokens
  • Training Device: NVIDIA P100

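To sanity-check the 4.5M figure, here is a rough back-of-envelope parameter estimate for a GPT-style model. The vocab size (8,000) and context window (256) come from the table above; the embedding width and layer count below are illustrative assumptions, not the model's actual config:

```python
# Rough GPT-style parameter estimate. Vocab size and block size are from the
# model card; n_embd and n_layer are ILLUSTRATIVE GUESSES, not the real config.
def gpt_param_estimate(vocab_size, n_embd, n_layer, block_size):
    emb = vocab_size * n_embd + block_size * n_embd   # token + position embeddings
    per_block = 12 * n_embd * n_embd                  # attention (4d^2) + MLP (8d^2), ignoring biases/LayerNorm
    return emb + n_layer * per_block

# Example: a hypothetical 6-layer, 192-dim model lands near 4.5M
# (assuming tied input/output embeddings)
print(gpt_param_estimate(8000, 192, 6, 256))  # → 4239360
```

Note how the embedding table alone (8,000 × d) accounts for a large share of the budget at this scale, which is why small models usually keep the vocabulary modest.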
🚀 Quick Start (Auto-Download)

Since this model uses a custom architecture, it cannot be loaded with AutoModel. Instead, use the huggingface_hub library to download the model files and run the inference code locally.

1. Install Requirements

pip install torch transformers tokenizers huggingface_hub

2. Download and Run

Create a Python script (e.g., run.py) and paste the following code. It automatically downloads the model files and generates text.


import torch
from tokenizers import Tokenizer
from huggingface_hub import snapshot_download
import os
import sys

# --- 1. DOWNLOAD FILES ---
# This downloads model.py, got_tokenizer_v1.json, and best_model_params.pt to a local cache
repo_id = "aman0419/got-slm-4.5m" 
local_dir = "got_slm_model"

print(f"Downloading model from {repo_id}...")
model_path = snapshot_download(repo_id=repo_id, local_dir=local_dir)
sys.path.append(model_path) # Add download folder to path so we can import model.py

# --- 2. IMPORT CUSTOM MODEL ---
from model import SLM, get_default_config

# --- 3. LOAD MODEL ---
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f"Using device: {device}")

config = get_default_config()
model = SLM(config)

# Load weights (map_location handles CPU/GPU automatically)
weights_path = os.path.join(model_path, "best_model_params.pt")
model.load_state_dict(torch.load(weights_path, map_location=device))
model.to(device)
model.eval()

# --- 4. LOAD TOKENIZER ---
tokenizer_path = os.path.join(model_path, "got_tokenizer_v1.json")
tokenizer = Tokenizer.from_file(tokenizer_path)

# --- 5. GENERATE TEXT ---
input_text = "Tyrion Lannister poured a cup of wine"
print(f"\nGenerating text for prompt: '{input_text}'\n" + "-"*50)

# Encode (crop to the model's 256-token context window in case the prompt is longer)
encoded = tokenizer.encode(input_text)
idx = torch.tensor(encoded.ids, dtype=torch.long, device=device).unsqueeze(0)
idx = idx[:, -256:]

# Generate
with torch.no_grad():
    generated_ids = model.generate(idx, max_new_tokens=60, temperature=0.8)

# Decode
output_text = tokenizer.decode(generated_ids[0].tolist())
print(output_text)
print("-" * 50)
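Run the script with `python run.py`. The temperature=0.8 argument controls how the next token is sampled: logits are divided by the temperature before the softmax, so values below 1 sharpen the distribution toward likely tokens. A minimal, model-independent sketch of that scaling in plain Python:

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by temperature before softmax; T < 1 sharpens, T > 1 flattens."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                          # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
print(softmax_with_temperature(logits, 1.0))   # baseline distribution
print(softmax_with_temperature(logits, 0.8))   # sharper: the top token gains probability mass
```

Lowering the temperature trades diversity for coherence; with a 4.5M-parameter model, values around 0.7–0.9 are a reasonable starting range to experiment with.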