# Syndra - Grammar & Essay SLM
A small language model (~30MB) fine-tuned for grammar correction and essay generation. Built from scratch in one day on an RTX 3050 4GB GPU.
## Model details
| Property | Value |
|---|---|
| Parameters | 16.08M |
| File size | ~30MB |
| Architecture | 4-layer transformer decoder |
| Heads | 4 attention heads |
| Dimension | 256 |
| Tokenizer | GPT-2 BPE (tiktoken) |
| Val loss | ~1.80 |
| Task | Grammar correction + Essay generation |
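As a sanity check, the parameter count in the table can be reproduced from the architecture row. This is a sketch under one assumption not stated above: a context length (block size) of 256 tokens; with the GPT-2 BPE vocabulary (50,257 tokens), tied input/output embeddings, and the standard 4x MLP expansion, the numbers line up with both the 16.08M parameters and the ~30MB fp16 file size.

```python
# Back-of-envelope parameter count for the config in the table above.
# block = 256 is an assumption (context length is not listed in the card).
vocab, d, n_layer, block = 50257, 256, 4, 256

tok_emb = vocab * d          # token embedding (tied with the LM head)
pos_emb = block * d          # learned positional embedding
# Per transformer layer: attention (qkv + output proj) + 4x MLP; biases/LayerNorms omitted.
per_layer = (d * 3 * d + d * d) + (d * 4 * d + 4 * d * d)

total = tok_emb + pos_emb + n_layer * per_layer
print(f"{total / 1e6:.2f}M params")            # ≈ 16.08M
print(f"{total * 2 / 2**20:.1f} MiB in fp16")  # ≈ 30.7 MiB, matching the ~30MB file
```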
## How to use
```python
import torch
import tiktoken
from model import GPT, GPTConfig

# Load the checkpoint and rebuild the model
ckpt = torch.load('model.pt', map_location='cpu')
config = GPTConfig(**ckpt['model_args'])
model = GPT(config)
sd = {k: v.float() for k, v in ckpt['model'].items()}
model.load_state_dict(sd, strict=False)
model.eval()

enc = tiktoken.get_encoding('gpt2')

# Grammar correction
prompt = "### Task: Grammar Correction\n### Input: she go to school\n### Output:"
ids = enc.encode(prompt)
x = torch.tensor(ids).unsqueeze(0)
with torch.no_grad():
    out = model.generate(x, max_new_tokens=100, temperature=0.3, top_k=40)
print(enc.decode(out[0].tolist()))

# Essay generation
prompt = "### Task: Essay Writing\n### Topic: The importance of books\n### Essay:"
ids = enc.encode(prompt)
x = torch.tensor(ids).unsqueeze(0)
with torch.no_grad():
    out = model.generate(x, max_new_tokens=400, temperature=0.75, top_k=40)
print(enc.decode(out[0].tolist()))
```
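Assuming `generate()` follows nanoGPT in returning the prompt tokens concatenated with the continuation, a small post-processing helper can strip the prompt and cut the answer off at the next `###` section marker. The helper below is a hypothetical addition, not part of the shipped code:

```python
def extract_completion(full_ids, prompt_len, decode):
    """Drop the echoed prompt tokens and stop at the next '###' marker.

    full_ids:   the full token list returned by generate()
    prompt_len: number of tokens in the original prompt
    decode:     a token-ids -> str function (e.g. enc.decode)
    """
    text = decode(full_ids[prompt_len:])
    return text.split('###')[0].strip()
```

Usage with the snippet above would be `extract_completion(out[0].tolist(), len(ids), enc.decode)`.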
## Training
- Base pretrained on TinyStories (10,000 steps)
- Fine-tuned on JFLEG grammar dataset + curated essays (3,000 steps)
- Hardware: RTX 3050 4GB VRAM
- Framework: PyTorch / nanoGPT architecture
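The fine-tuning examples presumably mirror the instruction format used at inference time. A minimal sketch of how a JFLEG source/correction pair could be serialized into that format; the function name and field layout are assumptions, not the actual training script:

```python
def format_grammar_pair(source: str, corrected: str) -> str:
    # Mirrors the inference prompt shown in "How to use"; at training
    # time the target continuation is the corrected sentence after
    # "### Output:".
    return (
        "### Task: Grammar Correction\n"
        f"### Input: {source}\n"
        f"### Output: {corrected}"
    )

print(format_grammar_pair("she go to school", "She goes to school."))
```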
## Built for
OpenAI Parameter Golf competition - targeting sub-16MB models with competitive bits-per-byte scores on enwik8.