---
language: en
license: mit
tags:
- text-generation
- causal-lm
- randygpt
- rust
---

# randyGPT — model-ds2

A GPT-style language model trained from scratch in Rust on Project Gutenberg.

## Model Details

| Property | Value |
|---|---|
| Architecture | Transformer (causal LM) |
| Parameters | 2.90M |
| Layers | 12 |
| Heads | 4 |
| Embedding dim | 128 |
| Context window | 256 tokens |
| Vocab size | 2000 (BPE) |
| Training iters | 14375 |
| Best val loss | 3.8242 |

## Training

Trained on ~98 MB of cleaned Project Gutenberg text (112 public-domain books, v3 cleaning with Unicode normalization), using BPE-2000 tokenization, the AdamW optimizer, cosine LR decay with ReduceLROnPlateau, dropout of 0.1, and Metal GPU acceleration via Candle on Apple Silicon.

## Usage

```python
from modeling_randygpt import RandyGPTConfig, RandyGPTForCausalLM
from tokenizer_randygpt import RandyGPTTokenizer
from safetensors.torch import load_file
import torch

# Load config, weights, and tokenizer
cfg = RandyGPTConfig.from_pretrained("MonumentalSystems/randygpt-ds2")
model = RandyGPTForCausalLM(cfg)
state = load_file("model.safetensors")
model.load_state_dict(state, strict=True)
model.eval()
tok = RandyGPTTokenizer.from_file("tokenizer.json")

# Generate
prompt = "Once upon a time"
ids = torch.tensor([tok.encode(prompt)], dtype=torch.long)
out_ids = model.generate_text(ids, max_new_tokens=200, temperature=0.8)
print(tok.decode(out_ids[0].tolist()))
```

## Source

Trained with [randyGPT](https://github.com/MonumentalSystems/RandyGPT) — a GPT implementation in Rust with Metal GPU acceleration.
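The reported parameter count can be roughly cross-checked from the architecture table. The sketch below assumes a GPT-2-style block (4x MLP expansion, learned positional embeddings, an untied LM head); the actual randyGPT layout may differ slightly, so this is an estimate, not the exact accounting.

```python
# Back-of-the-envelope parameter count for the architecture in the table.
# Assumes GPT-2-style blocks; the real randyGPT layout may differ.
d, layers, vocab, ctx = 128, 12, 2000, 256

tok_emb = vocab * d                    # token embedding table
pos_emb = ctx * d                      # learned positional embeddings
attn = 4 * d * d + 4 * d               # Q, K, V, output projections + biases
mlp = 2 * d * (4 * d) + 4 * d + d      # up/down projections + biases
norms = 2 * 2 * d                      # two LayerNorms (weight + bias) per block
block = attn + mlp + norms
lm_head = vocab * d                    # untied output projection (assumption)
final_ln = 2 * d

total = tok_emb + pos_emb + layers * block + lm_head + final_ln
print(f"{total / 1e6:.2f}M parameters")  # lands near the reported 2.90M
```

Under these assumptions the estimate comes out to about 2.92M, close to the 2.90M in the table; small differences would come from details such as bias usage or weight tying.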
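If `generate_text` is unavailable in your copy of `modeling_randygpt`, a minimal manual sampling loop works with any causal LM. This sketch assumes the model, when called on a `(batch, seq)` tensor of token ids, returns raw logits of shape `(batch, seq, vocab)`; the function name `sample` and that return convention are assumptions, not part of the published API.

```python
import torch

def sample(model, ids, max_new_tokens=50, temperature=0.8, context=256):
    """Greedy-free temperature sampling: append one sampled token per step."""
    for _ in range(max_new_tokens):
        logits = model(ids[:, -context:])               # crop to the context window
        probs = torch.softmax(logits[:, -1, :] / temperature, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)  # sample from the distribution
        ids = torch.cat([ids, next_id], dim=1)
    return ids
```

Usage mirrors the snippet above: `out_ids = sample(model, ids, max_new_tokens=200)`, then decode with the tokenizer.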