---
language: en
license: mit
tags:
- text-generation
- causal-lm
- randygpt
- rust
---

# randyGPT — model-s

A GPT-style language model trained from scratch in Rust on Project Gutenberg.

## Model Details

| Property | Value |
|---|---|
| Architecture | Transformer (causal LM) |
| Parameters | 1.99M |
| Layers | 8 |
| Heads | 4 |
| Embedding dim | 128 |
| Context window | 256 tokens |
| Vocab size | 1500 (BPE) |
| Training iters | 1800 |
| Best val loss | 4.4794 |

## Training

Trained on ~103 MB of cleaned Project Gutenberg text (114 public-domain books) with BPE-1500 tokenization, the AdamW optimizer, cosine LR decay, and ReduceLROnPlateau. Training ran on the Metal GPU via Candle on Apple Silicon.

## Usage

```python
from modeling_randygpt import RandyGPTConfig, RandyGPTForCausalLM
from tokenizer_randygpt import RandyGPTTokenizer
from safetensors.torch import load_file
import torch

# Load config, weights, and tokenizer
cfg = RandyGPTConfig.from_pretrained("MonumentalSystems/randygpt-s")
model = RandyGPTForCausalLM(cfg)
state = load_file("model.safetensors")
model.load_state_dict(state, strict=True)
model.eval()
tok = RandyGPTTokenizer.from_file("tokenizer.json")

# Generate
prompt = "Once upon a time"
ids = torch.tensor([tok.encode(prompt)], dtype=torch.long)
out_ids = model.generate_text(ids, max_new_tokens=200, temperature=0.8)
print(tok.decode(out_ids[0].tolist()))
```

## Source

Trained with [randyGPT](https://github.com/MonumentalSystems/RandyGPT) — a GPT implementation in Rust with Metal GPU acceleration.
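## Notes

The cosine LR decay mentioned under Training can be sketched as below. This is a minimal illustration, not the actual training code; the base LR, minimum LR, and warmup length here are assumptions, as the real values are not stated in this card.

```python
import math

def cosine_lr(step, total_steps, base_lr=3e-4, min_lr=3e-5, warmup=100):
    """Cosine learning-rate decay with linear warmup (illustrative values)."""
    if step < warmup:
        # Linear warmup from near zero up to base_lr
        return base_lr * (step + 1) / warmup
    # Fraction of the post-warmup schedule completed, in [0, 1]
    progress = (step - warmup) / max(1, total_steps - warmup)
    # Cosine interpolation from base_lr down to min_lr
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))

# The rate ramps up during warmup, peaks at base_lr, then decays to min_lr.
for step in (0, 100, 900, 1800):
    print(step, cosine_lr(step, total_steps=1800))
```

With 1800 training iterations, the schedule peaks at the end of warmup and reaches its minimum at the final step.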