randyGPT: model-s

A GPT-style language model trained from scratch in Rust on Project Gutenberg.

Model Details

Architecture: Transformer (causal LM)
Parameters: 1.99M
Layers: 8
Heads: 4
Embedding dim: 128
Context window: 256 tokens
Vocab size: 1500 (BPE)
Training iters: 19700
Best val loss: 3.4604
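
As a rough sanity check on the parameter count, the figures above can be plugged into the usual GPT weight-count arithmetic. The sketch below assumes a GPT-2-style block (four d x d attention projections and a 4x MLP), learned positional embeddings, an untied output head, and no bias or normalization weights. None of these details is stated in this card, and `gpt_param_count` is an illustrative name, not part of randyGPT.

```python
def gpt_param_count(n_layer, d_model, vocab_size, context_len, mlp_ratio=4):
    """Approximate weight count for a GPT-2-style decoder (assumed layout)."""
    attn = 4 * d_model * d_model                # Q, K, V and output projections
    mlp = 2 * d_model * (mlp_ratio * d_model)   # up- and down-projection
    blocks = n_layer * (attn + mlp)
    tok_emb = vocab_size * d_model              # token embedding table
    pos_emb = context_len * d_model             # assumes learned positions
    lm_head = vocab_size * d_model              # assumes an untied output head
    return blocks + tok_emb + pos_emb + lm_head


total = gpt_param_count(n_layer=8, d_model=128, vocab_size=1500, context_len=256)
print(f"{total:,} weights (~{total / 1e6:.2f}M)")  # 1,989,632 weights (~1.99M)
```

Under these assumptions the estimate matches the reported 1.99M. The head count does not enter the total, because the 4 heads only split the 128-dimensional embedding into 32-dimensional slices.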

Training

Trained on ~103 MB of cleaned Project Gutenberg text (114 public-domain books) with BPE tokenization (1500-token vocabulary), the AdamW optimizer, cosine learning-rate decay, and ReduceLROnPlateau. Training ran on an Apple Silicon Metal GPU via Candle.
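
To illustrate the cosine learning-rate decay mentioned above, here is a minimal sketch of the standard schedule. The peak and floor learning rates are hypothetical placeholders, since this card does not state them. It is also an assumption that the 19700 iterations listed under Model Details span the whole schedule. The interaction with ReduceLROnPlateau is not modeled.

```python
import math


def cosine_lr(step, total_steps, lr_max, lr_min=0.0):
    """Cosine decay from lr_max at step 0 down to lr_min at total_steps."""
    progress = min(max(step / total_steps, 0.0), 1.0)  # clamp to [0, 1]
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


TOTAL_ITERS = 19700  # from Model Details; assumed to be the schedule length
LR_MAX = 3e-4        # hypothetical peak learning rate
LR_MIN = 3e-5        # hypothetical floor

for step in (0, TOTAL_ITERS // 2, TOTAL_ITERS):
    print(step, f"{cosine_lr(step, TOTAL_ITERS, LR_MAX, LR_MIN):.2e}")
```

The rate starts at the peak, passes the midpoint of the two values halfway through, and flattens out as it approaches the floor.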

Usage

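# Assumes modeling_randygpt.py, tokenizer_randygpt.py, model.safetensors,
# and tokenizer.json are in the working directory.
from modeling_randygpt import RandyGPTConfig, RandyGPTForCausalLM
from tokenizer_randygpt import RandyGPTTokenizer
from safetensors.torch import load_file
import torch

# Load the config, build the model, and load the weights
cfg   = RandyGPTConfig.from_pretrained("MonumentalSystems/randygpt-s")
model = RandyGPTForCausalLM(cfg)
state = load_file("model.safetensors")
model.load_state_dict(state, strict=True)  # strict=True raises on missing or unexpected keys
model.eval()                               # switch to inference mode

# Load the BPE tokenizer
tok = RandyGPTTokenizer.from_file("tokenizer.json")

# Generate: encode the prompt as a batch of one, then sample up to 200 new tokens
prompt  = "Once upon a time"
ids     = torch.tensor([tok.encode(prompt)], dtype=torch.long)
out_ids = model.generate_text(ids, max_new_tokens=200, temperature=0.8)
print(tok.decode(out_ids[0].tolist()))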
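
For readers unfamiliar with the `temperature` argument, the sketch below shows what temperature sampling conventionally means: logits are divided by the temperature before the softmax, so values below 1.0 sharpen the distribution and values above 1.0 flatten it. This is a generic, pure-Python illustration with toy logits. The internals of `generate_text` are not shown in this card, and the helper names here are hypothetical.

```python
import math
import random


def temperature_probs(logits, temperature):
    """Return softmax(logits / temperature) as a list of probabilities."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                          # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=0.8, rng=random):
    """Draw one token index from the temperature-scaled distribution."""
    probs = temperature_probs(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.1]                        # toy scores for a 3-token vocabulary
print(temperature_probs(logits, 1.0))           # baseline distribution
print(temperature_probs(logits, 0.8))           # sharper: more mass on index 0
print(sample_token(logits, temperature=0.8, rng=random.Random(0)))
```

At 0.8 the most likely token gains probability relative to the baseline, which tends to make generated text more conservative without being fully deterministic.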

Source

Trained with randyGPT, a GPT implementation in Rust with Metal GPU acceleration.

Weights: Safetensors format, F32 tensors, 1.99M params.