uyu

This is a nanoGPT checkpoint converted to Hugging Face format. It loads as a GPT-2-compatible GPT2LMHeadModel.

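The model and tokenizer load from a local copy of this repository, or as a pipeline from the Hub:

from transformers import AutoTokenizer, GPT2LMHeadModel, pipeline

# Load the weights from the current directory (a local copy of this repository).
model = GPT2LMHeadModel.from_pretrained(".")
# The tokenizer is not a stock GPT-2 tokenizer, so its code must be trusted explicitly.
tokenizer = AutoTokenizer.from_pretrained(".", trust_remote_code=True)

# Alternatively, build a text-generation pipeline straight from the Hub.
pipe = pipeline(
    "text-generation",
    model="mente-ai/uyu-1-10M",
    trust_remote_code=True,
)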

Use eos_token_id=3 during generation to stop at <STORY_END>.
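
A minimal sketch of passing the stop token to generation follows. The helper name generate_story and the max_new_tokens default of 200 are assumptions made for illustration and do not come from this repository. It also assumes the tokenizer supports the usual return_tensors and skip_special_tokens arguments.

```python
STORY_END_ID = 3  # token ID of <STORY_END>, as stated in this model card


def generate_story(model, tokenizer, prompt, max_new_tokens=200):
    """Generate a continuation that stops once <STORY_END> is produced.

    `generate_story` and the `max_new_tokens` default are illustrative
    assumptions, not part of the original repository.
    """
    # Encode the prompt into PyTorch tensors (input_ids, attention_mask).
    inputs = tokenizer(prompt, return_tensors="pt")
    # eos_token_id tells generate() to stop when this token is emitted.
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        eos_token_id=STORY_END_ID,
    )
    # Decode the first (and only) sequence in the batch back to text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example usage, assuming `model` and `tokenizer` were loaded as shown earlier
# and using a hypothetical prompt:
#   print(generate_story(model, tokenizer, "Once upon a time"))
```

Without eos_token_id, generation would normally continue until max_new_tokens is reached, unless the model's own generation config already defines a stop token.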

The tokenizer is a SentencePiece model stored as uyu.model. It is not a standard GPT-2 byte-level BPE tokenizer.
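
Because the tokenizer is a plain SentencePiece model file, it can also be inspected without transformers. The sketch below assumes the sentencepiece package is installed and that uyu.model is in the working directory. The helper names are hypothetical.

```python
def load_processor(model_path="uyu.model"):
    """Load the SentencePiece model file directly (helper name is illustrative)."""
    # Imported lazily so the other helpers work without the package installed.
    import sentencepiece as spm

    return spm.SentencePieceProcessor(model_file=model_path)


def round_trip(processor, text):
    """Encode text to token IDs, then decode the IDs back to text."""
    ids = processor.encode(text, out_type=int)
    return ids, processor.decode(ids)


def piece_for_id(processor, token_id):
    """Return the vocabulary piece stored under a given token ID."""
    return processor.id_to_piece(token_id)


# Example usage, assuming uyu.model is present locally:
#   sp = load_processor()
#   print(piece_for_id(sp, 3))        # check which piece ID 3 maps to
#   print(round_trip(sp, "Hello"))    # token IDs and the decoded text
```

Checking the piece stored under ID 3 is a quick way to confirm that it corresponds to the <STORY_END> marker before relying on it as the stop token.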

Original checkpoint: ckpt.pt

Model size: 9.94M params (Safetensors, tensor type F32)