Ego 45M - Pretrained Language Model

Model Summary

Ego 45M is a GPT-2 style decoder-only language model trained from scratch.

Key facts

  • Parameters: 44.9M
  • Context length: 1024 tokens
  • Tokenizer: GPT-2 BPE (tiktoken)
  • Hardware: NVIDIA H200
  • Data: FineWeb-Edu subset
  • Training tokens: ~392 million
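As a back-of-envelope check on the training-token figure, ~392 million tokens corresponds to roughly 380k full-context sequences (this assumes sequences are packed to the full 1024-token context, which the card does not state explicitly):

```python
# Rough sanity check: how many full-context sequences ~392M tokens yields.
# Assumption (not stated in the card): training sequences are packed to the
# full 1024-token context length.
total_tokens = 392_000_000
context_length = 1024

num_sequences = total_tokens // context_length
print(num_sequences)  # 382812 full-length sequences
```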

Architecture

  • Decoder-only transformer
  • 12 layers
  • 12 attention heads
  • Embedding size: 768
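The bullets above fix the transformer shape, so a generic back-of-envelope parameter estimator can be written down. The sketch below assumes tied input/output embeddings, a 4x MLP expansion, and ignores biases and layer norms; note that the card does not say how its 44.9M figure was counted (e.g. whether embeddings are included), so this estimator will not reproduce it exactly and is meant only to show where the parameters live:

```python
def estimate_params(n_layer: int, d_model: int, vocab_size: int, n_ctx: int) -> int:
    """Rough parameter count for a GPT-2-style decoder-only transformer.

    Assumptions (illustrative, not from the model card): tied input/output
    embeddings, 4x MLP expansion, biases and layer norms ignored.
    """
    embeddings = vocab_size * d_model + n_ctx * d_model  # token + position tables
    attention = 4 * d_model * d_model                    # Q, K, V, output projections
    mlp = 2 * d_model * (4 * d_model)                    # up- and down-projection
    return embeddings + n_layer * (attention + mlp)

# Shape from the card: 12 layers, embedding size 768, GPT-2 vocab, 1024 context
print(estimate_params(n_layer=12, d_model=768, vocab_size=50257, n_ctx=1024))
```

Most of the estimate is the vocabulary embedding table (50257 x 768), which is why published parameter counts vary so much depending on whether embeddings are included.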

How to use (Python)

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the pretrained weights from the Hub
model = AutoModelForCausalLM.from_pretrained(
    "RameshRathod/ego-45m-pretrained"
)
# The model uses the standard GPT-2 BPE tokenizer
tokenizer = AutoTokenizer.from_pretrained("gpt2")

prompt = "Every effort moves you"
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy decoding by default; pass do_sample=True to sample instead
out = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(out[0], skip_special_tokens=True))
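generate() above decodes greedily by default. If you enable sampling (e.g. do_sample=True with top_k), the core top-k idea is to keep only the k highest logits and sample from the renormalized distribution over them. A minimal pure-Python sketch of that filtering step (illustrative only, not the transformers implementation):

```python
import math
import random

def top_k_sample(logits, k, rng=random):
    """Sample a token index from the top-k logits (illustrative sketch)."""
    # Keep the k largest logits; mask the rest to -inf
    threshold = sorted(logits, reverse=True)[k - 1]
    filtered = [x if x >= threshold else float("-inf") for x in logits]
    # Softmax over the filtered logits (exp(-inf) == 0, so masked
    # positions get zero probability)
    m = max(filtered)
    exps = [math.exp(x - m) for x in filtered]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one index in proportion to its probability
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Only the indices of the 2 largest logits (here 1 and 3) can ever be drawn
print(top_k_sample([0.1, 2.0, -1.0, 3.0], k=2))
```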

