SLM Question Generator - Pretrained (114M)

A 114-million-parameter small language model (SLM) pretrained from scratch on ~3B tokens of Wikipedia, OpenWebText2, Project Gutenberg, and Medium articles.

Model Details

  • Architecture: Decoder-only Transformer
  • Parameters: 114.1M
  • Layers: 12
  • d_model: 768
  • Attention: GQA (12 query / 4 KV heads; see the sketch after this list)
  • Vocabulary: tiktoken r50k_base + 3 special tokens (50,260 total)
  • Context window: 4096 tokens
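
A minimal PyTorch sketch of the grouped-query attention shape above (12 query heads sharing 4 KV heads at d_model = 768, so head_dim = 64). This illustrates the head grouping only; details such as positional encoding, biases, and KV caching in the actual model are not specified here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GQA(nn.Module):
    # Shapes follow the card: d_model=768, 12 query heads, 4 KV heads (head_dim 64).
    def __init__(self, d_model=768, n_heads=12, n_kv_heads=4):
        super().__init__()
        assert n_heads % n_kv_heads == 0
        self.n_heads, self.n_kv_heads = n_heads, n_kv_heads
        self.head_dim = d_model // n_heads  # 768 / 12 = 64
        self.q_proj = nn.Linear(d_model, n_heads * self.head_dim, bias=False)
        self.kv_proj = nn.Linear(d_model, 2 * n_kv_heads * self.head_dim, bias=False)
        self.o_proj = nn.Linear(d_model, d_model, bias=False)

    def forward(self, x):  # x: (batch, seq, d_model)
        B, T, _ = x.shape
        q = self.q_proj(x).view(B, T, self.n_heads, self.head_dim).transpose(1, 2)
        kv = self.kv_proj(x).view(B, T, 2, self.n_kv_heads, self.head_dim)
        k, v = kv.permute(2, 0, 3, 1, 4)  # each -> (B, n_kv_heads, T, head_dim)
        # Every group of 12 / 4 = 3 query heads shares one K/V head.
        k = k.repeat_interleave(self.n_heads // self.n_kv_heads, dim=1)
        v = v.repeat_interleave(self.n_heads // self.n_kv_heads, dim=1)
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.o_proj(y.transpose(1, 2).reshape(B, T, -1))
```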

Tokenizer Note

This model tokenizes text with the tiktoken library's r50k_base encoding, extended with three special tokens: <|im_start|>, <|im_end|>, and <|pad|> (50,257 base IDs + 3 = 50,260).
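
As a sketch, the extended encoding can be built with tiktoken's Encoding class, following tiktoken's documented extension pattern. The exact IDs assigned to the three special tokens are an assumption here, placed directly after r50k_base's 50,257 base IDs:

```python
import tiktoken

base = tiktoken.get_encoding("r50k_base")  # 50,257 tokens incl. <|endoftext|>
enc = tiktoken.Encoding(
    name="r50k_base_chat",
    pat_str=base._pat_str,
    mergeable_ranks=base._mergeable_ranks,
    special_tokens={
        **base._special_tokens,
        "<|im_start|>": 50257,  # assumed IDs: appended after the base vocab
        "<|im_end|>": 50258,
        "<|pad|>": 50259,
    },
)
assert enc.n_vocab == 50260
ids = enc.encode("<|im_start|>Hello<|im_end|>", allowed_special="all")
```

Note that special tokens must be explicitly allowed at encode time (allowed_special="all" or a set of permitted tokens); by default tiktoken raises an error if they appear in input text.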
