SLM Question Generator - Pretrained (114M)

A 114-million-parameter small language model (SLM) pretrained from scratch on ~3B tokens of Wikipedia, OpenWebText2, Project Gutenberg, and Medium articles.

Model Details

  • Architecture: Decoder-only Transformer
  • Parameters: 114.1M
  • Layers: 12
  • d_model: 768
  • Attention: GQA (12 query / 4 KV heads; see the sketch after this list)
  • Vocabulary: tiktoken r50k_base + 3 special tokens (50,260 total)
  • Context window: 4096 tokens
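
A minimal PyTorch sketch of the grouped-query attention shape above (12 query heads sharing 4 KV heads at d_model = 768, so head_dim = 64). This illustrates the head grouping only; details such as positional encoding, biases, and KV caching in the actual model are not specified here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GQA(nn.Module):
    # Shapes follow the card: d_model=768, 12 query heads, 4 KV heads (head_dim 64).
    def __init__(self, d_model=768, n_heads=12, n_kv_heads=4):
        super().__init__()
        assert n_heads % n_kv_heads == 0
        self.n_heads, self.n_kv_heads = n_heads, n_kv_heads
        self.head_dim = d_model // n_heads  # 768 / 12 = 64
        self.q_proj = nn.Linear(d_model, n_heads * self.head_dim, bias=False)
        self.kv_proj = nn.Linear(d_model, 2 * n_kv_heads * self.head_dim, bias=False)
        self.o_proj = nn.Linear(d_model, d_model, bias=False)

    def forward(self, x):  # x: (batch, seq, d_model)
        B, T, _ = x.shape
        q = self.q_proj(x).view(B, T, self.n_heads, self.head_dim).transpose(1, 2)
        kv = self.kv_proj(x).view(B, T, 2, self.n_kv_heads, self.head_dim)
        k, v = kv.permute(2, 0, 3, 1, 4)  # each -> (B, n_kv_heads, T, head_dim)
        # Every group of 12 / 4 = 3 query heads shares one K/V head.
        k = k.repeat_interleave(self.n_heads // self.n_kv_heads, dim=1)
        v = v.repeat_interleave(self.n_heads // self.n_kv_heads, dim=1)
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.o_proj(y.transpose(1, 2).reshape(B, T, -1))
```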

Tokenizer Note

This model tokenizes text with the tiktoken library's r50k_base encoding, extended with three special tokens: <|im_start|>, <|im_end|>, and <|pad|> (50,257 base IDs + 3 = 50,260).
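
As a sketch, the extended encoding can be built with tiktoken's Encoding class, following tiktoken's documented extension pattern. The exact IDs assigned to the three special tokens are an assumption here, placed directly after r50k_base's 50,257 base IDs:

```python
import tiktoken

base = tiktoken.get_encoding("r50k_base")  # 50,257 tokens incl. <|endoftext|>
enc = tiktoken.Encoding(
    name="r50k_base_chat",
    pat_str=base._pat_str,
    mergeable_ranks=base._mergeable_ranks,
    special_tokens={
        **base._special_tokens,
        "<|im_start|>": 50257,  # assumed IDs: appended after the base vocab
        "<|im_end|>": 50258,
        "<|pad|>": 50259,
    },
)
assert enc.n_vocab == 50260
ids = enc.encode("<|im_start|>Hello<|im_end|>", allowed_special="all")
```

Note that special tokens must be explicitly allowed at encode time (allowed_special="all" or a set of permitted tokens); by default tiktoken raises an error if they appear in input text.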
