Csermely

The smallest coherent Hungarian language model. Part of the Emese model family.

Csermely is a 138M-parameter decoder-only transformer trained exclusively on high-quality Hungarian text. It is small enough to run on edge devices and excels at summarization, grammar checking, and tone detection.

Model Details

Parameters: 137.8M
Architecture: LLaMA-style (decoder-only transformer)
Context length: 8,192 tokens (YaRN RoPE)
Training context: 2,048 tokens
Training precision: bfloat16 (MLX)
Published weights: float16
Vocabulary: 32,000 (SentencePiece Unigram, Hungarian)
Training data: ~1B tokens of Hungarian text
License: MIT
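
As a rough check on the edge-device claim, the figures above give the following back-of-the-envelope numbers. This is only a sketch: it counts the published float16 weights at 2 bytes per parameter and ignores activations, the KV cache, and runtime overhead. The variable names are illustrative and not part of any API.

```python
# Figures taken from the Model Details list above.
PARAMS = 137_800_000          # 137.8M parameters
BYTES_PER_PARAM = 2           # float16 stores each weight in 2 bytes
TRAIN_CONTEXT = 2_048         # tokens per sequence during training
MAX_CONTEXT = 8_192           # tokens supported with the YaRN extension

# Size of the weights alone, in MiB (1 MiB = 1024 * 1024 bytes).
weights_mib = PARAMS * BYTES_PER_PARAM / (1024 * 1024)

# Ratio between the extended and the trained context length.
context_scale = MAX_CONTEXT / TRAIN_CONTEXT

print(f"float16 weights: ~{weights_mib:.0f} MiB")
print(f"context extension factor: {context_scale:g}x")
```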

Architecture

  • 16 transformer layers
  • 768 hidden dimension
  • 12 attention heads
  • 2048 FFN intermediate size
  • RMSNorm pre-layer normalization
  • Rotary positional embeddings (RoPE) with YaRN extension
  • SwiGLU feed-forward activation
  • Tied input/output embeddings
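
The hyperparameters above can be cross-checked against the reported parameter count. The sketch below assumes details the list does not state: no bias terms, standard multi-head attention (12 key/value heads, no grouped-query attention), two RMSNorm layers per block plus a final one, and no learned parameters in RoPE. The function names are hypothetical.

```python
def count_parameters(vocab=32_000, hidden=768, layers=16, ffn=2_048, tied=True):
    """Estimate the parameter count of a bias-free LLaMA-style decoder."""
    embedding = vocab * hidden          # token embedding matrix
    attention = 4 * hidden * hidden     # Q, K, V and output projections
    feed_forward = 3 * hidden * ffn     # SwiGLU: gate, up and down projections
    norms = 2 * hidden                  # two RMSNorm weight vectors per layer
    per_layer = attention + feed_forward + norms
    total = embedding + layers * per_layer + hidden  # + final RMSNorm
    if not tied:
        total += vocab * hidden         # separate output projection
    return total


def kv_cache_bytes(tokens=8_192, hidden=768, layers=16, bytes_per_value=2):
    """Key/value cache size for one sequence (assumes full multi-head attention)."""
    return 2 * layers * tokens * hidden * bytes_per_value  # keys + values


total = count_parameters()
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
print(f"KV cache at 8,192 tokens: {kv_cache_bytes() / 2**20:.0f} MiB")
```

Under these assumptions the estimate comes to 137,847,552 parameters, which agrees with the 137.8M reported in Model Details. The float16 KV cache for a full 8,192-token sequence would add 384 MiB on top of the weights.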

Tokenizer

Csermely uses a custom SentencePiece Unigram tokenizer with a 32K vocabulary, trained on high-quality Hungarian corpora. On Hungarian text it is roughly 30% more token-efficient than multilingual tokenizers.

Available separately: emese-tech/emese-tokenizer-32k
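
One way to measure token efficiency on your own text is sketched below. It reads "more token-efficient" as "fewer tokens for the same text", which is an assumption about the claim above. The baseline `xlm-roberta-base` is likewise an assumption, chosen only as a common multilingual tokenizer; this card does not say which baselines were used. The helper names are hypothetical.

```python
from typing import Callable, Iterable, List


def count_tokens(encode: Callable[[str], List], texts: Iterable[str]) -> int:
    """Total number of tokens `encode` produces over all texts."""
    return sum(len(encode(text)) for text in texts)


def relative_savings(custom_tokens: int, baseline_tokens: int) -> float:
    """Fraction of tokens saved versus the baseline (0.30 means 30% fewer)."""
    if baseline_tokens <= 0:
        raise ValueError("baseline_tokens must be positive")
    return 1.0 - custom_tokens / baseline_tokens


def compare_on_hub(texts: List[str],
                   custom_id: str = "emese-tech/emese-tokenizer-32k",
                   baseline_id: str = "xlm-roberta-base") -> float:
    """Download both tokenizers and compare them on `texts` (needs network)."""
    from transformers import AutoTokenizer  # imported lazily: only needed here

    custom = AutoTokenizer.from_pretrained(custom_id)
    baseline = AutoTokenizer.from_pretrained(baseline_id)
    n_custom = count_tokens(
        lambda t: custom.encode(t, add_special_tokens=False), texts)
    n_baseline = count_tokens(
        lambda t: baseline.encode(t, add_special_tokens=False), texts)
    return relative_savings(n_custom, n_baseline)
```

Calling `compare_on_hub(["A magyar nyelv"])` returns the fraction of tokens saved on that sample. A meaningful comparison needs a larger and more varied set of Hungarian sentences.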

Usage

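from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("emese-tech/csermely")
model = AutoModelForCausalLM.from_pretrained("emese-tech/csermely")

# Tokenize a Hungarian prompt ("The Hungarian language") into PyTorch tensors
input_text = "A magyar nyelv"
inputs = tokenizer(input_text, return_tensors="pt")

# Generate up to 100 new tokens and decode them back to text
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))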

The default generation config uses temperature=0.7, top_p=0.9, and repetition_penalty=1.2 to reduce repetitive output.
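
To override these settings, or to make them explicit, the same values can be passed directly to `generate`. The sketch below is not the model's official configuration: `do_sample=True` is an assumption, added because `temperature` and `top_p` only take effect in transformers when sampling is enabled. The helper names are hypothetical.

```python
# Values quoted from this model card; do_sample=True is an assumption
# (temperature and top_p are ignored by greedy decoding).
DEFAULT_GENERATION_KWARGS = {
    "do_sample": True,
    "temperature": 0.7,
    "top_p": 0.9,
    "repetition_penalty": 1.2,
    "max_new_tokens": 100,
}


def build_generation_kwargs(**overrides):
    """Return the defaults above with any caller-supplied overrides applied."""
    return {**DEFAULT_GENERATION_KWARGS, **overrides}


def generate(prompt, model_id="emese-tech/csermely", **overrides):
    """Load the model and generate a continuation (downloads weights on first use)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, **build_generation_kwargs(**overrides))
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

For example, `generate("A magyar nyelv", temperature=0.3)` samples more conservatively while keeping the other values unchanged.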

Citation

@misc{emese-csermely-2026,
  title={Csermely: A Tiny Hungarian Language Model},
  author={Emese Tech},
  year={2026},
  url={https://huggingface.co/emese-tech/csermely}
}