	---
	language:
	- hu
	license: mit
	tags:
	- hungarian
	- causal-lm
	- llama
	- sentencepiece
	library_name: transformers
	pipeline_tag: text-generation
	model-index:
	- name: csermely
	  results: []
	---

	# Csermely

	The smallest coherent Hungarian language model. Part of the [Emese](https://emese.tech) model family.

	Csermely is a 138M-parameter decoder-only transformer trained exclusively on high-quality Hungarian text. It runs on edge devices and excels at summarization, grammar checking, and tone detection.

	## Model Details

	| Property | Value |
	|---|---|
	| Parameters | 137.8M |
	| Context length | 8,192 tokens (YaRN RoPE) |
	| Architecture | LLaMA-style (decoder-only transformer) |
	| Training context | 2,048 tokens |
	| Training precision | bfloat16 (MLX) |
	| Published weights | float16 |
	| Vocabulary | 32,000 (SentencePiece Unigram, Hungarian) |
	| Training data | ~1B tokens of Hungarian text |
	| License | MIT |
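
The two context figures in the table imply a YaRN scaling factor of 4 (8,192 / 2,048). The sketch below derives that factor and shows one way it could be expressed as a `rope_scaling` dictionary in the style used by recent `transformers` Llama configs. The dictionary keys are an assumption for illustration; the `config.json` shipped with the model may use different values or additional fields.

```python
# Context lengths taken from the Model Details table.
training_context = 2_048
extended_context = 8_192

# YaRN stretches the trained position range by this factor.
yarn_factor = extended_context / training_context

# Assumed layout, modelled on the rope_scaling field of Llama-style configs.
# Check the repository's config.json for the values actually used.
rope_scaling = {
    "rope_type": "yarn",
    "factor": yarn_factor,
    "original_max_position_embeddings": training_context,
}

print(rope_scaling)
```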

	## Architecture

	- 16 transformer layers
	- 768 hidden dimension
	- 12 attention heads
	- 2048 FFN intermediate size
	- RMSNorm pre-layer normalization
	- Rotary positional embeddings (RoPE) with YaRN extension
	- SwiGLU feed-forward activation
	- Tied input/output embeddings
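
As a sanity check, the parameter count can be reproduced from the figures listed above. This is a back-of-the-envelope sketch under assumptions the card does not state: full multi-head attention (no grouped-query attention), no bias terms, and a single final RMSNorm.

```python
# Values from the Architecture list and the Model Details table.
vocab_size = 32_000
hidden = 768
layers = 16
ffn = 2_048

embedding = vocab_size * hidden      # tied input/output embeddings, counted once
attention = 4 * hidden * hidden      # Q, K, V and output projections (assumes no GQA)
feed_forward = 3 * hidden * ffn      # SwiGLU: gate, up and down projections
norms = 2 * hidden                   # two RMSNorm weight vectors per layer
per_layer = attention + feed_forward + norms

total = embedding + layers * per_layer + hidden  # + final RMSNorm (assumed)

print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
# prints: 137,847,552 parameters (~137.8M)
```

Under these assumptions the total matches the 137.8M reported in Model Details, with the tied embedding matrix accounting for roughly 18% of all weights.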

	## Tokenizer

	Csermely uses a custom SentencePiece Unigram tokenizer with a 32,000-token vocabulary, trained on high-quality Hungarian corpora. It is ~30% more token-efficient than multilingual tokenizers on Hungarian text.

	Available separately: [emese-tech/emese-tokenizer-32k](https://huggingface.co/emese-tech/emese-tokenizer-32k)
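
The card does not say how token efficiency was measured. One common metric is the average number of tokens per word; the sketch below shows how two tokenizers could be compared on that basis. The helper name and the two toy tokenizers are hypothetical stand-ins so the example runs offline. With real tokenizers you would pass something like `AutoTokenizer.from_pretrained(...).tokenize` instead.

```python
from typing import Callable, Iterable, List


def tokens_per_word(tokenize: Callable[[str], List[str]], texts: Iterable[str]) -> float:
    """Average number of tokens produced per whitespace-separated word."""
    n_tokens = 0
    n_words = 0
    for text in texts:
        n_tokens += len(tokenize(text))
        n_words += len(text.split())
    if n_words == 0:
        raise ValueError("texts contain no words")
    return n_tokens / n_words


# Toy tokenizers (hypothetical, for illustration only).
def word_tokenizer(text: str) -> List[str]:
    """One token per word: the best case for this metric."""
    return text.split()


def chunk_tokenizer(text: str) -> List[str]:
    """Splits every word into pieces of at most four characters."""
    return [word[i:i + 4] for word in text.split() for i in range(0, len(word), 4)]


sample = ["a magyar nyelv gazdag"]
efficient = tokens_per_word(word_tokenizer, sample)
baseline = tokens_per_word(chunk_tokenizer, sample)

# Relative reduction in token count versus the baseline tokenizer.
savings = 1 - efficient / baseline
print(f"{efficient:.2f} vs {baseline:.2f} tokens/word, {savings:.0%} fewer tokens")
```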

	## Usage

	```python
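	from transformers import AutoTokenizer, AutoModelForCausalLM

	# Download the tokenizer and weights from the Hugging Face Hub.
	tokenizer = AutoTokenizer.from_pretrained("emese-tech/csermely")
	model = AutoModelForCausalLM.from_pretrained("emese-tech/csermely")

	# Tokenize a Hungarian prompt ("The Hungarian language") and generate up to 100 new tokens.
	input_text = "A magyar nyelv"
	inputs = tokenizer(input_text, return_tensors="pt")
	outputs = model.generate(**inputs, max_new_tokens=100)

	# Decode the generated ids back to text, dropping special tokens.
	print(tokenizer.decode(outputs[0], skip_special_tokens=True))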
	```

	The default generation config uses `temperature=0.7`, `top_p=0.9`, and `repetition_penalty=1.2` to reduce repetitive output.
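
To make the effect of these three settings concrete, the following pure-Python sketch applies them to a toy set of logits. It is a simplified illustration, not the `transformers` implementation, and the function name is hypothetical. Note that in `transformers`, `temperature` and `top_p` only take effect when sampling is enabled (`do_sample=True`).

```python
import math


def next_token_distribution(logits, generated_ids,
                            temperature=0.7, top_p=0.9, repetition_penalty=1.2):
    """Return {token_id: probability} after penalty, temperature and top-p filtering."""
    logits = list(logits)

    # 1. Repetition penalty: push down tokens that were already generated.
    for i in set(generated_ids):
        if logits[i] > 0:
            logits[i] /= repetition_penalty
        else:
            logits[i] *= repetition_penalty

    # 2. Temperature: values below 1 sharpen the distribution.
    scaled = [x / temperature for x in logits]

    # 3. Softmax (shifted by the maximum for numerical stability).
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    norm = sum(exps)
    probs = [e / norm for e in exps]

    # 4. Top-p: keep the smallest set of most likely tokens whose
    #    cumulative probability reaches top_p, then renormalize.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


toy_logits = [2.0, 1.0, 0.5, -1.0]
with_penalty = next_token_distribution(toy_logits, generated_ids=[0])
without_penalty = next_token_distribution(toy_logits, generated_ids=[])
print(with_penalty)
print(without_penalty)
```

In this toy case, token 0 has already been generated, so the penalty lowers its probability relative to the run without any history, while the least likely token is removed by the top-p cutoff in both runs.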

	## Citation

	```bibtex
	@misc{emese-csermely-2026,
	title={Csermely: A Tiny Hungarian Language Model},
	author={Emese Tech},
	year={2026},
	url={https://huggingface.co/emese-tech/csermely}
	}
	```