How to use emese-tech/csermely-mlx with MLX:
```python
# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm
# if on a CUDA device, also pip install mlx[cuda]

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("emese-tech/csermely-mlx")
prompt = "Once upon a time in"
text = generate(model, tokenizer, prompt=prompt, verbose=True)
```
How to use emese-tech/csermely-mlx with MLX LM:
Generate or start a chat session
```bash
# Install MLX LM
uv tool install mlx-lm

# Generate some text
mlx_lm.generate --model "emese-tech/csermely-mlx" --prompt "Once upon a time"
```
Csermely (MLX)
MLX version of Csermely, a 138M-parameter Hungarian language model optimized for Apple Silicon. Part of the Emese model family.
This is the native MLX bfloat16 checkpoint. For the Hugging Face Transformers version, see emese-tech/csermely.
Model Details
| Property | Value |
|---|---|
| Parameters | 137.8M |
| Architecture | LLaMA-style (decoder-only transformer) |
| Context length | 8,192 tokens (YaRN RoPE) |
| Training context | 2,048 tokens |
| Precision | bfloat16 |
| Vocabulary | 32,000 (SentencePiece Unigram, Hungarian) |
| Training data | ~1B tokens of Hungarian text |
| Framework | MLX (Apple Silicon) |
| License | MIT |
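The figures in the table allow a quick back-of-envelope check of the weight footprint and of how far the context window is stretched beyond the training length. The sketch below is plain arithmetic on the table values. It covers weights only (no activations or KV cache). The "scale factor" is simply the ratio of the two context lengths, since the actual YaRN configuration is not stated here.

```python
# Back-of-envelope figures derived from the Model Details table.
PARAMS = 137.8e6           # parameter count from the table
BYTES_PER_BF16 = 2         # bfloat16 stores each value in 16 bits
TRAIN_CONTEXT = 2048       # tokens seen per sequence during training
EXTENDED_CONTEXT = 8192    # advertised context length with YaRN RoPE

# Weight storage only; runtime memory use will be higher.
weight_bytes = PARAMS * BYTES_PER_BF16
weight_mib = weight_bytes / 2**20

# Ratio of extended to training context (assumed to be the RoPE scaling factor).
context_scale = EXTENDED_CONTEXT / TRAIN_CONTEXT

print(f"bf16 weights: ~{weight_bytes / 1e6:.0f} MB ({weight_mib:.0f} MiB)")
print(f"context extension factor: {context_scale:g}x")
```

At roughly 276 MB of weights, the checkpoint should fit comfortably in unified memory on any Apple Silicon machine.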
Architecture
- 16 transformer layers
- 768 hidden dimension
- 12 attention heads
- 2048 FFN intermediate size
- RMSNorm pre-layer normalization
- Rotary positional embeddings (RoPE) with YaRN extension
- SwiGLU feed-forward activation
- Tied input/output embeddings
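The dimensions listed above are enough to reproduce the reported parameter count. The sketch below is an estimate under stated assumptions, not a readout of the actual checkpoint: it assumes full multi-head attention (no grouped-query sharing), no bias terms, two RMSNorm weight vectors per layer, and one final RMSNorm. Under those assumptions the total matches the 137.8M figure in the table.

```python
# Parameter-count estimate from the listed architecture.
# Assumptions (not confirmed by the model card): no GQA, no biases,
# two RMSNorm weight vectors per layer, one final RMSNorm.
VOCAB = 32_000
HIDDEN = 768
LAYERS = 16
HEADS = 12
FFN = 2_048

head_dim = HIDDEN // HEADS               # per-head dimension

embedding = VOCAB * HIDDEN               # counted once: input/output embeddings are tied
attention = 4 * HIDDEN * HIDDEN          # Q, K, V and output projections
feed_forward = 3 * HIDDEN * FFN          # SwiGLU: gate, up and down projections
norms = 2 * HIDDEN                       # pre-attention and pre-FFN RMSNorm weights

per_layer = attention + feed_forward + norms
total = embedding + LAYERS * per_layer + HIDDEN   # + final RMSNorm

print(f"head_dim = {head_dim}")
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
# prints "137,847,552 parameters (~137.8M)"
```

Because the embeddings are tied, the 32,000 x 768 embedding matrix is counted once. Untied embeddings would add about 24.6M parameters.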
Usage
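```python
import mlx.core as mx

# Emese and ModelConfig come from a local model.py,
# presumably the one shipped alongside the weights.
from model import Emese, ModelConfig

config = ModelConfig()  # default configuration
model = Emese(config)
model.load_weights("model.safetensors")  # bfloat16 checkpoint
mx.eval(model.parameters())  # MLX is lazy; force the weights to materialize
```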