# LetheanNetwork/lemer-mlx-bf16
Gemma 4 E2B in MLX format at full bf16 precision, converted from LetheanNetwork/lemer's bf16 safetensors via `mlx_lm.convert --dtype bfloat16`. No quantization: this is the full-precision reference for the MLX family. For smaller, faster variants see LetheanNetwork/lemer-mlx (4-bit) or LetheanNetwork/lemer-mlx-8bit. For the LEK-merged variant see lthn/lemer.
## Variants in this family

| Repo | Format | Bits | Use case |
|---|---|---|---|
| LetheanNetwork/lemer | safetensors + gguf Q4_K_M | bf16 / 4 | Source weights + llama.cpp/Ollama |
| LetheanNetwork/lemer-mlx | mlx | 4 | Apple Silicon default |
| LetheanNetwork/lemer-mlx-8bit | mlx | 8 | Higher precision |
| LetheanNetwork/lemer-mlx-bf16 | mlx | bf16 | This repo: full-precision reference |
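As a rough guide to which variant fits your machine, the weight footprint of each precision can be estimated from the parameter count. This is a back-of-envelope sketch, assuming the 5B parameter count stated on this card and ignoring the small overhead of quantization scales and metadata:

```python
# Approximate weight sizes for the variants above (sketch only; assumes
# 5B parameters and ignores quantization scale/metadata overhead).
PARAMS = 5e9

def weight_gib(bits_per_param: float) -> float:
    """Approximate weight size in GiB for a given precision."""
    return PARAMS * bits_per_param / 8 / 2**30

for name, bits in [("4-bit", 4), ("8-bit", 8), ("bf16", 16)]:
    print(f"{name}: ~{weight_gib(bits):.1f} GiB")
# → 4-bit: ~2.3 GiB, 8-bit: ~4.7 GiB, bf16: ~9.3 GiB
```

Loading this bf16 repo therefore needs roughly 10 GiB of unified memory for the weights alone, before KV cache and activations.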
## Usage

```python
from mlx_lm import load, generate

model, tokenizer = load("LetheanNetwork/lemer-mlx-bf16")
response = generate(
    model, tokenizer,
    prompt=tokenizer.apply_chat_template(
        [{"role": "user", "content": "Hello"}],
        add_generation_prompt=True,
        enable_thinking=True,
    ),
    max_tokens=512,
)
```
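For intuition, `apply_chat_template` wraps the messages in the model's turn markers and, with `add_generation_prompt=True`, opens a model turn for the reply. The sketch below assumes Gemma-style markers; the authoritative template ships with this repo's tokenizer, so treat this as illustration only:

```python
# Sketch of what the chat template expands to (assumed Gemma-style turn
# markers; the real template is bundled with the tokenizer).
def render_prompt(messages):
    out = ""
    for m in messages:
        out += f"<start_of_turn>{m['role']}\n{m['content']}<end_of_turn>\n"
    # add_generation_prompt=True appends an open model turn for generation
    out += "<start_of_turn>model\n"
    return out

print(render_prompt([{"role": "user", "content": "Hello"}]))
```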
## Provenance

- Source: LetheanNetwork/lemer bf16 safetensors (= google/gemma-4-E2B-it)
- Converter: `mlx_lm.convert --dtype bfloat16` (no quantization)
- License: Apache 2.0 (Gemma Terms of Use)
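The conversion described above can be reproduced with the `mlx_lm` CLI. A minimal sketch, assuming the source repo on the Hub; the output path is illustrative:

```shell
# Reproduce the bf16 conversion (output path is illustrative)
pip install mlx-lm
mlx_lm.convert \
  --hf-path LetheanNetwork/lemer \
  --mlx-path lemer-mlx-bf16 \
  --dtype bfloat16  # no -q flag, so no quantization
```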
## License

Apache 2.0, subject to the Gemma Terms of Use.
## Model details

- Model size: 5B params
- Tensor type: BF16
- Base model: google/gemma-4-E2B-it