---
library_name: mlx
license: eupl-1.2
pipeline_tag: image-text-to-text
base_model_relation: quantized
tags:
- gemma4
- mlx
- apple-silicon
- 4bit
- on-device
- conversational
base_model:
- google/gemma-4-E2B-it
---

# LetheanNetwork/lemer-mlx

Gemma 4 E2B in MLX format, 4-bit quantized, converted from [LetheanNetwork/lemer](https://huggingface.co/LetheanNetwork/lemer)'s bf16 safetensors via `mlx_lm.convert`.

These are the unmodified Google Gemma 4 E2B-IT weights, with no LEK shift and no fine-tuning, hosted in our namespace so that downstream tools (benchmarks, apps) don't have to depend on external mlx-community mirrors.

For the LEK-merged (consent-based ethical kernel) variant of the same model, see [`lthn/lemer`](https://huggingface.co/lthn/lemer).

## Variants in this family

| Repo | Format | Bits | Use case |
|---|---|---|---|
| [`LetheanNetwork/lemer`](https://huggingface.co/LetheanNetwork/lemer) | safetensors + gguf Q4_K_M | bf16 / 4 | Source weights + llama.cpp/Ollama |
| **`LetheanNetwork/lemer-mlx`** | mlx | 4 | **This repo**: Apple Silicon default |
| [`LetheanNetwork/lemer-mlx-8bit`](https://huggingface.co/LetheanNetwork/lemer-mlx-8bit) | mlx | 8 | Apple Silicon higher-precision |
| [`LetheanNetwork/lemer-mlx-bf16`](https://huggingface.co/LetheanNetwork/lemer-mlx-bf16) | mlx | bf16 | Apple Silicon full-precision reference |

## Usage

```python
from mlx_lm import load, generate

# Download (or load from cache) the 4-bit MLX weights and tokenizer.
model, tokenizer = load("LetheanNetwork/lemer-mlx")

# Build the prompt with the model's chat template.
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hello"}],
    add_generation_prompt=True,
    enable_thinking=True,
)

response = generate(
    model,
    tokenizer,
    prompt=prompt,
    max_tokens=512,
)
print(response)
```

## Provenance

- Source: `LetheanNetwork/lemer` bf16 safetensors (= `google/gemma-4-E2B-it`)
- Converter: `mlx_lm.convert` (mlx-lm; LM Studio / Apple ML Research)
- Quant: 4-bit group quantization, ~4.5 bits/weight effective
- License: Apache 2.0 (Gemma Terms of Use)

## License

Apache 2.0, subject to the [Gemma Terms of Use](https://ai.google.dev/gemma/docs/gemma_4_license).
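
## Note on effective bits per weight

As a rough check on the ~4.5 bits/weight figure listed under Provenance, the sketch below works through the arithmetic. It assumes a group size of 64 with one 16-bit scale and one 16-bit bias stored per group, which we believe matches the MLX defaults; these values are not stated in this card. `effective_bits_per_weight` is a hypothetical helper written for illustration, not part of mlx-lm.

```python
def effective_bits_per_weight(
    bits: int = 4,
    group_size: int = 64,   # assumption: MLX default group size
    scale_bits: int = 16,   # assumption: one 16-bit scale per group
    bias_bits: int = 16,    # assumption: one 16-bit bias per group
) -> float:
    """Average storage cost per weight under group quantization.

    Each weight costs `bits`, and every group of `group_size` weights
    shares one scale and one bias, so their cost is spread over the group.
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return bits + (scale_bits + bias_bits) / group_size


if __name__ == "__main__":
    for bits in (4, 8):
        print(f"{bits}-bit: {effective_bits_per_weight(bits=bits)} bits/weight")
```

Under the same assumptions, a smaller group size raises the per-weight overhead (for example, 32 weights per group adds 1 bit instead of 0.5), which is the usual trade-off between quantization accuracy and size.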