	---
	library_name: mlx
	license: eupl-1.2
	pipeline_tag: image-text-to-text
	base_model_relation: quantized
	tags:
	- gemma4
	- mlx
	- apple-silicon
	- 4bit
	- on-device
	- conversational
	base_model:
	- google/gemma-4-E2B-it
	---

	# LetheanNetwork/lemer-mlx

	Gemma 4 E2B in MLX format, 4-bit quantized, converted from
	[LetheanNetwork/lemer](https://huggingface.co/LetheanNetwork/lemer)'s
	bf16 safetensors via `mlx_lm.convert`. These are the unmodified Google
	Gemma 4 E2B-IT weights (no LEK shift, no fine-tuning), hosted in our
	namespace so that downstream tools (benchmarks, apps) don't have to
	depend on external mlx-community mirrors.
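
The conversion described above can be sketched with the Python entry point of `mlx_lm.convert`. This is a minimal sketch, not the exact command used for this repo: the output directory `lemer-mlx` is a hypothetical local path, and the group size of 64 is an assumption (it is the mlx-lm default and is consistent with the ~4.5 bits/weight figure quoted under Provenance).

```python
def convert_to_mlx_4bit(
    hf_path: str = "LetheanNetwork/lemer",
    mlx_path: str = "lemer-mlx",  # hypothetical local output directory
) -> None:
    """Convert the bf16 safetensors to 4-bit MLX weights (illustrative sketch)."""
    # Imported lazily so this module can be loaded without mlx-lm installed;
    # running the conversion itself requires mlx-lm (typically on Apple Silicon).
    from mlx_lm import convert

    convert(
        hf_path,
        mlx_path=mlx_path,
        quantize=True,
        q_bits=4,
        q_group_size=64,  # assumption: mlx-lm default, not stated in this README
    )
```

Calling `convert_to_mlx_4bit()` downloads the source repo and writes the quantized weights to `mlx_path`.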

	For the LEK-merged (consent-based ethical kernel) variant of the same
	model, see [`lthn/lemer`](https://huggingface.co/lthn/lemer).

	## Variants in this family

	| Repo | Format | Bits | Use case |
	|---|---|---|---|
	| [`LetheanNetwork/lemer`](https://huggingface.co/LetheanNetwork/lemer) | safetensors + gguf Q4_K_M | bf16 / 4 | Source weights + llama.cpp/Ollama |
	| `LetheanNetwork/lemer-mlx` | mlx | 4 | This repo (Apple Silicon default) |
	| [`LetheanNetwork/lemer-mlx-8bit`](https://huggingface.co/LetheanNetwork/lemer-mlx-8bit) | mlx | 8 | Apple Silicon higher-precision |
	| [`LetheanNetwork/lemer-mlx-bf16`](https://huggingface.co/LetheanNetwork/lemer-mlx-bf16) | mlx | bf16 | Apple Silicon full-precision reference |
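
For tools that let the user choose a precision, the MLX rows of the table can be captured in a small lookup. The helper name `repo_for` and the precision keys are illustrative assumptions; only the repo IDs come from the table.

```python
# MLX variants from the table above, keyed by a hypothetical precision label.
VARIANT_REPOS = {
    "4bit": "LetheanNetwork/lemer-mlx",
    "8bit": "LetheanNetwork/lemer-mlx-8bit",
    "bf16": "LetheanNetwork/lemer-mlx-bf16",
}


def repo_for(precision: str = "4bit") -> str:
    """Return the Hugging Face repo ID for the requested MLX precision."""
    try:
        return VARIANT_REPOS[precision]
    except KeyError:
        valid = ", ".join(sorted(VARIANT_REPOS))
        raise ValueError(
            f"unknown precision {precision!r}; expected one of: {valid}"
        ) from None
```

The returned ID can be passed straight to `mlx_lm.load`, for example `load(repo_for("8bit"))`.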

	## Usage

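	```python
	from mlx_lm import load, generate

	# Download (or reuse the cached copy of) the 4-bit MLX weights.
	model, tokenizer = load("LetheanNetwork/lemer-mlx")

	# Render the conversation with the model's chat template.
	# enable_thinking is passed through to the template.
	prompt = tokenizer.apply_chat_template(
	    [{"role": "user", "content": "Hello"}],
	    add_generation_prompt=True,
	    enable_thinking=True,
	)

	response = generate(model, tokenizer, prompt=prompt, max_tokens=512)
	print(response)
	```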

	## Provenance

	- Source: `LetheanNetwork/lemer` bf16 safetensors (= `google/gemma-4-E2B-it`)
	- Converter: `mlx_lm.convert` (mlx-lm — LM Studio / Apple ML Research)
	- Quant: 4-bit group quantization, ~4.5 bits/weight effective
	- License: Apache 2.0 (Gemma Terms of Use)
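
The "~4.5 bits/weight effective" figure follows from the per-group overhead of group quantization. The sketch below assumes a group size of 64 with one 16-bit scale and one 16-bit bias stored per group; these values are not stated in this README and are assumed from the MLX defaults.

```python
def effective_bits_per_weight(
    bits: int,
    group_size: int = 64,  # assumption: MLX default group size
    scale_bits: int = 16,  # assumption: one 16-bit scale per group
    bias_bits: int = 16,   # assumption: one 16-bit bias per group
) -> float:
    """Average storage per weight, including per-group scale and bias."""
    overhead_per_weight = (scale_bits + bias_bits) / group_size
    return bits + overhead_per_weight


print(effective_bits_per_weight(4))  # 4.5
```

Under the same assumptions, the 8-bit variant works out to 8.5 bits per weight.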

	## License

	Apache 2.0, subject to the [Gemma Terms of Use](https://ai.google.dev/gemma/docs/gemma_4_license).