Image-Text-to-Text
MLX
Safetensors
gemma4
lemma
8bit
apple-silicon
multimodal
on-device
conversational
Instructions to use lthn/lemma-mlx-8bit with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- MLX
How to use lthn/lemma-mlx-8bit with MLX:
# Make sure mlx-vlm is installed # pip install --upgrade mlx-vlm from mlx_vlm import load, generate from mlx_vlm.prompt_utils import apply_chat_template from mlx_vlm.utils import load_config # Load the model model, processor = load("lthn/lemma-mlx-8bit") config = load_config("lthn/lemma-mlx-8bit") # Prepare input image = ["http://images.cocodataset.org/val2017/000000039769.jpg"] prompt = "Describe this image." # Apply chat template formatted_prompt = apply_chat_template( processor, config, prompt, num_images=1 ) # Generate output output = generate(model, processor, formatted_prompt, image) print(output) - Notebooks
- Google Colab
- Kaggle
- Local Apps
- LM Studio
- Pi new
How to use lthn/lemma-mlx-8bit with Pi:
Start the MLX server
# Install MLX LM: uv tool install mlx-lm # Start a local OpenAI-compatible server: mlx_lm.server --model "lthn/lemma-mlx-8bit"
Configure the model in Pi
# Install Pi: npm install -g @mariozechner/pi-coding-agent # Add to ~/.pi/agent/models.json: { "providers": { "mlx-lm": { "baseUrl": "http://localhost:8080/v1", "api": "openai-completions", "apiKey": "none", "models": [ { "id": "lthn/lemma-mlx-8bit" } ] } } }Run Pi
# Start Pi in your project directory: pi
- Hermes Agent new
How to use lthn/lemma-mlx-8bit with Hermes Agent:
Start the MLX server
# Install MLX LM: uv tool install mlx-lm # Start a local OpenAI-compatible server: mlx_lm.server --model "lthn/lemma-mlx-8bit"
Configure Hermes
# Install Hermes: curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash hermes setup # Point Hermes at the local server: hermes config set model.provider custom hermes config set model.base_url http://127.0.0.1:8080/v1 hermes config set model.default lthn/lemma-mlx-8bit
Run Hermes
hermes
Snider Virgil commited on
Commit Β·
fa76a33
1
Parent(s): d296bbb
docs: correct base_model lineage for HF model tree
Browse filesHF uses base_model + base_model_relation frontmatter to rank models in
search results and render the model tree widget. The Lemma family's
true lineage is:
google/gemma-4-*-it
βββ LetheanNetwork/<m> (finetune β our namespace fork)
βββ lthn/<m> (finetune β LEK merged into weights)
βββ lthn/<m>-mlx (quantized β mlx 4/8bit/bf16)
Previously this repo had base_model_relation set to quantized, which
was wrong β LEK merging is a finetune, not a quant. Fixing so the
model tree widget ranks the family correctly.
Co-Authored-By: Virgil <virgil@lethean.io>
README.md
CHANGED
|
@@ -1,7 +1,37 @@
|
|
| 1 |
---
|
| 2 |
-
language: en
|
| 3 |
library_name: mlx
|
| 4 |
pipeline_tag: image-text-to-text
|
| 5 |
tags:
|
|
|
|
|
|
|
| 6 |
- mlx
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 7 |
---
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
---
|
|
|
|
| 2 |
library_name: mlx
|
| 3 |
pipeline_tag: image-text-to-text
|
| 4 |
tags:
|
| 5 |
+
- gemma4
|
| 6 |
+
- lemma
|
| 7 |
- mlx
|
| 8 |
+
- 8bit
|
| 9 |
+
- apple-silicon
|
| 10 |
+
- multimodal
|
| 11 |
+
- on-device
|
| 12 |
+
- conversational
|
| 13 |
+
license: eupl-1.2
|
| 14 |
+
license_link: https://ai.google.dev/gemma/docs/gemma_4_license
|
| 15 |
+
base_model:
|
| 16 |
+
- lthn/lemma
|
| 17 |
+
base_model_relation: quantized
|
| 18 |
---
|
| 19 |
+
|
| 20 |
+
# Lemma β Gemma 4 E4B β MLX 8-bit
|
| 21 |
+
|
| 22 |
+
The mid-sized member of the Lemma model family by Lethean. An EUPL-1.2 fork of Gemma 4 E4B with the Lethean Ethical Kernel (LEK) merged into the weights.
|
| 23 |
+
|
| 24 |
+
This repo hosts the **MLX 8-bit** build for native Apple Silicon inference via [`mlx-lm`](https://github.com/ml-explore/mlx-lm) and [`mlx-vlm`](https://github.com/Blaizzy/mlx-vlm). For the GGUF playground (Ollama, llama.cpp) see [`lthn/lemma`](https://huggingface.co/lthn/lemma). For the unmodified Google base see [`LetheanNetwork/lemma`](https://huggingface.co/LetheanNetwork/lemma).
|
| 25 |
+
|
| 26 |
+
## Family
|
| 27 |
+
|
| 28 |
+
| Repo | Format | Bits |
|
| 29 |
+
|---|---|---|
|
| 30 |
+
| [`lthn/lemma`](https://huggingface.co/lthn/lemma) | GGUF multi-quant | Q4_K_M β BF16 |
|
| 31 |
+
| [`lthn/lemma-mlx`](https://huggingface.co/lthn/lemma-mlx) | MLX | 4-bit |
|
| 32 |
+
| [`lthn/lemma-mlx-8bit`](https://huggingface.co/lthn/lemma-mlx-8bit) | MLX | 8-bit |
|
| 33 |
+
| [`lthn/lemma-mlx-bf16`](https://huggingface.co/lthn/lemma-mlx-bf16) | MLX | bf16 |
|
| 34 |
+
|
| 35 |
+
## License
|
| 36 |
+
|
| 37 |
+
EUPL-1.2. See [Gemma Terms of Use](https://ai.google.dev/gemma/docs/gemma_4_license) for upstream base model terms.
|