TenaOS / README.md
ClinicDx's picture
Initial: Gemma 4 E4B Instruct BF16 GGUF + mmproj for TenaOS
b758bc0
---
license: gemma
tags:
- tenaos
- gemma
- gguf
- llama.cpp
- clinical
language:
- en
base_model: google/gemma-4-E4B-it
pipeline_tag: text-generation
---
# TenaOS — Gemma 4 E4B Instruct (BF16 GGUF)
`llama.cpp`-ready BF16 conversion of
[`google/gemma-4-E4B-it`](https://huggingface.co/google/gemma-4-E4B-it),
plus the audio `mmproj` projector. Used by
[TenaOS](https://github.com/brookyale0512/TenaOS) for on-device clinical
inference (text + voice, multimodal in a single pass).
## Contents
| File | Size | Purpose |
| --- | --- | --- |
| `gemma-4-E4B-it-BF16.gguf` | ~15 GB | Full-precision GGUF for generation |
| `mmproj-gemma-4-E4B-it-bf16.gguf` | ~946 MB | Multimodal projector for audio input |
We standardize on **BF16 full precision**. No quantization in the
production path.
## Usage
```bash
hf download beza4588/TenaOS --local-dir ./models
# launch llama-server (CUDA build):
llama-server \
-m ./models/gemma-4-E4B-it-BF16.gguf \
--mmproj ./models/mmproj-gemma-4-E4B-it-bf16.gguf \
--host 0.0.0.0 --port 8000 -ngl 99 --jinja --alias gemma-4
```
In TenaOS the docker image bind-mounts this directory at `/models`; see
[`scripts/fetch-models.sh`](https://github.com/brookyale0512/TenaOS/blob/main/scripts/fetch-models.sh).
## License
Inherits the [Gemma Terms of Use](https://ai.google.dev/gemma/terms).
TenaOS packaging is Apache 2.0.