TenaOS / README.md
ClinicDx's picture
Initial: Gemma 4 E4B Instruct BF16 GGUF + mmproj for TenaOS
b758bc0
metadata
license: gemma
tags:
  - tenaos
  - gemma
  - gguf
  - llama.cpp
  - clinical
language:
  - en
base_model: google/gemma-4-E4B-it
pipeline_tag: text-generation

TenaOS — Gemma 4 E4B Instruct (BF16 GGUF)

llama.cpp-ready BF16 conversion of google/gemma-4-E4B-it, plus the audio mmproj projector. Used by TenaOS for on-device clinical inference (text + voice, multimodal in a single pass).

Contents

File Size Purpose
gemma-4-E4B-it-BF16.gguf ~15 GB Full-precision GGUF for generation
mmproj-gemma-4-E4B-it-bf16.gguf ~946 MB Multimodal projector for audio input

We standardize on BF16 full precision. No quantization in the production path.

Usage

hf download beza4588/TenaOS --local-dir ./models
# launch llama-server (CUDA build):
llama-server \
    -m ./models/gemma-4-E4B-it-BF16.gguf \
    --mmproj ./models/mmproj-gemma-4-E4B-it-bf16.gguf \
    --host 0.0.0.0 --port 8000 -ngl 99 --jinja --alias gemma-4

In TenaOS the docker image bind-mounts this directory at /models; see scripts/fetch-models.sh.

License

Inherits the Gemma Terms of Use. TenaOS packaging is Apache 2.0.