metadata
title: Gemme4
emoji: 💎
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
app_port: 7860

Gemma 4 E2B FastAPI

FastAPI wrapper around a llama.cpp server running Gemma 4 E2B Instruct (multimodal).

Endpoints

| Method | Path | Description |
|--------|------|-------------|
| GET | /health | Server health + model info |
| GET | /v1/models | List models |
| POST | /v1/chat/completions | OpenAI-compatible chat (streaming supported) |
| POST | /chat | Simplified chat |
| POST | /generate | Text generation from a prompt |
| POST | /vision | Multimodal: text + image (URL or base64) |
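
Since `/v1/chat/completions` is described as OpenAI-compatible, a client can probably talk to it with the usual OpenAI request and response shapes. The sketch below uses only the Python standard library. The model name `gemma-4-e2b` is a hypothetical placeholder (the real identifier should come from `GET /v1/models`), and the `choices[0].message.content` path is the standard OpenAI layout, assumed rather than confirmed for this Space.

```python
import json
import urllib.request

BASE_URL = "https://<space-url>"  # same placeholder as the curl examples


def build_completion_request(base_url, messages, model="gemma-4-e2b", max_tokens=512):
    """Build (but do not send) a POST request for /v1/chat/completions.

    The default model name is a hypothetical placeholder; query /v1/models
    for the identifier the server actually reports.
    """
    payload = {"model": model, "messages": messages, "max_tokens": max_tokens}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_body):
    """Return the assistant text from an OpenAI-style chat completion body."""
    # Assumes the standard OpenAI layout: choices[0].message.content
    return response_body["choices"][0]["message"]["content"]
```

To send the request, pass it to `urllib.request.urlopen`, decode the body with `json.load`, and hand the result to `extract_reply`.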

Usage

Chat

curl -X POST https://<space-url>/chat \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello!"}], "max_tokens": 512}'

Vision

curl -X POST https://<space-url>/vision \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is in this image?", "image": "https://example.com/image.jpg"}'
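
The endpoint table says `/vision` also accepts a base64 image, but the exact encoding it expects is not spelled out here. The following is a minimal sketch for preparing a local file, assuming the server takes either a raw base64 string or a `data:` URI in the same `image` field. Both helper names are illustrative and not part of the service.

```python
import base64
import json
import mimetypes
from pathlib import Path


def encode_image(path, as_data_uri=False):
    """Read a local image and return it base64-encoded.

    Whether the server wants raw base64 or a data URI is an assumption;
    try the other form if the first one is rejected.
    """
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    if not as_data_uri:
        return encoded
    # Guess the MIME type from the file extension, with a generic fallback.
    mime = mimetypes.guess_type(str(path))[0] or "application/octet-stream"
    return f"data:{mime};base64,{encoded}"


def vision_payload(prompt, image):
    """Build the JSON body for POST /vision, mirroring the curl example."""
    return json.dumps({"prompt": prompt, "image": image})
```

The string returned by `vision_payload("What is in this image?", encode_image("photo.jpg"))` can be passed to curl with `-d @-` on stdin or sent with any HTTP client.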

Streaming

curl -N -X POST https://<space-url>/chat \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Tell me a story"}], "stream": true}'
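
The wire format of the stream is not documented here. If it follows the OpenAI convention (server-sent events, one `data: {...}` line per chunk, ending with `data: [DONE]`), the text can be reassembled as in the sketch below. That format is an assumption, especially for `/chat`, and `iter_stream_text` is a hypothetical helper name.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from an OpenAI-style SSE stream.

    `lines` is any iterable of str or bytes lines, for example the
    response object returned by urllib.request.urlopen.
    """
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel in the OpenAI convention
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

Joining the fragments with `"".join(...)` gives the full reply; printing each one as it arrives gives the incremental effect.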