---
title: Gemme4
emoji: 💎
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
app_port: 7860
---

# Gemma 4 E2B FastAPI

FastAPI wrapper around a llama.cpp server running Gemma 4 E2B Instruct (multimodal).

## Endpoints

| Method | Path | Description |
|--------|------|-------------|
| GET | `/health` | Server health + model info |
| GET | `/v1/models` | List models |
| POST | `/v1/chat/completions` | OpenAI-compatible chat (streaming supported) |
| POST | `/chat` | Simplified chat |
| POST | `/generate` | Text generation from a prompt |
| POST | `/vision` | Multimodal: text + image (URL or base64) |

## Usage

### Chat

```bash
curl -X POST https:///chat \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello!"}], "max_tokens": 512}'
```

### Vision

```bash
curl -X POST https:///vision \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is in this image?", "image": "https://example.com/image.jpg"}'
```

### Streaming

```bash
curl -X POST https:///chat \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Tell me a story"}], "stream": true}'
```
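### Python client (sketch)

The curl examples above translate directly to any HTTP client. A minimal stdlib-only sketch for the `/chat` endpoint, assuming the request body shown above (`messages`, `max_tokens`); `build_chat_request` and `BASE_URL` are illustrative names, not part of the API:

```python
import json
import urllib.request

def build_chat_request(base_url: str, messages: list, max_tokens: int = 512) -> urllib.request.Request:
    """Prepare a POST request for the /chat endpoint."""
    body = json.dumps({"messages": messages, "max_tokens": max_tokens}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Sending it (fill in your Space URL, which the README's examples leave blank):
# with urllib.request.urlopen(build_chat_request(BASE_URL, [{"role": "user", "content": "Hello!"}])) as r:
#     print(json.load(r))
```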
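For `/vision` with a local file rather than a URL, the image goes in as base64. A sketch of building that body, assuming the server accepts a plain base64 string in the `image` field (it may instead expect a data URI; check the server's docs):

```python
import base64
import json

def vision_payload(prompt: str, image_bytes: bytes) -> str:
    """Build the JSON body for /vision from raw image bytes."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"prompt": prompt, "image": encoded})

# Example: a few fake bytes stand in for a real JPEG here.
body = vision_payload("What is in this image?", b"\xff\xd8\xff\xe0fake")
```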
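When `"stream": true` is set, the response arrives incrementally. Assuming the stream follows the OpenAI server-sent-events convention used by `/v1/chat/completions` (`data: `-prefixed JSON chunks ending with `data: [DONE]`), the text deltas can be reassembled like this:

```python
import json

def extract_deltas(sse_body: str) -> str:
    """Concatenate content deltas from an OpenAI-style SSE response body."""
    out = []
    for line in sse_body.splitlines():
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank lines and comments
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        out.append(chunk["choices"][0]["delta"].get("content", ""))
    return "".join(out)

sample = (
    'data: {"choices": [{"delta": {"content": "Hel"}}]}\n'
    'data: {"choices": [{"delta": {"content": "lo"}}]}\n'
    'data: [DONE]\n'
)
# extract_deltas(sample) → "Hello"
```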