---
title: Gemma-4-E4B Uncensored Q8 API
emoji: π
colorFrom: pink
colorTo: pink
sdk: docker
app_port: 8000
pinned: false
---

OpenAI-compatible API for [HauhauCS/Gemma-4-E4B-Uncensored-HauhauCS-Aggressive](https://huggingface.co/HauhauCS/Gemma-4-E4B-Uncensored-HauhauCS-Aggressive).

## Model Details

| Spec | Value |
|------|-------|
| Model | Gemma-4-E4B |
| Quantization | Q8_K_P (high quality) |
| Context | 131072 tokens |
| Concurrency | 1 request at a time |
| Reasoning | Enabled by default (`--jinja --reasoning-format deepseek`) |
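
Because the table lists a limit of one concurrent request, a multi-threaded client may want to serialize its own calls rather than send them in parallel. The following is a minimal sketch of that idea, not part of the Space itself: `run_exclusive` and `fake_request` are hypothetical names, and the demonstration runs offline against a stand-in function instead of the real API.

```python
import threading
import time

# One lock shared by every caller, so at most one request is in flight.
_request_slot = threading.Lock()


def run_exclusive(fn, *args, **kwargs):
    """Call fn while holding the single request slot (hypothetical helper)."""
    with _request_slot:
        return fn(*args, **kwargs)


# Offline demonstration: a stand-in for an API call that records
# how many calls overlap. It does not contact the server.
state = {"active": 0, "peak": 0}
state_lock = threading.Lock()
results = []


def fake_request(prompt):
    with state_lock:
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
    time.sleep(0.01)  # simulate generation time
    with state_lock:
        state["active"] -= 1
    return prompt.upper()


def worker(prompt):
    results.append(run_exclusive(fake_request, prompt))


threads = [threading.Thread(target=worker, args=(p,)) for p in ("a", "b", "c")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

In a real client, `fake_request` would be replaced by a function that calls the API. How the server treats a second simultaneous request (queueing or rejecting it) is not stated here, so client-side serialization is only a precaution.
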
## Endpoints

- `POST /v1/chat/completions`: Chat completions (streaming recommended)
- `POST /v1/completions`: Text completions
- `GET /v1/models`: List models
- `GET /health`: Health check
- `GET /api-info`: JSON status
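
The `GET` endpoints above can be probed with the Python standard library before sending a generation request. This is a minimal sketch under a few assumptions: the host is taken from the base URL in the Usage example, `endpoint_url`, `fetch`, and `model_ids` are hypothetical helper names, and `/v1/models` is assumed to return the usual OpenAI-style `{"data": [{"id": ...}]}` list. The sample payload is hand-written for illustration, not captured from the server.

```python
import urllib.request

# Host taken from the Usage example, without the /v1 suffix.
BASE_URL = "https://nanobotaiagent-gemma4-uncensored-api.hf.space"


def endpoint_url(path, base_url=BASE_URL):
    """Join the base URL and an endpoint path with exactly one slash."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def fetch(path, timeout=30.0):
    """GET an endpoint and return (status code, body text).

    urlopen raises urllib.error.HTTPError for non-2xx responses.
    """
    with urllib.request.urlopen(endpoint_url(path), timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")


def model_ids(models_payload):
    """Extract model ids from an OpenAI-style model list (assumed shape)."""
    return [entry["id"] for entry in models_payload.get("data", [])]


# Hand-written sample of the assumed /v1/models response shape.
sample_payload = {"object": "list", "data": [{"id": "gemma", "object": "model"}]}
sample_ids = model_ids(sample_payload)
```

A live check would look like `status, body = fetch("/health")`, followed by `json.loads(fetch("/v1/models")[1])` passed to `model_ids`. The response bodies of `/health` and `/api-info` are not documented here, so `fetch` returns them as raw text.
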
## Usage

```python
import openai

client = openai.OpenAI(
    base_url="https://nanobotaiagent-gemma4-uncensored-api.hf.space/v1",
    api_key="no-key",  # placeholder key
    timeout=600.0,  # seconds
)

# Request a streamed chat completion.
response = client.chat.completions.create(
    model="gemma",
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=2048,
    stream=True,
)

# Print the answer text as chunks arrive.
for chunk in response:
    if not chunk.choices:  # skip chunks that carry no choices
        continue
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)
print()
```
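
Since reasoning is enabled with `--reasoning-format deepseek`, the model's thinking may arrive separately from the answer. The flags look like llama.cpp server options, where this format places thoughts in a `reasoning_content` field next to `content`. The backend is not confirmed here, so treat that field name as an assumption. The sketch below uses a hypothetical helper, `split_reasoning`, to collect both streams, and runs offline on hand-made deltas rather than real server output.

```python
from types import SimpleNamespace


def split_reasoning(deltas):
    """Accumulate reasoning text and answer text from streamed deltas."""
    reasoning_parts, answer_parts = [], []
    for delta in deltas:
        # `reasoning_content` is an assumed, non-standard field; it may be absent.
        reasoning = getattr(delta, "reasoning_content", None)
        content = getattr(delta, "content", None)
        if reasoning:
            reasoning_parts.append(reasoning)
        if content:
            answer_parts.append(content)
    return "".join(reasoning_parts), "".join(answer_parts)


# Offline demonstration with hand-made deltas (not real server output).
sample = [
    SimpleNamespace(reasoning_content="The user greets me. ", content=None),
    SimpleNamespace(reasoning_content="Reply briefly.", content=None),
    SimpleNamespace(content="Hello"),
    SimpleNamespace(content="!"),
]
reasoning, answer = split_reasoning(sample)
```

With the streaming request from the Usage example, the helper could be fed `(chunk.choices[0].delta for chunk in response if chunk.choices)`. If the server does not emit `reasoning_content`, the first returned string is simply empty.
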