metadata
title: Gemma-4-E4B Uncensored Q8 API
emoji: 🔓
colorFrom: pink
colorTo: pink
sdk: docker
app_port: 8000
pinned: false

OpenAI-compatible API for HauhauCS/Gemma-4-E4B-Uncensored-HauhauCS-Aggressive

Model Details

Spec          Value
Model         Gemma-4-E4B
Quantization  Q8_K_P (high quality)
Context       131072 tokens
Concurrent    1 request
Reasoning     Enabled by default (--jinja --reasoning-format deepseek)
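
Because the server handles one request at a time, a client that issues calls from several threads may want to queue them locally instead of letting them pile up on the server. The following is a minimal sketch of that idea, not part of this Space; the helper name `run_serialized` is hypothetical.

```python
import threading
from typing import Callable, TypeVar

T = TypeVar("T")

# One slot, mirroring the "Concurrent: 1 request" limit listed above.
_request_slot = threading.Semaphore(1)


def run_serialized(call: Callable[[], T]) -> T:
    """Run `call` while holding the single request slot.

    Other threads calling run_serialized block until the slot is free.
    """
    with _request_slot:
        return call()
```

A call to the API could then be wrapped as `run_serialized(lambda: client.chat.completions.create(...))`, where `client` is whatever OpenAI client object the caller already has.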

Endpoints

  • POST /v1/chat/completions: Chat completions (streaming recommended)
  • POST /v1/completions: Text completions
  • GET /v1/models: List models
  • GET /health: Health check
  • GET /api-info: JSON status
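
The GET endpoints can be probed with the standard library alone. This sketch assumes `/health` and `/api-info` are served from the Space root, as the paths above suggest, and reuses the host from the Usage example; `endpoint_url`, `fetch`, and `print_status` are hypothetical helper names.

```python
import urllib.request

# Host taken from the Usage example, without the /v1 suffix.
SPACE_URL = "https://nanobotaiagent-gemma4-uncensored-api.hf.space"


def endpoint_url(path: str, base: str = SPACE_URL) -> str:
    """Join the base URL and an endpoint path with exactly one slash."""
    return base.rstrip("/") + "/" + path.lstrip("/")


def fetch(path: str, timeout: float = 30.0) -> tuple[int, str]:
    """GET an endpoint and return (HTTP status, body text)."""
    with urllib.request.urlopen(endpoint_url(path), timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")


def print_status() -> None:
    """Print the status code and raw body of each GET endpoint."""
    for path in ("/health", "/api-info", "/v1/models"):
        status, body = fetch(path)
        print(path, status, body)
```

Calling `print_status()` performs the three requests. The bodies are printed as raw text because the exact response shape is not documented here.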

Usage

import openai

client = openai.OpenAI(
    base_url="https://nanobotaiagent-gemma4-uncensored-api.hf.space/v1",
    api_key="no-key",
    timeout=600.0,
)

response = client.chat.completions.create(
    model="gemma",
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=2048,
    stream=True,
)
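for chunk in response:
    if not chunk.choices:  # skip chunks that carry no choices
        continue
    delta = chunk.choices[0].delta
    if delta.content:
        # flush so tokens appear as they arrive
        print(delta.content, end="", flush=True)
print()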
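
With reasoning enabled, the model's thinking may arrive separately from the answer text. The sketch below assumes the server puts it in a `reasoning_content` field on each streamed delta, which is how the `deepseek` reasoning format is commonly exposed; that field name is an assumption and is not confirmed by this README. `split_stream` is a hypothetical helper.

```python
def split_stream(chunks):
    """Collect reasoning text and answer text separately from streamed chunks.

    Returns a (reasoning, answer) tuple of strings.
    """
    reasoning, answer = [], []
    for chunk in chunks:
        if not chunk.choices:  # skip chunks that carry no choices
            continue
        delta = chunk.choices[0].delta
        # Assumed field name; absent on servers that do not split reasoning.
        thought = getattr(delta, "reasoning_content", None)
        if thought:
            reasoning.append(thought)
        content = getattr(delta, "content", None)
        if content:
            answer.append(content)
    return "".join(reasoning), "".join(answer)
```

Passing the streaming `response` object from the example above, as in `thoughts, answer = split_stream(response)`, would consume the stream and return both parts once generation finishes.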