Spaces: naimulislam/Qwen3-Coder-0.6B

naimulislam committed on Feb 25

Commit f609326 (verified) · 1 Parent(s): 1cfa2ac

Update README.md

Files changed (1)

README.md CHANGED

@@ -1,10 +1,52 @@
 ---
-title: Qwen2.5 Coder 1.5B
-emoji: ⚡
-colorFrom: blue
-colorTo: purple
 sdk: docker
 pinned: false
 ---
-# Qwen2.5 Coder 1.5B

 ---
+title: Qwen3-0.6B OpenAI-Compatible API
+emoji: 🤖
+colorFrom: purple
+colorTo: blue
 sdk: docker
+app_port: 7860
 pinned: false
 ---
+# Qwen3-0.6B OpenAI-Compatible API Server
+An OpenAI-compatible API server hosting the **unsloth/Qwen3-0.6B-GGUF** model with:
+- ✅ **32.8K context window**
+- ✅ **16K max output tokens**
+- ✅ **Smooth streaming responses**
+- ✅ **Tool/Function calling support**
+- ✅ **Thinking modes**: `true` (enabled), `false` (disabled), `auto` (model decides)
+- ✅ **Beautiful Chat UI** for testing
+- ✅ **No API key required**
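As an illustration of how the tool-calling and thinking-mode features listed above might be combined, here is a minimal sketch that only builds a request body for `/v1/chat/completions`. It assumes the server accepts the standard OpenAI `tools` schema and reads `enable_thinking` as a top-level field, which is where the OpenAI client's `extra_body` places it. The `get_weather` tool and the `build_chat_payload` helper are hypothetical names, and passing the mode as the strings `"true"`, `"false"`, or `"auto"` is an assumption; the README does not say whether booleans are also accepted.

```python
import json

# Hypothetical tool definition in the standard OpenAI function-calling format.
GET_WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool name, for illustration only
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# Assumption: the three modes named in the README are sent as strings.
VALID_THINKING_MODES = {"true", "false", "auto"}


def build_chat_payload(prompt, tools=None, enable_thinking="auto", stream=False):
    """Build a chat completions request body (helper name is illustrative)."""
    if enable_thinking not in VALID_THINKING_MODES:
        raise ValueError(
            f"enable_thinking must be one of {sorted(VALID_THINKING_MODES)}"
        )
    payload = {
        "model": "qwen3-0.6b",
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
        # Assumption: the server reads this top-level field, mirroring
        # extra_body={"enable_thinking": ...} in the OpenAI client.
        "enable_thinking": enable_thinking,
    }
    if tools:
        payload["tools"] = tools
    return payload


payload = build_chat_payload(
    "What is the weather in Dhaka?",
    tools=[GET_WEATHER_TOOL],
    enable_thinking="false",
)
print(json.dumps(payload, indent=2))
```

With the OpenAI client, the same fields would be passed as `tools=[...]` and `extra_body={"enable_thinking": "false"}`.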
+## API Endpoints
+| Endpoint | Method | Description |
+|----------|--------|-------------|
+| `/` | GET | Chat UI |
+| `/v1/models` | GET | List models |
+| `/v1/chat/completions` | POST | Chat completions |
+| `/health` | GET | Health check |
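The endpoints in the table above can be probed without the OpenAI client. The following is a minimal sketch using only the Python standard library; `YOUR-SPACE` is a placeholder, the `endpoint` and `fetch` helpers are hypothetical names, and the response bodies are returned as raw text because the README does not document their shape.

```python
import urllib.request

BASE_URL = "https://YOUR-SPACE.hf.space"  # placeholder; replace with the real Space URL


def endpoint(path, base_url=BASE_URL):
    """Join the base URL and an endpoint path without doubling slashes."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def fetch(path, base_url=BASE_URL, timeout=10):
    """GET an endpoint and return (HTTP status, body text).

    The body is left undecoded beyond UTF-8 text, since the response
    format of /health and /v1/models is not specified in the README.
    """
    with urllib.request.urlopen(endpoint(path, base_url), timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")


# URLs for the two read-only API endpoints from the table.
print(endpoint("/health"))
print(endpoint("/v1/models"))
```

Calling `fetch("/health")` against a running Space performs the actual request. When running the container locally, the base URL would presumably be `http://localhost:7860`, matching the `app_port` in the front matter.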
+## Usage Example
+```python
+from openai import OpenAI
+client = OpenAI(
+    base_url="https://YOUR-SPACE.hf.space/v1",
+    api_key="not-needed"
+)
+# Basic chat
+response = client.chat.completions.create(
+    model="qwen3-0.6b",
+    messages=[{"role": "user", "content": "Hello!"}],
+    stream=True,
+    extra_body={"enable_thinking": "auto"}
+)
+for chunk in response:
+    if chunk.choices[0].delta.content:
+        print(chunk.choices[0].delta.content, end="")