naimulislam committed · Commit f609326 (verified) · 1 Parent(s): 1cfa2ac

Update README.md

Files changed (1)
  1. README.md +47 -5
README.md CHANGED
@@ -1,10 +1,52 @@
  ---
- title: Qwen2.5 Coder 1.5B
- emoji:
- colorFrom: blue
- colorTo: purple
+ title: Qwen3-0.6B OpenAI-Compatible API
+ emoji: 🤖
+ colorFrom: purple
+ colorTo: blue
  sdk: docker
+ app_port: 7860
  pinned: false
  ---
 
- # Qwen2.5 Coder 1.5B
+ # Qwen3-0.6B OpenAI-Compatible API Server
+
+ An OpenAI-compatible API server hosting the **unsloth/Qwen3-0.6B-GGUF** model with:
+
+ - ✅ **32.8K context window**
+ - ✅ **16K max output tokens**
+ - ✅ **Smooth streaming responses**
+ - ✅ **Tool/Function calling support**
+ - ✅ **Thinking modes**: `true` (enabled), `false` (disabled), `auto` (model decides)
+ - ✅ **Beautiful Chat UI** for testing
+ - ✅ **No API key required**
+
+ ## API Endpoints
+
+ | Endpoint | Method | Description |
+ |----------|--------|-------------|
+ | `/` | GET | Chat UI |
+ | `/v1/models` | GET | List models |
+ | `/v1/chat/completions` | POST | Chat completions |
+ | `/health` | GET | Health check |
+
+ ## Usage Example
+
+ ```python
+ from openai import OpenAI
+
+ client = OpenAI(
+     base_url="https://YOUR-SPACE.hf.space/v1",
+     api_key="not-needed"
+ )
+
+ # Basic chat
+ response = client.chat.completions.create(
+     model="qwen3-0.6b",
+     messages=[{"role": "user", "content": "Hello!"}],
+     stream=True,
+     extra_body={"enable_thinking": "auto"}
+ )
+
+ for chunk in response:
+     if chunk.choices[0].delta.content:
+         print(chunk.choices[0].delta.content, end="")
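
Note on the diff: the new README's endpoint table lists `/health`, `/v1/models`, and `/v1/chat/completions`, but its usage example only exercises the last one through the `openai` package. The sketch below is not part of this commit. It shows one way the same endpoints might be reached with only the Python standard library, under these assumptions: the helper names `build_get`, `build_chat_request`, and `send` are hypothetical; `enable_thinking` is sent as a top-level JSON field (which is where the OpenAI client's `extra_body` places it); and the response formats of `/health` and `/v1/models` are not documented in the README, so `send` returns raw text rather than parsing it.

```python
import json
import urllib.request

# Placeholder taken from the README; replace with the real Space URL.
BASE_URL = "https://YOUR-SPACE.hf.space"

# The three thinking modes named in the README's feature list.
THINKING_MODES = ("true", "false", "auto")


def build_get(path: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for a documented endpoint."""
    return urllib.request.Request(BASE_URL + path, method="GET")


def build_chat_request(prompt: str, enable_thinking: str = "auto") -> urllib.request.Request:
    """Build (but do not send) a non-streaming POST to /v1/chat/completions.

    Assumption: the server reads `enable_thinking` from the top level of
    the JSON body, as in the README's `extra_body` usage.
    """
    if enable_thinking not in THINKING_MODES:
        raise ValueError(f"enable_thinking must be one of {THINKING_MODES}")
    payload = {
        "model": "qwen3-0.6b",
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
        "enable_thinking": enable_thinking,
    }
    return urllib.request.Request(
        BASE_URL + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> str:
    """Send a prepared request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

With a real Space URL in `BASE_URL`, `send(build_get("/health"))` would probe the health check and `send(build_chat_request("Hello!", "false"))` would request a completion with thinking disabled. Building the request separately from sending it keeps the payload easy to inspect before anything goes over the network.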