Commit · 4706b45
1 Parent(s): 9af4b77
Update README: Focus on CourseGPT-Pro router checkpoints
README.md CHANGED

```diff
@@ -39,9 +39,9 @@ A modern, user-friendly Gradio interface for **token-streaming, chat-style infer
 - **Dynamic system prompts** with automatic date insertion
 
 ### Model Variety
--
--
--
+- Purpose-built around **CourseGPT-Pro router checkpoints**
+- Two curated options: **Router-Qwen3-32B (8-bit)** and **Router-Gemma3-27B (8-bit)**
+- Both ship with the same JSON routing schema for math/code/general orchestration
 - **Efficient model loading** - one at a time with automatic cache clearing
 
 ### Advanced Controls
@@ -52,26 +52,8 @@ A modern, user-friendly Gradio interface for **token-streaming, chat-style infer
 
 ## Supported Models
 
--
-- **
-- **SmolLM2-360M-Instruct** - Lightweight conversation
-- **Taiwan-ELM-270M/1.1B** - Multilingual support
-- **Qwen3-0.6B/1.7B** - Fast inference
-
-### Mid-Size Models (2B-8B)
-- **Qwen3-4B/8B** - Balanced performance
-- **Phi-4-mini** (4.3B) - Reasoning & Instruct variants
-- **MiniCPM3-4B** - Efficient mid-size
-- **Gemma-3-4B-IT** - Instruction-tuned
-- **Llama-3.2-Taiwan-3B** - Regional optimization
-- **Mistral-7B-Instruct** - Classic performer
-- **DeepSeek-R1-Distill-Llama-8B** - Reasoning specialist
-
-### Large Models (14B+)
-- **Qwen3-14B** - Strong general purpose
-- **Apriel-1.5-15b-Thinker** - Multimodal reasoning
-- **gpt-oss-20b** - Open GPT-style
-- **Qwen3-32B** - Top-tier performance
+- **Router-Qwen3-32B-8bit** - Qwen3 32B base with CourseGPT-Pro routing LoRA merged and quantized for ZeroGPU. Best overall accuracy with modest latency.
+- **Router-Gemma3-27B-8bit** - Gemma3 27B base with the same router head, also in 8-bit. Slightly faster warm-up with a Gemma inductive bias that sometimes helps math-first prompts.
 
 ## How It Works
 
```
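
The new "Model Variety" bullets state that both router checkpoints share a JSON routing schema for math/code/general orchestration, but the diff itself does not show that schema. The sketch below only illustrates how a host application might consume such output; the field names (`route`, `confidence`, `rationale`), the example values, and the handler functions are assumptions, not the actual CourseGPT-Pro format.

```python
import json

# Hypothetical router output: the README names only the three route targets
# (math / code / general); the keys used here are illustrative assumptions.
raw = '{"route": "math", "confidence": 0.87, "rationale": "integral evaluation"}'
decision = json.loads(raw)

# Placeholder handlers standing in for the downstream specialist models.
handlers = {
    "math": lambda q: f"[math specialist] {q}",
    "code": lambda q: f"[code specialist] {q}",
    "general": lambda q: f"[general model] {q}",
}

query = "Evaluate the integral of x^2 from 0 to 1."
print(handlers.get(decision["route"], handlers["general"])(query))
```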
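
Both supported checkpoints are described as 8-bit builds prepared for ZeroGPU. The diff does not include repo IDs or loading code, so the snippet below is a minimal sketch under the assumption that the weights are published as pre-quantized bitsandbytes checkpoints; the repo ID is a placeholder.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo ID - the actual checkpoint paths are not part of this diff.
REPO_ID = "your-org/Router-Qwen3-32B-8bit"

tokenizer = AutoTokenizer.from_pretrained(REPO_ID)

# If the repo ships pre-quantized 8-bit (bitsandbytes) weights, transformers
# picks up the stored quantization config automatically; bitsandbytes and
# accelerate need to be installed for this to run on GPU.
model = AutoModelForCausalLM.from_pretrained(REPO_ID, device_map="auto")
```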
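
The retained feature bullet "**Efficient model loading** - one at a time with automatic cache clearing" describes a common pattern for Spaces that switch between large checkpoints. Below is a minimal sketch of that pattern, assuming PyTorch and transformers; the app's real loader is not shown in this diff.

```python
import gc
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Keep at most one model resident; drop the old one and clear the CUDA cache
# before loading the next checkpoint.
_current = {"name": None, "model": None, "tokenizer": None}

def load_model(name: str):
    if _current["name"] == name:
        return _current["model"], _current["tokenizer"]

    # Release references to the previous model, then reclaim GPU memory.
    _current.update(name=None, model=None, tokenizer=None)
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()

    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(
        name, device_map="auto", torch_dtype="auto"
    )
    _current.update(name=name, model=model, tokenizer=tokenizer)
    return model, tokenizer
```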
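
The unchanged bullet "**Dynamic system prompts** with automatic date insertion" is likewise easy to picture with a tiny sketch; the template text below is a stand-in, since the Space's real prompt is not part of this diff.

```python
from datetime import date

# Placeholder template; only the date-insertion mechanism is the point here.
SYSTEM_TEMPLATE = "You are a helpful assistant. Today's date is {today}."

def build_system_prompt() -> str:
    return SYSTEM_TEMPLATE.format(today=date.today().isoformat())

print(build_system_prompt())
```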