Zip Ye committed on
Commit
2df16de
·
1 Parent(s): fa1aa1c

Update README.md

Files changed (1)
  1. README.md +11 -9
README.md CHANGED
@@ -2,7 +2,7 @@
2
  <img src="assets/logo.png" alt="SEAGLE Logo" width="250"/>
3
  </div>
4
 
5
- # SEAGLE: Safe-Aware EAGLE
6
 
7
  **SEAGLE** is a safety-aware speculative decoding policy based on [SGLang](https://github.com/sgl-project/sglang). It embeds a lightweight probe model into the draft loop of [EAGLE-3](https://github.com/SafeAILab/EAGLE) speculative decoding, performs real-time safety monitoring on each decoding step, dynamically adjusts draft tokens, and triggers a fallback mechanism when unsafe content is continuously detected.
8
 
@@ -46,7 +46,7 @@ The designed **safety mechanism** is embedded within each round of speculative d
46
 
47
  ## 🚀 2. Quick Start
48
 
49
- We have open-sourced the draft model and probe for [Qwen3-235B-A22B-Instruct-2507](https://modelscope.cn/models/Qwen/Qwen3-235B-A22B-Instruct-2507). You can download it along with our compatible [draft](https://www.modelscope.cn/models/Alibaba-AAIG/SEAGLE/tree/master/draft_probe_suite/draft_model) and [probe](https://www.modelscope.cn/models/Alibaba-AAIG/SEAGLE/tree/master/draft_probe_suite/probe) models to experience safe inference.
50
 
51
  ### 📦 2.1 Install Dependencies
52
 
@@ -75,7 +75,7 @@ from sglang.srt.server_args import ServerArgs
75
  from sglang.srt.entrypoints.http_server import launch_server as _launch_server
76
 
77
  # =========================================================
78
- # Launch SGlang Server with Safe-Aware Eagle3 Decoding
79
  # =========================================================
80
  MODEL_PATH = "your_qwen3_235b_a22b_instruct_2507_path"
81
  DRAFT_MODEL_PATH = "draft_probe_suite/draft_model"
@@ -243,9 +243,9 @@ We begin by evaluating the acceleration performance of our draft models, encompa
243
  | **Ours (Pre-trained)** | **2.7 / 734 (1.51x)** | **3.3 / 848 (1.73x)** | **3.1 / 617 (1.41x)** | **2.8 / 706 (1.51x)** | **4.4 / 1083 (2.23x)** | 3.2 / 637 (1.46x) | **4.4 / 1084 (2.32x)** | **2.9 / 691 (1.47x)** | **2.7 / 702 (1.45x)** | **4.1 / 1093 (2.23x)** |
244
  | Ours (After Joint-train) | 2.42 / 665 (1.37x) | 3.33 / 870 (1.77x) | 2.8 / 564 (1.29x) | 2.5 / 638 (1.36x) | 4.15 / 1016 (2.09x) | 2.85 / 591 (1.36x) | 4.30 / 1070 (2.29x) | 2.7 / 630 (1.34x) | 2.5 / 650 (1.34x) | 3.85 / 1030 (2.1x) |
245
 
246
- > **Note:** Our pre-trained draft model can be found [here](https://www.modelscope.cn/models/Alibaba-AAIG/SEAGLE/tree/master/draft_probe_suite/pretrained_draft_model). Compared to the [Meituan](https://modelscope.cn/models/lmsys/SGLang-EAGLE3-Qwen3-235B-A22B-Instruct-2507-SpecForge-Meituan) version, our Eagle Head has undergone accelerated training specifically for Chinese. The pre-trained version can be used standalone as an Eagle Head for Qwen3-235B-A22B-Instruct-2507, delivering outstanding acceleration performance in both Chinese and English.
247
 
248
- **Launch with Standard SGLang Command:**
249
 
250
  ```bash
251
  export SGLANG_ALLOW_OVERWRITE_LONGER_CONTEXT_LEN=1
@@ -281,9 +281,11 @@ Evaluate the probe's impact on normal chatting data (query safety & response saf
281
  | :--- | :---: | :---: | :---: |
282
  | FuseChat-Mixture | 50,000 | 0.99506 | 0.00494 |
283
 
284
- ### βš–οΈ 3.3 Utility and Safety
285
 
286
- The trained probe is integrated into the Eagle3 decoding pipeline. Using an SGLang & Single Request configuration, the general utility and security of the SafeAware decoding strategy are evaluated.
 
 
287
 
288
  #### (1) Utility Performance
289
 
@@ -312,8 +314,8 @@ Safety scores are evaluated based on the discriminative reward model (DRM), gene
312
  | 📎 [Chinese: 100 High-Risk](assets/valuesTest_zh_hard_100.jsonl) | DRM Score | 0.43 | 0.49 | **0.83** |
313
  | | QwQ Score | 0.70 | 0.70 | **0.92** |
314
  | | GRM Score | 0.23 | 0.31 | **0.81** |
315
- | 📊 [English Log](assets/GRM_judge_log_en.xlsx) | Evaluation | ✅ | - | ✅ |
316
- | 📊 [Chinese Log](assets/GRM_judge_log_zh.xlsx) | Evaluation | ✅ | - | ✅ |
317
 
318
  ---
319
 
 
2
  <img src="assets/logo.png" alt="SEAGLE Logo" width="250"/>
3
  </div>
4
 
5
+ # SEAGLE: Safety-Aware EAGLE
6
 
7
  **SEAGLE** is a safety-aware speculative decoding policy based on [SGLang](https://github.com/sgl-project/sglang). It embeds a lightweight probe model into the draft loop of [EAGLE-3](https://github.com/SafeAILab/EAGLE) speculative decoding, performs real-time safety monitoring on each decoding step, dynamically adjusts draft tokens, and triggers a fallback mechanism when unsafe content is continuously detected.
8
 
 
46
 
47
  ## 🚀 2. Quick Start
48
 
49
+ We have open-sourced the draft model and probe for [Qwen3-235B-A22B-Instruct-2507](https://modelscope.cn/models/Qwen/Qwen3-235B-A22B-Instruct-2507). You can download it along with our compatible [draft](https://huggingface.co/Alibaba-AAIG/SEAGLE/tree/main/draft_probe_suite/draft_model) and [probe](https://huggingface.co/Alibaba-AAIG/SEAGLE/tree/main/draft_probe_suite/probe) models to experience safe inference.
50
 
51
  ### 📦 2.1 Install Dependencies
52
 
 
75
  from sglang.srt.entrypoints.http_server import launch_server as _launch_server
76
 
77
  # =========================================================
78
+ # Launch SGlang Server with Safety-Aware Eagle3 Decoding
79
  # =========================================================
80
  MODEL_PATH = "your_qwen3_235b_a22b_instruct_2507_path"
81
  DRAFT_MODEL_PATH = "draft_probe_suite/draft_model"
 
243
  | **Ours (Pre-trained)** | **2.7 / 734 (1.51x)** | **3.3 / 848 (1.73x)** | **3.1 / 617 (1.41x)** | **2.8 / 706 (1.51x)** | **4.4 / 1083 (2.23x)** | 3.2 / 637 (1.46x) | **4.4 / 1084 (2.32x)** | **2.9 / 691 (1.47x)** | **2.7 / 702 (1.45x)** | **4.1 / 1093 (2.23x)** |
244
  | Ours (After Joint-train) | 2.42 / 665 (1.37x) | 3.33 / 870 (1.77x) | 2.8 / 564 (1.29x) | 2.5 / 638 (1.36x) | 4.15 / 1016 (2.09x) | 2.85 / 591 (1.36x) | 4.30 / 1070 (2.29x) | 2.7 / 630 (1.34x) | 2.5 / 650 (1.34x) | 3.85 / 1030 (2.1x) |
245
 
246
+ > **Note:** Our pre-trained draft model can be found [here](https://huggingface.co/Alibaba-AAIG/SEAGLE/tree/main/draft_probe_suite/pretrained_draft_model). Compared to the [Meituan](https://modelscope.cn/models/lmsys/SGLang-EAGLE3-Qwen3-235B-A22B-Instruct-2507-SpecForge-Meituan) version, our Eagle Head has undergone accelerated training specifically for Chinese. The pre-trained version can be used standalone as an Eagle Head for Qwen3-235B-A22B-Instruct-2507, delivering outstanding acceleration performance in both Chinese and English.
247
 
248
+ **Launch with Standard SGLang CLI:**
249
 
250
  ```bash
251
  export SGLANG_ALLOW_OVERWRITE_LONGER_CONTEXT_LEN=1
 
281
  | :--- | :---: | :---: | :---: |
282
  | FuseChat-Mixture | 50,000 | 0.99506 | 0.00494 |
283
 
284
+ > **Note:** Even if the probe occasionally produces false positives, the safety-aware speculative decoding mechanism still ensures that the generated responses are meaningful and valuable.
285
 
286
+ ### βš–οΈ 3.3 End-to-End Utility and Safety
287
+
288
+ The trained probe is integrated into the Eagle3 decoding pipeline. We evaluate the end-to-end utility and safety of the SafeAware decoding strategy using an SGLang single-request configuration.
289
 
290
  #### (1) Utility Performance
291
 
 
314
  | 📎 [Chinese: 100 High-Risk](assets/valuesTest_zh_hard_100.jsonl) | DRM Score | 0.43 | 0.49 | **0.83** |
315
  | | QwQ Score | 0.70 | 0.70 | **0.92** |
316
  | | GRM Score | 0.23 | 0.31 | **0.81** |
317
+ | 📊 [English Log](assets/GRM_judge_log_en.xlsx) | Logs | ✅ | - | ✅ |
318
+ | 📊 [Chinese Log](assets/GRM_judge_log_zh.xlsx) | Logs | ✅ | - | ✅ |
319
 
320
  ---
321
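
The README text in this diff describes SEAGLE as running a probe on each decoding step, adjusting draft tokens, and falling back when unsafe content is detected continuously. The following is a minimal, self-contained sketch of that control flow only. It is not SEAGLE's implementation: the names `gated_decode`, `GateResult`, `threshold`, and `patience` are hypothetical, the probe is a stand-in callable, and "adjusting draft tokens" is simplified here to discarding a flagged draft step.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class GateResult:
    tokens: List[str]   # draft tokens kept so far
    fell_back: bool     # True if the fallback path was triggered


def gated_decode(
    draft_steps: Sequence[Sequence[str]],
    probe: Callable[[Sequence[str]], float],
    threshold: float = 0.5,   # hypothetical: unsafe-score cutoff
    patience: int = 2,        # hypothetical: consecutive unsafe steps before fallback
) -> GateResult:
    """Toy safety gate over a sequence of draft steps (illustrative only)."""
    kept: List[str] = []
    consecutive_unsafe = 0
    for step_tokens in draft_steps:
        unsafe_score = probe(step_tokens)
        if unsafe_score >= threshold:
            consecutive_unsafe += 1
            if consecutive_unsafe >= patience:
                # Unsafe content detected on consecutive steps: stop and fall back.
                return GateResult(kept, True)
            # Single unsafe step: discard this step's draft tokens and continue.
            continue
        consecutive_unsafe = 0
        kept.extend(step_tokens)
    return GateResult(kept, False)


# Stand-in probe: flags any step containing the marker token "bad".
def toy_probe(tokens: Sequence[str]) -> float:
    return 1.0 if "bad" in tokens else 0.0


isolated = gated_decode([["a"], ["bad"], ["b"]], toy_probe)
repeated = gated_decode([["a"], ["bad"], ["bad"], ["c"]], toy_probe)
```

In this toy setup, a single flagged step is dropped and decoding continues, while two flagged steps in a row end the loop with `fell_back=True`. What the real fallback does is not specified in this diff.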