lovedheart committed
Commit 96586a7 · verified · 1 Parent(s): 650df97

Upload folder using huggingface_hub

Files changed (2):
  1. README.md +1 -31
  2. config.json +1 -1
README.md CHANGED
@@ -5,37 +5,7 @@ language:
 - zh
 base_model:
 - Qwen/Qwen3.5-2B
- - agentscope-ai/CoPaw-Flash-2B
 ---
-
- # FP8 Quantized Version
- Quick launch via vLLM:
- ```bash
- LLM_MEMORY_PROFILER_ESTIMATE_CUDAGRAPHS=1 \
- vllm serve --model ~/CoPaw-Flash-2B-FP8/ \
- --host 0.0.0.0 \
- --port 8070 \
- --tensor-parallel-size 1 \
- --max-model-len 262144 \
- --gpu-memory-utilization 0.92 \
- --trust-remote-code \
- --tokenizer-mode auto \
- --served-model-name CoPaw-Flash-2B \
- --max-num-batched-tokens 4096 \
- --max-num-seqs 1 \
- --enable-auto-tool-choice \
- --tool-call-parser qwen3_coder \
- --kv-cache-dtype fp8_e4m3 \
- --reasoning-parser qwen3 \
- --enable-prefix-caching \
- --enable-chunked-prefill
- ```
-
- ## Expected speed (5060Ti)
-
- ![NVIDIA_RTX_5060_TI,vllm,CoPaw-Flash-2B-FP8,20260404_215431](https://cdn-uploads.huggingface.co/production/uploads/68121d80da035a609e569a81/5hG8WD7NHM1f6RxWATZ3V.png)
-
-
 # CoPaw-Flash-2B
 **CoPaw-Flash** is a lightweight model deeply optimized for the CoPaw autonomous agent scenario. Since its training phase, the model has been specifically refined for CoPaw tasks, delivering enhanced agentic performance in tool invocation, command execution, memory management, and multi-step planning.

@@ -160,4 +130,4 @@ CoPaw-Flash is developed by the AgentScope Team. If you would like to leave us a

 | [Discord](https://discord.gg/eYMpfnkG8h) | [X (Twitter)](https://x.com/agentscope_ai) | [DingTalk](https://qr.dingtalk.com/action/joingroup?code=v1,k1,OmDlBXpjW+I2vWjKDsjvI9dhcXjGZi3bQiojOq3dlDw=&_dt_no_comment=1&origin=11) |
 | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
- | [<img src="https://gw.alicdn.com/imgextra/i1/O1CN01hhD1mu1Dd3BWVUvxN_!!6000000000238-2-tps-400-400.png" width="80" height="80" alt="Discord">](https://discord.gg/eYMpfnkG8h) | [<img src="https://img.alicdn.com/imgextra/i4/O1CN01c0GOsa1UTkoxAGVvZ_!!6000000002519-2-tps-225-225.png" width="80" height="80" alt="X">](https://x.com/agentscope_ai) | [<img src="https://img.alicdn.com/imgextra/i2/O1CN01vCWI8a1skHtLGXEMQ_!!6000000005804-2-tps-458-460.png" width="80" height="80" alt="DingTalk">](https://qr.dingtalk.com/action/joingroup?code=v1,k1,OmDlBXpjW+I2vWjKDsjvI9dhcXjGZi3bQiojOq3dlDw=&_dt_no_comment=1&origin=11) |
+ | [<img src="https://gw.alicdn.com/imgextra/i1/O1CN01hhD1mu1Dd3BWVUvxN_!!6000000000238-2-tps-400-400.png" width="80" height="80" alt="Discord">](https://discord.gg/eYMpfnkG8h) | [<img src="https://img.alicdn.com/imgextra/i4/O1CN01c0GOsa1UTkoxAGVvZ_!!6000000002519-2-tps-225-225.png" width="80" height="80" alt="X">](https://x.com/agentscope_ai) | [<img src="https://img.alicdn.com/imgextra/i2/O1CN01vCWI8a1skHtLGXEMQ_!!6000000005804-2-tps-458-460.png" width="80" height="80" alt="DingTalk">](https://qr.dingtalk.com/action/joingroup?code=v1,k1,OmDlBXpjW+I2vWjKDsjvI9dhcXjGZi3bQiojOq3dlDw=&_dt_no_comment=1&origin=11) |
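The `vllm serve` command removed above exposes an OpenAI-compatible HTTP API. As a rough sketch of how a client would talk to that server (assuming it is running locally on the command's port 8070 with served model name `CoPaw-Flash-2B`; the helper function below is illustrative, not part of this repo), using only the Python standard library:

```python
# Minimal client sketch for a vLLM OpenAI-compatible server, matching the
# host/port/served-model-name from the launch command above. The
# /v1/chat/completions path is vLLM's OpenAI-compatible chat route.
import json
import urllib.request

def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for the /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("http://localhost:8070", "CoPaw-Flash-2B",
                         "Summarize what an autonomous agent does.")
# To actually send (requires the server to be running):
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

Note that `--max-num-seqs 1` in the launch command limits the server to one concurrent sequence, so parallel client requests would be queued rather than batched.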
config.json CHANGED
@@ -13,7 +13,7 @@
 "attention_dropout": 0.0,
 "attn_output_gate": true,
 "bos_token_id": null,
- "dtype": "float32",
+ "dtype": "bfloat16",
 "eos_token_id": 248044,
 "full_attention_interval": 4,
 "head_dim": 256,