Zichen1024 commited on
Commit
b2369c5
·
verified ·
1 Parent(s): ad38aa2

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +64 -3
README.md CHANGED
@@ -1,3 +1,64 @@
1
- ---
2
- license: apache-2.0
3
- ---
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: apache-2.0
3
+ ---
4
+ <p align="center">
5
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/670a8557222579c05ec3005c/Mxt8wCxBLOs674WQ0jMIG.png" alt="CoVe Mascot" width="400"/>
6
+ </p>
7
+
8
+ <p align="center">
9
+ <a href="https://arxiv.org/abs/2603.01940">📄 Paper</a> &nbsp;|&nbsp;
10
+ <a href="https://cove-agent.github.io">🌐 Website</a> &nbsp;|&nbsp;
11
+ <a href="https://huggingface.co/datasets/Zichen1024/CoVe-12k">🤗 Dataset</a> &nbsp;|&nbsp;
12
+ </p>
13
+
14
+ ## Overview
15
+
16
+ **CoVe-4B** is a compact 4B interactive tool-use agent fine-tuned from [Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) using the **CoVe** (Constraint-Verification) post-training framework. It is trained on [CoVe-12K](https://huggingface.co/datasets/Zichen1024/CoVe-12k), a dataset of 12K high-quality multi-turn tool-use trajectories synthesized and verified by deterministic constraint checking.
17
+
18
+ ## Framework
19
+
20
+ <p style="text-align: justify;">
21
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/670a8557222579c05ec3005c/H-hAal3Fch-ibqsL0IbRs.png" alt="CoVe Framework" style="max-width: 100%"/>
22
+ <br>
23
+ <em>The CoVe framework. Explicit constraints are fuzzified to guide a User Simulator LLM, and original constraints act as a deterministic checklist to verify the agent's tool invocations.</em>
24
+ </p>
25
+
26
+ ## Performance
27
+
28
+ <p style="text-align: justify;">
29
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/670a8557222579c05ec3005c/oa_WerjgyohzU0Xt-t5gs.png" alt="Main Results Table" style="max-width: 100%"/>
30
+ <br>
31
+ <em>Main results on τ²-bench. CoVe-4B achieves top performance in the ≤8B group and rivals models up to 70B.</em>
32
+ </p>
33
+
34
+ ## Deployment and Evaluation
35
+
36
+ CoVe-4B uses the Hermes tool-call format and can be deployed with [vLLM](https://github.com/vllm-project/vllm).
37
+
38
+ ### Serve with vLLM
39
+
40
+ ```bash
41
+ CUDA_VISIBLE_DEVICES=0,1,2,3 vllm serve [MODEL_HF_URL] \
42
+ --served-model-name CoVe \
43
+ --enable-auto-tool-choice \
44
+ --tool-call-parser hermes \
45
+ --tensor-parallel-size 1 \
46
+ --data-parallel-size 4 \
47
+ --host 0.0.0.0 \
48
+ --port ${PORT}
49
+ ```
50
+
51
+ ### Evaluate with τ²-bench
52
+
53
+ Once the model is running, evaluate using the [official τ²-bench code](https://github.com/sierra-research/tau2-bench). Set the agent model to the vLLM-served CoVe endpoint.
54
+
55
+ ## Citation
56
+
57
+ ```bibtex
58
+ @article{Chen2026CoVe,
59
+ title = {CoVe: Training Interactive Tool-Use Agents via Constraint-Guided Verification},
60
+ author = {Chen, Jinpeng and Gong, Cheng and Li, Hanbo and Liu, Ziru and Tian, Zichen and Fu, Xinyu and Wu, Shi and Zhang, Chenyang and Zhang, Wu and Zhang, Suiyun and Tu, Dandan and Liu, Rui},
61
+ journal = {arXiv preprint arXiv:PLACEHOLDER_ARXIV_ID},
62
+ year = {2026}
63
+ }
64
+ ```