Commit 77834eb (verified) by nielsr (HF Staff) · 1 Parent(s): ad2f82d

Improve model card: Add metadata, paper, and code links

Hi! I'm Niels from the Hugging Face community science team.

This PR aims to improve your model card by:
- Adding relevant metadata tags (`library_name`, `pipeline_tag`, and `tags`) to enhance discoverability on the Hub.
- Linking the model to its research paper: [ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback](https://huggingface.co/papers/2601.10156).
- Adding a link to the official GitHub repository for easy access to the code.
- Including a BibTeX citation for proper attribution.

These changes help users understand the model's context and find related resources.
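
For reference, the metadata added here is plain YAML front matter at the top of `README.md`. Below is a minimal sketch of reading it back. It assumes only the flat `key: value` and `- item` layout used in this PR. The `parse_front_matter` helper and the `README_HEAD` constant are hypothetical names used for illustration, and a real tool would likely rely on a proper YAML parser instead.

```python
# Front matter copied from the README.md diff in this PR.
README_HEAD = """\
---
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- safety
- tool-use
- guardrail
- agents
---

# TS-Guard
"""


def parse_front_matter(text):
    """Parse the flat `key: value` / `- item` subset of YAML shown above.

    Hypothetical helper for illustration only; it is not a general YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter block at the top of the file

    meta = {}
    current_key = None  # key that is currently collecting list items
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter ends the metadata block
            break
        if not stripped:
            continue
        if stripped.startswith("- ") and current_key is not None:
            meta[current_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            key, value = key.strip(), value.strip()
            if value:
                meta[key] = value
                current_key = None
            else:
                # A key with no inline value starts a list in this layout.
                meta[key] = []
                current_key = key
    return meta


meta = parse_front_matter(README_HEAD)
print(meta["library_name"], meta["pipeline_tag"], meta["tags"])
```

The Hub reads these same fields to decide which library snippet, task filter, and tag filters apply to the model page.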

Please let me know if you have any questions!

Files changed (1)
  1. README.md +28 -1
README.md CHANGED
@@ -1,11 +1,38 @@
  ---
  license: apache-2.0
+ library_name: transformers
+ pipeline_tag: text-generation
+ tags:
+ - safety
+ - tool-use
+ - guardrail
+ - agents
  ---

- TS-Guard is a guardrail model for step-level tool invocation safety detection. TS-Guard is trained via reinforcement learning with a multi-task reward scheme tailored for agent security, enabling identifying harmful user requests and attack vectors in agent-environment interaction logs, detecting unsafe tool invocation before execution, and providing interpretable analysis and reasoning process
+ # TS-Guard

+ TS-Guard is a guardrail model for step-level tool invocation safety detection, introduced in the paper [ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback](https://huggingface.co/papers/2601.10156).
+
+ TS-Guard is trained via reinforcement learning with a multi-task reward scheme tailored for agent security, enabling identifying harmful user requests and attack vectors in agent-environment interaction logs, detecting unsafe tool invocation before execution, and providing interpretable analysis and reasoning process.
+
+ ## Resources
+ - **Paper:** [ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback](https://huggingface.co/papers/2601.10156)
+ - **Repository:** [GitHub - MurrayTom/ToolSafe](https://github.com/MurrayTom/ToolSafe)

  ![image](https://cdn-uploads.huggingface.co/production/uploads/66632a3d2dc4dff9a98c38a5/WglXmL5O1Se7L5KuA7T-3.png)


  ![image](https://cdn-uploads.huggingface.co/production/uploads/66632a3d2dc4dff9a98c38a5/AL3z3FcEFwYCyFmWJI9k5.png)
+
+ ## Citation
+
+ If you find our work helpful, please consider citing it:
+
+ ```bibtex
+ @article{mou2026toolsafe,
+ title={ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback},
+ author={Mou, Yutao and Xue, Zhangchi and Li, Lijun and Liu, Peiyang and Zhang, Shikun and Ye, Wei and Shao, Jing},
+ journal={arXiv preprint arXiv:2601.10156},
+ year={2026}
+ }
+ ```