File size: 1,638 Bytes
18ae114
 
77834eb
 
 
 
 
 
 
18ae114
 
77834eb
8e05260
77834eb
 
 
 
 
 
 
8e05260
 
 
 
 
77834eb
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
---
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- safety
- tool-use
- guardrail
- agents
---

# TS-Guard

TS-Guard is a guardrail model for step-level tool invocation safety detection, introduced in the paper [ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback](https://huggingface.co/papers/2601.10156).

TS-Guard is trained via reinforcement learning with a multi-task reward scheme tailored for agent security, enabling identifying harmful user requests and attack vectors in agent-environment interaction logs, detecting unsafe tool invocation before execution, and providing interpretable analysis and reasoning process.

## Resources
- **Paper:** [ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback](https://huggingface.co/papers/2601.10156)
- **Repository:** [GitHub - MurrayTom/ToolSafe](https://github.com/MurrayTom/ToolSafe)

![image](https://cdn-uploads.huggingface.co/production/uploads/66632a3d2dc4dff9a98c38a5/WglXmL5O1Se7L5KuA7T-3.png)


![image](https://cdn-uploads.huggingface.co/production/uploads/66632a3d2dc4dff9a98c38a5/AL3z3FcEFwYCyFmWJI9k5.png)

## Citation

If you find our work helpful, please consider citing it:

```bibtex
@article{mou2026toolsafe,
  title={ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback},
  author={Mou, Yutao and Xue, Zhangchi and Li, Lijun and Liu, Peiyang and Zhang, Shikun and Ye, Wei and Shao, Jing},
  journal={arXiv preprint arXiv:2601.10156},
  year={2026}
}
```