File size: 1,638 Bytes
18ae114 77834eb 18ae114 77834eb 8e05260 77834eb 8e05260 77834eb |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 |
---
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- safety
- tool-use
- guardrail
- agents
---
# TS-Guard
TS-Guard is a guardrail model for step-level tool invocation safety detection, introduced in the paper [ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback](https://huggingface.co/papers/2601.10156).
TS-Guard is trained via reinforcement learning with a multi-task reward scheme tailored for agent security, enabling identifying harmful user requests and attack vectors in agent-environment interaction logs, detecting unsafe tool invocation before execution, and providing interpretable analysis and reasoning process.
## Resources
- **Paper:** [ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback](https://huggingface.co/papers/2601.10156)
- **Repository:** [GitHub - MurrayTom/ToolSafe](https://github.com/MurrayTom/ToolSafe)


## Citation
If you find our work helpful, please consider citing it:
```bibtex
@article{mou2026toolsafe,
title={ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback},
author={Mou, Yutao and Xue, Zhangchi and Li, Lijun and Liu, Peiyang and Zhang, Shikun and Ye, Wei and Shao, Jing},
journal={arXiv preprint arXiv:2601.10156},
year={2026}
}
``` |