TS-Guard / README.md

Improve model card: Add metadata, paper, and code links

77834eb verified 3 days ago

1.64 kB

	---
	license: apache-2.0
	library_name: transformers
	pipeline_tag: text-generation
	tags:
	- safety
	- tool-use
	- guardrail
	- agents
	---

	# TS-Guard

	TS-Guard is a guardrail model for step-level tool invocation safety detection, introduced in the paper [ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback](https://huggingface.co/papers/2601.10156).

	TS-Guard is trained via reinforcement learning with a multi-task reward scheme tailored for agent security, enabling identifying harmful user requests and attack vectors in agent-environment interaction logs, detecting unsafe tool invocation before execution, and providing interpretable analysis and reasoning process.

	## Resources
	- Paper: [ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback](https://huggingface.co/papers/2601.10156)
	- Repository: [GitHub - MurrayTom/ToolSafe](https://github.com/MurrayTom/ToolSafe)

	![image](https://cdn-uploads.huggingface.co/production/uploads/66632a3d2dc4dff9a98c38a5/WglXmL5O1Se7L5KuA7T-3.png)


	![image](https://cdn-uploads.huggingface.co/production/uploads/66632a3d2dc4dff9a98c38a5/AL3z3FcEFwYCyFmWJI9k5.png)

	## Citation

	If you find our work helpful, please consider citing it:

	```bibtex
	@article{mou2026toolsafe,
	title={ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback},
	author={Mou, Yutao and Xue, Zhangchi and Li, Lijun and Liu, Peiyang and Zhang, Shikun and Ye, Wei and Shao, Jing},
	journal={arXiv preprint arXiv:2601.10156},
	year={2026}
	}
	```