MurrayTom
/

TS-Guard

Model card Files Files and versions

TS-Guard / README.md

MurrayTom's picture

Update README.md

8e05260 verified 13 days ago

|

history blame contribute delete

661 Bytes

	---
	license: apache-2.0
	---

	TS-Guard is a guardrail model for step-level tool invocation safety detection. TS-Guard is trained via reinforcement learning with a multi-task reward scheme tailored for agent security, enabling identifying harmful user requests and attack vectors in agent-environment interaction logs, detecting unsafe tool invocation before execution, and providing interpretable analysis and reasoning process


	![image](https://cdn-uploads.huggingface.co/production/uploads/66632a3d2dc4dff9a98c38a5/WglXmL5O1Se7L5KuA7T-3.png)


	![image](https://cdn-uploads.huggingface.co/production/uploads/66632a3d2dc4dff9a98c38a5/AL3z3FcEFwYCyFmWJI9k5.png)