license: apache-2.0

TS-Guard is a guardrail model for step-level safety detection of tool invocations. It is trained via reinforcement learning with a multi-task reward scheme tailored to agent security. This enables it to identify harmful user requests and attack vectors in agent-environment interaction logs, detect unsafe tool invocations before they are executed, and provide an interpretable analysis along with its reasoning process.
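To make "detecting unsafe tool invocation before execution" concrete, the following is a minimal sketch of how a step-level guard could sit between an agent and its tools. It does not use TS-Guard's actual interface, which this README does not describe: `Verdict`, `guarded_invoke`, and `keyword_judge` are hypothetical names, and `keyword_judge` is a trivial stand-in for a call to the guardrail model.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Tuple


@dataclass
class Verdict:
    """Hypothetical guard output: a safety decision plus its reasoning."""
    unsafe: bool
    reasoning: str


# A judge maps (interaction log, proposed tool call) to a Verdict.
# In practice this would wrap the guardrail model; here it is an assumption.
Judge = Callable[[List[Dict[str, Any]], Dict[str, Any]], Verdict]


def guarded_invoke(
    log: List[Dict[str, Any]],
    tool_call: Dict[str, Any],
    judge: Judge,
    tools: Dict[str, Callable[..., Any]],
) -> Tuple[Optional[Any], Verdict]:
    """Check one proposed tool call before running it (step-level gating)."""
    verdict = judge(log, tool_call)
    if verdict.unsafe:
        # Block the call and record why, so the decision stays interpretable.
        log.append({"role": "guard", "content": verdict.reasoning})
        return None, verdict
    result = tools[tool_call["name"]](**tool_call["arguments"])
    log.append({"role": "tool", "name": tool_call["name"], "content": result})
    return result, verdict


def keyword_judge(log: List[Dict[str, Any]], tool_call: Dict[str, Any]) -> Verdict:
    """Stand-in judge for illustration only; NOT how TS-Guard decides."""
    if "rm -rf" in str(tool_call.get("arguments", {})):
        return Verdict(True, "Destructive shell command detected.")
    return Verdict(False, "No known risk pattern found.")
```

The judge is passed in as a plain callable so that the stand-in can later be replaced by a function that queries the model, without changing the gating logic.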
