WhitzardAgent
/

IntentNet

Generated from Trainer

Model card Files Files and versions

Laugh1ng commited on Feb 8

Commit

15d930e

·

verified ·

1 Parent(s): baa4b8e

Update README.md

Files changed (1) hide show

README.md +4 -5

README.md CHANGED Viewed

@@ -8,7 +8,7 @@ tags:
 - lora
 - generated_from_trainer
 model-index:
-- name: TrustNet
   results: []
 ---
@@ -19,11 +19,10 @@ should probably proofread and complete it, then remove this comment. -->
 A fine-tuned version of Qwen/Qwen2.5-3B-Instruct designed to evaluate LLM agent reasoning and label the AI intention deceptiveness in multi-turn interactions.
 ## Overview
-TrustNet is trained through contrastive learning to improve upon the base Qwen2.5-3B-Instruct model. It learns to:
-- Evaluate the users' response in multi-turn interactions.
-- Score a User Trust Score, which quantifies the degree of user trust in AI reflected in the response.
-- The User Trust Score is a continuous value in [0,1], where values near 1 indicate strong trust in AI, values near 0 indicate pronounced skepticism, and intermediate values (e.g., 0.5) represent a neutral or ambiguous stance.
 ## Links
 - [Paper]() - arXiv:

 - lora
 - generated_from_trainer
 model-index:
+- name: IntetntNet
   results: []
 ---
 A fine-tuned version of Qwen/Qwen2.5-3B-Instruct designed to evaluate LLM agent reasoning and label the AI intention deceptiveness in multi-turn interactions.
 ## Overview
+IntentNet is trained through Supervised Fine-Tuning (SFT) to improve upon the base Qwen2.5-3B-Instruct model. It learns to:
+- Evaluate the LLM agent reasoning process in multi-turn interactions.
+- Label the AI intention with binary labels, which indicates whether the AI thought decevptive or not (0: non-deceptive, 1:deceptive).
 ## Links
 - [Paper]() - arXiv: