Laugh1ng committed
Commit 15d930e · verified · 1 Parent(s): baa4b8e

Update README.md

Files changed (1): README.md (+4 −5)
README.md CHANGED
@@ -8,7 +8,7 @@ tags:
 - lora
 - generated_from_trainer
 model-index:
-- name: TrustNet
+- name: IntentNet
   results: []
 ---
 
@@ -19,11 +19,10 @@ should probably proofread and complete it, then remove this comment. -->
 A fine-tuned version of Qwen/Qwen2.5-3B-Instruct designed to evaluate LLM agent reasoning and label the AI intention deceptiveness in multi-turn interactions.
 
 ## Overview
-TrustNet is trained through contrastive learning to improve upon the base Qwen2.5-3B-Instruct model. It learns to:
+IntentNet is trained through Supervised Fine-Tuning (SFT) to improve upon the base Qwen2.5-3B-Instruct model. It learns to:
 
-- Evaluate the users' response in multi-turn interactions.
-- Score a User Trust Score, which quantifies the degree of user trust in AI reflected in the response.
-- The User Trust Score is a continuous value in [0,1], where values near 1 indicate strong trust in AI, values near 0 indicate pronounced skepticism, and intermediate values (e.g., 0.5) represent a neutral or ambiguous stance.
+- Evaluate the LLM agent's reasoning process in multi-turn interactions.
+- Label the AI intention with a binary label indicating whether the AI's reasoning was deceptive (0: non-deceptive, 1: deceptive).
 
 ## Links
 - [Paper]() - arXiv:
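
The binary labeling scheme introduced in the updated Overview (0: non-deceptive, 1: deceptive) could be consumed as sketched below. This is a minimal, hypothetical helper: the `parse_intent_label` name and the assumed text output format are illustrations, not part of the model card.

```python
# Hypothetical sketch: map IntentNet's text output to the binary deception
# label described in the README (0: non-deceptive, 1: deceptive).
# The output format ("Label: 1", etc.) is an assumption for illustration.

LABELS = {0: "non-deceptive", 1: "deceptive"}

def parse_intent_label(output_text: str) -> int:
    """Return the first binary digit (0 or 1) found in the model's output."""
    for ch in output_text:
        if ch in "01":
            return int(ch)
    raise ValueError("no binary deception label found in model output")
```

For example, a model response such as `"Label: 1"` would map to the `"deceptive"` class via `LABELS[parse_intent_label(...)]`.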