Qwen3-1.7B — Manipulation Detection in Dialogue

A LoRA adapter fine-tuned on top of unsloth/Qwen3-1.7B to detect psychological manipulation in conversations.

Given a dialogue, the model outputs:

Is it manipulative? (yes / no)
Which technique is used? (one of 8 classes, including Non-Manipulative)

Manipulation Techniques

Technique	Description
Non-Manipulative	Normal, non-coercive dialogue
Persuasion or Seduction	Flattery or appeals to desire to influence
Shaming or Belittlement	Making someone feel inadequate or guilty
Accusation	False or exaggerated blame
Intimidation	Threats or fear-based coercion
Rationalization	Justifying harmful behavior with false logic
Indirect Manipulation	Subtle, indirect control (guilt-tripping, hinting)
Role-based Manipulation	Exploiting authority, trust, or social roles

Model Details

Base model: unsloth/Qwen3-1.7B
Fine-tuning method: LoRA (rank 16, alpha 16, 16-bit)
Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Trained with: Unsloth + TRL SFTTrainer
Adapter size: ~67 MB
Language: English
Developed by: Mustangs007 (IIT Hyderabad M.Tech NLP Project)
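
As a rough sanity check on the adapter size, the LoRA parameter count can be estimated from the rank and the target modules listed above. The sketch below is only an estimate: the layer dimensions are assumptions about the Qwen3-1.7B architecture (they are not stated in this card and should be checked against the base model's config.json), and float32 storage of the adapter weights is also an assumption.

```python
# Assumed Qwen3-1.7B dimensions (NOT stated in this card; verify in config.json).
HIDDEN = 2048          # hidden size
INTERMEDIATE = 6144    # MLP intermediate size
NUM_LAYERS = 28        # transformer blocks
Q_OUT = 16 * 128       # attention heads * head_dim
KV_OUT = 8 * 128       # key/value heads * head_dim
RANK = 16              # LoRA rank, from Model Details

# (in_features, out_features) for each LoRA target module in one block
SHAPES = {
    "q_proj": (HIDDEN, Q_OUT),
    "k_proj": (HIDDEN, KV_OUT),
    "v_proj": (HIDDEN, KV_OUT),
    "o_proj": (Q_OUT, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_param_count(rank, shapes, num_layers):
    """Each adapted module adds A (rank x in) and B (out x rank) matrices."""
    per_layer = sum(rank * (n_in + n_out) for n_in, n_out in shapes.values())
    return per_layer * num_layers


total = lora_param_count(RANK, SHAPES, NUM_LAYERS)
size_mib = total * 4 / 2**20  # 4 bytes per parameter if stored as float32 (assumption)
print(total)     # 17432576
print(size_mib)  # 66.5
```

Under these assumptions the estimate comes to about 66.5 MiB, which is in line with the ~67 MB adapter size reported above.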

Quick Start

Requirements

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
pip install unsloth transformers peft

Run Inference

import re
import torch
from unsloth import FastLanguageModel

SYSTEM_PROMPT = """You are an expert at detecting manipulation in conversations.
Analyze the dialogue and classify it.

Respond in this exact format:
Manipulative: <yes/no>
Technique: <technique name or Non-Manipulative>

Valid techniques: Non-Manipulative, Persuasion or Seduction, Shaming or Belittlement,
Accusation, Intimidation, Rationalization, Indirect Manipulation, Role-based Manipulation"""

TECHNIQUES = [
    "Non-Manipulative", "Persuasion or Seduction", "Shaming or Belittlement",
    "Accusation", "Intimidation", "Rationalization",
    "Indirect Manipulation", "Role-based Manipulation",
]

# Load model (downloads adapter automatically)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="mustangs/qwen3-1.7b-manipulation-detection",
    max_seq_length=1024,
    load_in_4bit=False,
    load_in_16bit=True,
)
FastLanguageModel.for_inference(model)

# Run on a dialogue
dialogue = "A: You never do anything right. Any decent person would have finished this by now.\nB: I've been really busy.\nA: Excuses. You're just lazy."

prompt = (
    f"<|im_start|>system\n{SYSTEM_PROMPT}<|im_end|>\n"
    f"<|im_start|>user\nDialogue:\n{dialogue}<|im_end|>\n"
    f"<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)

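# Decode only the newly generated tokens, so the prompt is not echoed back
# and the result does not depend on splitting on the word "assistant".
prompt_len = inputs["input_ids"].shape[1]
response = tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True).strip()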
print(response)
# Manipulative: yes
# Technique: Shaming or Belittlement
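
The response is plain text in the two-line format requested by the system prompt, so downstream code usually needs to turn it into structured fields. The following is a minimal sketch of one way to do that; `parse_response` is a hypothetical helper, not part of the released inference script, and it returns `None` for any field it cannot parse instead of guessing.

```python
import re

TECHNIQUES = [
    "Non-Manipulative", "Persuasion or Seduction", "Shaming or Belittlement",
    "Accusation", "Intimidation", "Rationalization",
    "Indirect Manipulation", "Role-based Manipulation",
]


def parse_response(text):
    """Return (is_manipulative, technique); a field is None if it cannot be parsed."""
    verdict = re.search(r"Manipulative:\s*(yes|no)\b", text, flags=re.IGNORECASE)
    technique = re.search(r"Technique:\s*(.+)", text)

    is_manipulative = None if verdict is None else verdict.group(1).lower() == "yes"

    label = None
    if technique is not None:
        candidate = technique.group(1).strip().lower()
        # Accept only labels from the known list (case-insensitive match).
        label = next((t for t in TECHNIQUES if t.lower() == candidate), None)
    return is_manipulative, label


print(parse_response("Manipulative: yes\nTechnique: Shaming or Belittlement"))
# (True, 'Shaming or Belittlement')
```

Rejecting labels outside the known list keeps a malformed generation from silently entering evaluation or downstream logic.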

Training Details

Dataset: MentalManip + synthetic augmentation (~3000 dialogues)
Split: 70% train / 15% val / 15% test (stratified by technique)
Epochs: 3
Batch size: 2 per device (effective batch size 8 with 4 gradient accumulation steps)
Learning rate: 2e-4
Optimizer: adamw_8bit
Precision: fp16 / bf16 (auto-detected)
Sequence length: 1024 tokens
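
To illustrate the stratified 70/15/15 split, here is a minimal pure-Python sketch. It is not the project's actual splitting code (see the GitHub repository for that); the `stratified_split` function, the `technique` field name, and the toy data are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(examples, label_key="technique",
                     fractions=(0.70, 0.15, 0.15), seed=0):
    """Split into train/val/test so each label keeps roughly the same proportions."""
    by_label = defaultdict(list)
    for example in examples:
        by_label[example[label_key]].append(example)

    rng = random.Random(seed)
    train, val, test = [], [], []
    for label in sorted(by_label):  # sorted for reproducibility
        group = by_label[label]
        rng.shuffle(group)
        n_train = round(len(group) * fractions[0])
        n_val = round(len(group) * fractions[1])
        train += group[:n_train]
        val += group[n_train:n_train + n_val]
        test += group[n_train + n_val:]  # remainder goes to test
    return train, val, test


# Toy data: 20 dialogues for each of two techniques (hypothetical).
data = [
    {"dialogue": f"d{i}", "technique": t}
    for t in ("Accusation", "Intimidation")
    for i in range(20)
]
train, val, test = stratified_split(data)
print(len(train), len(val), len(test))  # 28 6 6
```

Splitting within each label group, rather than over the whole dataset at once, keeps rare techniques represented in the validation and test sets.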

GitHub Repository

https://github.com/Mustangs007/nlp-project-mtech

The full training code, EDA notebooks, and inference script are available there.

Limitations

Trained on English dialogue only.
Manipulation is context-dependent; the model may miss subtle or culturally specific patterns.
Should not be used as a standalone decision-making tool in sensitive or legal contexts.
