File size: 4,701 Bytes
4f4f783
 
 
 
 
 
 
 
 
 
 
5ddf35c
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
ab9e04a
5ddf35c
 
 
 
 
a879288
 
5ddf35c
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
79f1d4b
 
 
 
ab9e04a
5d95456
 
 
e57952a
 
5d95456
 
 
 
 
 
 
 
 
ab9e04a
 
 
 
e57952a
ab9e04a
e57952a
 
ab9e04a
 
 
 
 
 
 
 
 
 
5d95456
79f1d4b
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
---
language: en
license: apache-2.0
tags:
  - ai-detection
  - lora
  - phi-3
  - llamafactory
base_model: microsoft/Phi-3-medium-128k-instruct
---

## 📘 Overview

Large Language Models (LLMs) have achieved human-level fluency in text generation, making it increasingly difficult to distinguish between human- and AI-authored content.  
**IPAD (Inverse Prompt for AI Detection)** introduces a two-stage detection framework:

1. **Prompt Inverter** — predicts the underlying prompts that could have generated an input text.  
2. **Distinguisher** — evaluates the alignment between the text and its predicted prompts to determine whether it was AI-generated.

All **Prompt Inverter**, **Distinguisher (RC)** and **Distinguisher (PTCV)** are **LoRA-fine-tuned versions** of  
[`microsoft/Phi-3-medium-128k-instruct`](https://huggingface.co/microsoft/Phi-3-medium-128k-instruct),  
trained using [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) for **robust AI text detection** under diverse and adversarial conditions.

- 🧩 **Distinguisher (RC)** — optimized for regular, unstructured text inputs (baseline detection).  
- 🔬 **Distinguisher (PTCV)** — specialized for *structured, compositional, or OOD* data, exhibiting enhanced robustness.

---

## 🚀 Quick Usage

### 🧩 Prompt Inverter

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "microsoft/Phi-3-medium-128k-instruct"
lora_model = "bellafc/IPAD/inverter" 

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, lora_model)

# For PI, text should in this format: "What is the prompt that generates the input text {text to-be-detected}?
text = "What is the prompt that generates the input text ... ?"
gen = model.generate(
    **inputs,
    output_scores=True,
    return_dict_in_generate=True
)

generated_text = tokenizer.decode(gen.sequences[0], skip_special_tokens=True)
print("Generated:", generated_text)
```

### 🧩 Distinguishers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "microsoft/Phi-3-medium-128k-instruct"
lora_model = "bellafc/IPAD/Distinguisher_PTCV"  # or Distinguisher_RC

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, lora_model)

# For RC, text should in this format: "Can LLM generate the input text {text to-be detected} through the prompt {prompt generated by Prompt Inverter (PI)}?"
# For PTCV, text should in this format: "Text2 is generated by LLM, determine whether text1 is also generated by LLM with a similar prompt. Text1: {text to-be detected}. Text2: {Regenerated text}"

text = "Text2 is generated by LLM, determine whether text1 is also generated by LLM with a similar prompt. Text1: ... . Text2: ... ."
gen = model.generate(
    **inputs,
    max_new_tokens=10,
    output_scores=True,
    return_dict_in_generate=True
)

generated_text = tokenizer.decode(gen.sequences[0], skip_special_tokens=True)
probs = softmax(gen.scores[0], dim=-1)
yes_token_id = tokenizer(" yes", add_special_tokens=False).input_ids[0]
print("Generated:", generated_text)
print(f"P('yes') = {probs[0, yes_token_id].item():.4f}")
```

---

## 🚀 LLaMA-Factory Usage (Chat)
```python
llamafactory-cli chat examples/inference/distinguisher_ptcv.yaml
```
### Example lora_sft.yaml YAML configuration

```yaml
model_name_or_path: microsoft/Phi-3-medium-128k-instruct
adapter_name_or_path: bellafc/IPAD/Distinguisher_PTCV
template: phi
infer_backend: vllm
max_new_tokens: 128
temperature: 0.7
```

---

## 🚀 LLaMA-Factory Usage (API)
```python
llamafactory-cli api examples/inference/lora_sft.yaml
```
### Example lora_sft.yaml YAML configuration

```yaml
model_name_or_path: microsoft/Phi-3-medium-128k-instruct
adapter_name_or_path: bellafc/IPAD/Distinguisher_PTCV
template: phi
infer_backend: vllm
max_new_tokens: 128
temperature: 0.7
```
after that, run python distinguishers_testscript.py


## ⚙️ Model Details

| Property | Description |
|-----------|-------------|
| **Base model** | [`microsoft/Phi-3-medium-128k-instruct`](https://huggingface.co/microsoft/Phi-3-medium-128k-instruct) |
| **Context length** | 128k tokens |
| **Framework** | [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) |
| **Task** | AI Text Detection (Discriminator) |
| **Language** | English |
| **License** | Apache 2.0 |
| **Author** | [@bellafc](https://huggingface.co/bellafc) |