| --- |
| language: en |
| license: apache-2.0 |
| tags: |
| - ai-detection |
| - lora |
| - phi-3 |
| - llamafactory |
| base_model: microsoft/Phi-3-medium-128k-instruct |
| --- |
| |
| ## 📘 Overview |
|
|
| Large Language Models (LLMs) have achieved human-level fluency in text generation, making it increasingly difficult to distinguish between human- and AI-authored content. |
| **IPAD (Inverse Prompt for AI Detection)** introduces a two-stage detection framework: |
|
|
| 1. **Prompt Inverter** — predicts the underlying prompts that could have generated an input text. |
| 2. **Distinguisher** — evaluates the alignment between the text and its predicted prompts to determine whether it was AI-generated. |
|
|
| All **Prompt Inverter**, **Distinguisher (RC)** and **Distinguisher (PTCV)** are **LoRA-fine-tuned versions** of |
| [`microsoft/Phi-3-medium-128k-instruct`](https://huggingface.co/microsoft/Phi-3-medium-128k-instruct), |
| trained using [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) for **robust AI text detection** under diverse and adversarial conditions. |
|
|
| - 🧩 **Distinguisher (RC)** — optimized for regular, unstructured text inputs (baseline detection). |
| - 🔬 **Distinguisher (PTCV)** — specialized for *structured, compositional, or OOD* data, exhibiting enhanced robustness. |
|
|
| --- |
|
|
| ## 🚀 Quick Usage |
|
|
| ### 🧩 Prompt Inverter |
|
|
| ```python |
| from transformers import AutoModelForCausalLM, AutoTokenizer |
| from peft import PeftModel |
| |
| base_model = "microsoft/Phi-3-medium-128k-instruct" |
| lora_model = "bellafc/IPAD/inverter" |
| |
| tokenizer = AutoTokenizer.from_pretrained(base_model) |
| model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype="auto", device_map="auto") |
| model = PeftModel.from_pretrained(model, lora_model) |
| |
| # For PI, text should in this format: "What is the prompt that generates the input text {text to-be-detected}? |
| text = "What is the prompt that generates the input text ... ?" |
| gen = model.generate( |
| **inputs, |
| output_scores=True, |
| return_dict_in_generate=True |
| ) |
| |
| generated_text = tokenizer.decode(gen.sequences[0], skip_special_tokens=True) |
| print("Generated:", generated_text) |
| ``` |
|
|
| ### 🧩 Distinguishers |
|
|
| ```python |
| from transformers import AutoModelForCausalLM, AutoTokenizer |
| from peft import PeftModel |
| |
| base_model = "microsoft/Phi-3-medium-128k-instruct" |
| lora_model = "bellafc/IPAD/Distinguisher_PTCV" # or Distinguisher_RC |
| |
| tokenizer = AutoTokenizer.from_pretrained(base_model) |
| model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype="auto", device_map="auto") |
| model = PeftModel.from_pretrained(model, lora_model) |
| |
| # For RC, text should in this format: "Can LLM generate the input text {text to-be detected} through the prompt {prompt generated by Prompt Inverter (PI)}?" |
| # For PTCV, text should in this format: "Text2 is generated by LLM, determine whether text1 is also generated by LLM with a similar prompt. Text1: {text to-be detected}. Text2: {Regenerated text}" |
| |
| text = "Text2 is generated by LLM, determine whether text1 is also generated by LLM with a similar prompt. Text1: ... . Text2: ... ." |
| gen = model.generate( |
| **inputs, |
| max_new_tokens=10, |
| output_scores=True, |
| return_dict_in_generate=True |
| ) |
| |
| generated_text = tokenizer.decode(gen.sequences[0], skip_special_tokens=True) |
| probs = softmax(gen.scores[0], dim=-1) |
| yes_token_id = tokenizer(" yes", add_special_tokens=False).input_ids[0] |
| print("Generated:", generated_text) |
| print(f"P('yes') = {probs[0, yes_token_id].item():.4f}") |
| ``` |
|
|
| --- |
|
|
| ## 🚀 LLaMA-Factory Usage (Chat) |
| ```python |
| llamafactory-cli chat examples/inference/distinguisher_ptcv.yaml |
| ``` |
| ### Example lora_sft.yaml YAML configuration |
| |
| ```yaml |
| model_name_or_path: microsoft/Phi-3-medium-128k-instruct |
| adapter_name_or_path: bellafc/IPAD/Distinguisher_PTCV |
| template: phi |
| infer_backend: vllm |
| max_new_tokens: 128 |
| temperature: 0.7 |
| ``` |
| |
| --- |
| |
| ## 🚀 LLaMA-Factory Usage (API) |
| ```python |
| llamafactory-cli api examples/inference/lora_sft.yaml |
| ``` |
| ### Example lora_sft.yaml YAML configuration |
| |
| ```yaml |
| model_name_or_path: microsoft/Phi-3-medium-128k-instruct |
| adapter_name_or_path: bellafc/IPAD/Distinguisher_PTCV |
| template: phi |
| infer_backend: vllm |
| max_new_tokens: 128 |
| temperature: 0.7 |
| ``` |
| after that, run python distinguishers_testscript.py |
| |
| |
| ## ⚙️ Model Details |
| |
| | Property | Description | |
| |-----------|-------------| |
| | **Base model** | [`microsoft/Phi-3-medium-128k-instruct`](https://huggingface.co/microsoft/Phi-3-medium-128k-instruct) | |
| | **Context length** | 128k tokens | |
| | **Framework** | [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) | |
| | **Task** | AI Text Detection (Discriminator) | |
| | **Language** | English | |
| | **License** | Apache 2.0 | |
| | **Author** | [@bellafc](https://huggingface.co/bellafc) | |
| |