---
language: en
license: apache-2.0
tags:
- ai-detection
- lora
- phi-3
- llamafactory
base_model: microsoft/Phi-3-medium-128k-instruct
---

## 📘 Overview

Large Language Models (LLMs) have achieved human-level fluency in text generation, making it increasingly difficult to distinguish between human- and AI-authored content.
**IPAD (Inverse Prompt for AI Detection)** introduces a two-stage detection framework:

1. **Prompt Inverter**: predicts the underlying prompts that could have generated an input text.
2. **Distinguisher**: evaluates the alignment between the text and its predicted prompts to determine whether it was AI-generated.

The **Prompt Inverter**, **Distinguisher (RC)**, and **Distinguisher (PTCV)** are all **LoRA-fine-tuned versions** of [`microsoft/Phi-3-medium-128k-instruct`](https://huggingface.co/microsoft/Phi-3-medium-128k-instruct), trained using [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) for **robust AI text detection** under diverse and adversarial conditions.

- 🧩 **Distinguisher (RC)**: optimized for regular, unstructured text inputs (baseline detection).
- 🔬 **Distinguisher (PTCV)**: specialized for *structured, compositional, or out-of-distribution (OOD)* data, where it is more robust.

---

## 🚀 Quick Usage

### 🧩 Prompt Inverter

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "microsoft/Phi-3-medium-128k-instruct"
lora_model = "bellafc/IPAD/inverter"

# Load the base model, then attach the LoRA adapter
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, lora_model)

# For the Prompt Inverter (PI), the input should be in this format:
# "What is the prompt that generates the input text {text to be detected}?"
text = "What is the prompt that generates the input text ... ?"

# Tokenize the input and move it to the model's device
inputs = tokenizer(text, return_tensors="pt").to(model.device)

gen = model.generate(
    **inputs,
    max_new_tokens=128,  # assumed limit, not from the original card; adjust as needed
    output_scores=True,
    return_dict_in_generate=True
)
generated_text = tokenizer.decode(gen.sequences[0], skip_special_tokens=True)
print("Generated:", generated_text)
```

### 🧩 Distinguishers

```python
from torch.nn.functional import softmax
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "microsoft/Phi-3-medium-128k-instruct"
lora_model = "bellafc/IPAD/Distinguisher_PTCV"  # or Distinguisher_RC

# Load the base model, then attach the LoRA adapter
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, lora_model)

# For RC, the input should be in this format:
# "Can LLM generate the input text {text to be detected} through the prompt {prompt generated by Prompt Inverter (PI)}?"
# For PTCV, the input should be in this format:
# "Text2 is generated by LLM, determine whether text1 is also generated by LLM with a similar prompt. Text1: {text to be detected}. Text2: {regenerated text}"
text = "Text2 is generated by LLM, determine whether text1 is also generated by LLM with a similar prompt. Text1: ... . Text2: ... ."

# Tokenize the input and move it to the model's device
inputs = tokenizer(text, return_tensors="pt").to(model.device)

gen = model.generate(
    **inputs,
    max_new_tokens=10,
    output_scores=True,
    return_dict_in_generate=True
)
generated_text = tokenizer.decode(gen.sequences[0], skip_special_tokens=True)

# Probability of " yes" as the first generated token
probs = softmax(gen.scores[0], dim=-1)
yes_token_id = tokenizer(" yes", add_special_tokens=False).input_ids[0]

print("Generated:", generated_text)
print(f"P('yes') = {probs[0, yes_token_id].item():.4f}")
```

---

## 🚀 LLaMA-Factory Usage (Chat)

```bash
llamafactory-cli chat examples/inference/distinguisher_ptcv.yaml
```

### Example distinguisher_ptcv.yaml YAML configuration

```yaml
model_name_or_path: microsoft/Phi-3-medium-128k-instruct
adapter_name_or_path: bellafc/IPAD/Distinguisher_PTCV
template: phi
infer_backend: vllm
max_new_tokens: 128
temperature: 0.7
```

---

## 🚀 LLaMA-Factory Usage (API)

```bash
llamafactory-cli api examples/inference/lora_sft.yaml
```

### Example lora_sft.yaml YAML configuration

```yaml
model_name_or_path: microsoft/Phi-3-medium-128k-instruct
adapter_name_or_path: bellafc/IPAD/Distinguisher_PTCV
template: phi
infer_backend: vllm
max_new_tokens: 128
temperature: 0.7
```

Once the API server is running, run `python distinguishers_testscript.py`.

## ⚙️ Model Details

| Property | Description |
|-----------|-------------|
| **Base model** | [`microsoft/Phi-3-medium-128k-instruct`](https://huggingface.co/microsoft/Phi-3-medium-128k-instruct) |
| **Context length** | 128k tokens |
| **Framework** | [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) |
| **Task** | AI Text Detection (Discriminator) |
| **Language** | English |
| **License** | Apache 2.0 |
| **Author** | [@bellafc](https://huggingface.co/bellafc) |
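
The three input formats quoted in the Quick Usage comments (Prompt Inverter, Distinguisher RC, Distinguisher PTCV) can be assembled with small string helpers. The following is a minimal sketch: the function names `build_pi_prompt`, `build_rc_prompt`, and `build_ptcv_prompt` are hypothetical and not part of the IPAD release, and the templates are copied from the comments in the usage snippets rather than from any official formatting utility.

```python
def build_pi_prompt(text: str) -> str:
    """Input for the Prompt Inverter: asks which prompt produced `text`."""
    return f"What is the prompt that generates the input text {text}?"


def build_rc_prompt(text: str, predicted_prompt: str) -> str:
    """Input for Distinguisher (RC): pairs the text with the prompt
    predicted by the Prompt Inverter."""
    return (
        f"Can LLM generate the input text {text} "
        f"through the prompt {predicted_prompt}?"
    )


def build_ptcv_prompt(text: str, regenerated_text: str) -> str:
    """Input for Distinguisher (PTCV): compares the text with a text
    regenerated by an LLM from the predicted prompt."""
    return (
        "Text2 is generated by LLM, determine whether text1 is also "
        "generated by LLM with a similar prompt. "
        f"Text1: {text}. Text2: {regenerated_text}"
    )


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only
    sample = "The sky appears blue because of Rayleigh scattering"
    print(build_pi_prompt(sample))
    print(build_rc_prompt(sample, "Explain why the sky is blue"))
    print(build_ptcv_prompt(sample, "Sunlight scatters in the atmosphere"))
```

The resulting strings can be passed as `text` in the Quick Usage snippets. The predicted prompt and the regenerated text would come from running the Prompt Inverter and an LLM of your choice, respectively.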