Update pipeline tag and add library metadata
Hi! I'm Niels from the Hugging Face community team.
I've opened this PR to improve the model card metadata and discoverability:
- Updated the `pipeline_tag` to `text-generation`, since this is an LLM-based paraphrasing model.
- Added `library_name: peft`, since this is a LoRA adapter; this enables the "Use in Transformers" button to show the correct code snippets.
- Added the arXiv ID to the metadata to link this model to its research paper.
- Included the paper authors for better attribution.
Let me know if you have any questions!
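For a quick sanity check of the keys this PR touches, here is a minimal, illustrative sketch that extracts the YAML front matter from a model card and reads back the new metadata. This is not Hub tooling (the Hub does its own validation, and a production parser would use PyYAML or `huggingface_hub`'s `ModelCardData`); it handles only the simple `key: value` and `key:` + `- item` shapes used in this card, with the field values taken from the diff below:

```python
def front_matter(readme: str) -> dict:
    """Parse the leading `--- ... ---` block into a flat dict.

    Handles only the simple `key: value` and `key:` + `- item`
    shapes used in this card; a real parser would use PyYAML.
    """
    lines = readme.splitlines()
    assert lines[0].strip() == "---", "model card must start with front matter"
    end = lines[1:].index("---") + 1  # index of the closing `---`
    meta, current_key = {}, None
    for line in lines[1:end]:
        if line.startswith("- ") and current_key:
            meta[current_key].append(line[2:].strip())  # list item under current key
        else:
            key, _, value = line.partition(":")
            current_key = key.strip()
            # scalar value, or an empty list to collect `- item` lines
            meta[current_key] = value.strip() or []
    return {k: v for k, v in meta.items() if v}  # drop keys with no value

# Front matter as it would read after this PR (from the diff below).
card = """---
base_model:
- Qwen/Qwen3-4B-Instruct-2507
datasets:
- yaful/MAGE
language:
- en
license: mit
pipeline_tag: text-generation
library_name: peft
arxiv: 2602.08934
---

# StealthRL LoRA Adapter for Qwen3-4B-Instruct (PEFT)
"""

meta = front_matter(card)
print(meta["pipeline_tag"])  # text-generation
print(meta["library_name"])  # peft
print(meta["base_model"])    # ['Qwen/Qwen3-4B-Instruct-2507']
```
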
README.md
CHANGED

```diff
@@ -1,17 +1,22 @@
 ---
-
+base_model:
+- Qwen/Qwen3-4B-Instruct-2507
 datasets:
 - yaful/MAGE
 language:
 - en
-
-
-
+license: mit
+pipeline_tag: text-generation
+library_name: peft
+arxiv: 2602.08934
 ---
+
 # StealthRL LoRA Adapter for Qwen3-4B-Instruct (PEFT)
 
 This repository hosts a **LoRA (Low-Rank Adaptation) adapter** for the base model
-**Qwen/Qwen3-4B-Instruct-2507**,
+**Qwen/Qwen3-4B-Instruct-2507**, presented in the paper [StealthRL: Reinforcement Learning Paraphrase Attacks for Multi-Detector Evasion of AI-Text Detectors](https://huggingface.co/papers/2602.08934).
+
+The authors of the paper are Suraj Ranganath and Atharv Ramesh.
 
 It is an **adapter-only** release (PEFT). The full base model is not included and must be downloaded separately from Hugging Face.
 
@@ -21,14 +26,14 @@ It is an **adapter-only** release (PEFT). The full base model is not included an
 
 **StealthRL** is a reinforcement learning framework for generating **adversarial paraphrases** that evade **multiple AI-text detectors** while preserving semantics.
 
-
+Key contributions from the paper:
 - StealthRL trains a **paraphrase policy** against a **multi-detector ensemble**
 - Uses **Group Relative Policy Optimization (GRPO)** with **LoRA adapters** on **Qwen3-4B**
 - Optimizes a **composite reward** that balances **detector evasion** with **semantic preservation**
 - Evaluates transfer to a **held-out detector family**, suggesting shared vulnerabilities rather than detector-specific brittleness
 
-Paper: https://arxiv.org/abs/2602.08934
-Code: https://github.com/suraj-ranganath/StealthRL
+Paper: [https://arxiv.org/abs/2602.08934](https://arxiv.org/abs/2602.08934)
+Code: [https://github.com/suraj-ranganath/StealthRL](https://github.com/suraj-ranganath/StealthRL)
 
 ---
 
@@ -57,7 +62,7 @@ from peft import PeftModel
 import torch
 
 base_model = "Qwen/Qwen3-4B-Instruct-2507"
-adapter_repo = "
+adapter_repo = "suraj-ranganath/StealthRL-Qwen3-4B-LORA"  # Update with actual repo path if different
 
 tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)
 model = AutoModelForCausalLM.from_pretrained(
@@ -107,8 +112,8 @@ print(tokenizer.decode(out[0], skip_special_tokens=True))
 
 ## Associated Paper and Code
 
-- **Paper (arXiv)**: https://arxiv.org/abs/2602.08934
-- **GitHub Repository**: https://github.com/suraj-ranganath/StealthRL
+- **Paper (arXiv)**: [https://arxiv.org/abs/2602.08934](https://arxiv.org/abs/2602.08934)
+- **GitHub Repository**: [https://github.com/suraj-ranganath/StealthRL](https://github.com/suraj-ranganath/StealthRL)
 
 ---
 
```