nielsr (HF Staff) committed on
Commit 08d98ea · verified · 1 Parent(s): 71d265c

Update pipeline tag and add library metadata


Hi! I'm Niels from the Hugging Face community team.

I've opened this PR to improve the model card metadata and discoverability:
- Updated the `pipeline_tag` to `text-generation` as this is a paraphrasing model (LLM).
- Added `library_name: peft` since this is a LoRA adapter, which enables the "Use in Transformers" button to show the correct code snippets.
- Added the arXiv ID to the metadata to link this model to its research paper.
- Included the paper authors for better attribution.

Let me know if you have any questions!
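For reviewers who want to sanity-check the metadata changes, here is a minimal sketch that verifies the updated front matter carries the new keys. It assumes the final front matter shown in the diff below, and `parse_flat_fields` is a hypothetical naive flat `key: value` reader for illustration, not a real YAML parser:

```python
# Front matter as it appears after this PR (from the diff below).
FRONT_MATTER = """\
---
base_model:
- Qwen/Qwen3-4B-Instruct-2507
datasets:
- yaful/MAGE
language:
- en
license: mit
pipeline_tag: text-generation
library_name: peft
arxiv: 2602.08934
---
"""

def parse_flat_fields(text: str) -> dict:
    """Collect top-level `key: value` pairs between the --- delimiters.

    List items ("- ...") and list headers ("datasets:") are skipped; this
    naive reader only handles the flat scalar fields checked below.
    """
    fields = {}
    inside = False
    for line in text.splitlines():
        if line.strip() == "---":
            inside = not inside
            continue
        if inside and ":" in line and not line.startswith("-"):
            key, _, value = line.partition(":")
            if value.strip():  # skip list headers with no inline value
                fields[key.strip()] = value.strip()
    return fields

meta = parse_flat_fields(FRONT_MATTER)
print(meta["pipeline_tag"])  # text-generation
print(meta["library_name"])  # peft
```

The same check can be run against the live repo by fetching the README with `huggingface_hub` instead of the inline string.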

Files changed (1)
  1. README.md +16 -11
README.md CHANGED

```diff
@@ -1,17 +1,22 @@
 ---
-license: mit
+base_model:
+- Qwen/Qwen3-4B-Instruct-2507
 datasets:
 - yaful/MAGE
 language:
 - en
-base_model:
-- Qwen/Qwen3-4B-Instruct-2507
-pipeline_tag: reinforcement-learning
+license: mit
+pipeline_tag: text-generation
+library_name: peft
+arxiv: 2602.08934
 ---
+
 # StealthRL LoRA Adapter for Qwen3-4B-Instruct (PEFT)
 
 This repository hosts a **LoRA (Low-Rank Adaptation) adapter** for the base model
-**Qwen/Qwen3-4B-Instruct-2507**, trained using the **StealthRL** methodology.
+**Qwen/Qwen3-4B-Instruct-2507**, presented in the paper [StealthRL: Reinforcement Learning Paraphrase Attacks for Multi-Detector Evasion of AI-Text Detectors](https://huggingface.co/papers/2602.08934).
+
+The authors of the paper are Suraj Ranganath and Atharv Ramesh.
 
 It is an **adapter-only** release (PEFT). The full base model is not included and must be downloaded separately from Hugging Face.
@@ -21,14 +26,14 @@ It is an **adapter-only** release (PEFT). The full base model is not included an
 
 **StealthRL** is a reinforcement learning framework for generating **adversarial paraphrases** that evade **multiple AI-text detectors** while preserving semantics.
 
-From the paper:
+Key contributions from the paper:
 - StealthRL trains a **paraphrase policy** against a **multi-detector ensemble**
 - Uses **Group Relative Policy Optimization (GRPO)** with **LoRA adapters** on **Qwen3-4B**
 - Optimizes a **composite reward** that balances **detector evasion** with **semantic preservation**
 - Evaluates transfer to a **held-out detector family**, suggesting shared vulnerabilities rather than detector-specific brittleness
 
-Paper: https://arxiv.org/abs/2602.08934
-Code: https://github.com/suraj-ranganath/StealthRL
+Paper: [https://arxiv.org/abs/2602.08934](https://arxiv.org/abs/2602.08934)
+Code: [https://github.com/suraj-ranganath/StealthRL](https://github.com/suraj-ranganath/StealthRL)
 
 ---
@@ -57,7 +62,7 @@ from peft import PeftModel
 import torch
 
 base_model = "Qwen/Qwen3-4B-Instruct-2507"
-adapter_repo = "YOUR_HF_USERNAME/YOUR_ADAPTER_REPO"
+adapter_repo = "suraj-ranganath/StealthRL-Qwen3-4B-LORA"  # Update with actual repo path if different
 
 tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)
 model = AutoModelForCausalLM.from_pretrained(
@@ -107,8 +112,8 @@ print(tokenizer.decode(out[0], skip_special_tokens=True))
 
 ## Associated Paper and Code
 
-- **Paper (arXiv)**: https://arxiv.org/abs/2602.08934
-- **GitHub Repository**: https://github.com/suraj-ranganath/StealthRL
+- **Paper (arXiv)**: [https://arxiv.org/abs/2602.08934](https://arxiv.org/abs/2602.08934)
+- **GitHub Repository**: [https://github.com/suraj-ranganath/StealthRL](https://github.com/suraj-ranganath/StealthRL)
 
 ---
```
119