Update README
README.md
CHANGED
|
@@ -1,29 +1,28 @@
|
|
| 1 |
-
|
| 2 |
-
|
| 3 |
A custom, bleeding-edge Reinforcement Learning environment built for the Meta Ad-Policy Hackathon. This sandbox evaluates the ability of Vision-Language Models (VLMs) and LLMs to act as autonomous ad moderators, navigating complex policy violations, multimodal traps, and illegal targeting.
|
| 4 |
|
| 5 |
-
|
| 6 |
-
|
| 7 |
-
|
| 8 |
-
|
| 9 |
-
|
| 10 |
-
|
| 11 |
-
|
| 12 |
-
The environment natively supports 4 distinct adversarial tasks, loadable via the `task_id` parameter:
|
| 13 |
-
1. `task_1_healthcare`: Evaluates ads for unapproved medical claims, pharmaceuticals, and subtle dog whistles.
|
| 14 |
-
2. `task_2_financial`: Evaluates ads for predatory financial services, scams, and high-pressure tactics.
|
| 15 |
-
3. `task_3_multimodal`: Detects policy violations hidden entirely within visual elements that bypass standard NLP text filters.
|
| 16 |
-
4. `task_4_targeting`: Identifies illegal demographic targeting (e.g., adult financial services targeting minors).
|
| 17 |
|
| 18 |
-
|
| 19 |
The environment exposes the following action space to the evaluating LLM:
|
| 20 |
-
* `analyze_image`: Request VLM context for visual elements.
|
| 21 |
-
* `request_landing_page`: Extract simulated URL endpoints.
|
| 22 |
-
* `request_id_verification`: Check advertiser trust scores.
|
| 23 |
-
* `approve` / `reject`: Terminal actions.
|
| 24 |
|
| 25 |
-
|
| 26 |
|
| 27 |
-
|
| 28 |
-
```bash
|
| 29 |
-
docker build -t meta-ad-sandbox .
|
| 1 |
+
Meta Ad-Policy RL Sandbox
|
| 2 |
A custom, bleeding-edge Reinforcement Learning environment built for the Meta Ad-Policy Hackathon. This sandbox evaluates the ability of Vision-Language Models (VLMs) and LLMs to act as autonomous ad moderators, navigating complex policy violations, multimodal traps, and illegal targeting.
|
| 3 |
|
| 4 |
+
Core Features
|
| 5 |
+
OpenEnv 0.2.3 Compliant: Fully implements the latest Meta OpenEnv specifications, including Pydantic StepResult state serialization and /step & /reset API endpoints.
|
| 6 |
+
Reward Shaping: Applies a strict -0.05 penalty per step so the agent learns to limit tool usage and avoid infinite analysis loops.
|
| 7 |
+
Multimodal Traps: Tests the limits of VLMs by presenting ads where the text is benign, but the visual elements contain severe policy violations.
|
| 8 |
+
Containerized Infrastructure: Fully Dockerized and lightweight; runs within the hackathon's 2 vCPU / 8 GB RAM constraints.
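To make the step penalty concrete, here is a minimal sketch of how an episode's total reward could add up. The -0.05 value comes from the feature list above; the terminal reward of 1.0 and the assumption that the penalty applies once per non-terminal tool call are illustrative guesses, not confirmed behavior of the environment.

```python
STEP_PENALTY = -0.05  # per-step penalty stated in the feature list


def shaped_return(num_tool_calls: int, terminal_reward: float) -> float:
    """Sum one penalty per tool call plus a single terminal reward.

    Assumption: the penalty is charged on each non-terminal tool call and
    the terminal reward (hypothetically 1.0 for a correct verdict) is
    added once when the agent approves or rejects.
    """
    if num_tool_calls < 0:
        raise ValueError("num_tool_calls must be non-negative")
    # Round to hide floating-point noise from repeated 0.05 increments.
    return round(num_tool_calls * STEP_PENALTY + terminal_reward, 10)


for calls in (0, 2, 5):
    print(calls, shaped_return(calls, terminal_reward=1.0))
```

Under these assumptions, every extra tool call lowers the achievable return, which is the pressure the penalty is meant to create.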
|
| 9 |
+
Evaluation Tasks
|
| 10 |
+
The environment natively supports 4 distinct adversarial tasks, loadable via the task_id parameter:
|
| 11 |
|
| 12 |
+
task_1_healthcare: Evaluates ads for unapproved medical claims, pharmaceuticals, and subtle dog whistles.
|
| 13 |
+
task_2_financial: Evaluates ads for predatory financial services, scams, and high-pressure tactics.
|
| 14 |
+
task_3_multimodal: Detects policy violations hidden entirely within visual elements that bypass standard NLP text filters.
|
| 15 |
+
task_4_targeting: Identifies illegal demographic targeting (e.g., adult financial services targeting minors).
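The four task IDs can be validated client-side before a task is requested. The sketch below is hypothetical: the README only says tasks are loadable via the task_id parameter, so the idea that it travels as a `task_id` key in a JSON payload is an assumption, and `build_reset_payload` is an illustrative helper, not part of the environment.

```python
# Task IDs exactly as listed above.
TASK_IDS = (
    "task_1_healthcare",
    "task_2_financial",
    "task_3_multimodal",
    "task_4_targeting",
)


def build_reset_payload(task_id: str) -> dict:
    """Return a payload selecting one task (assumed shape, see lead-in)."""
    if task_id not in TASK_IDS:
        raise ValueError(f"unknown task_id: {task_id!r}")
    return {"task_id": task_id}


for tid in TASK_IDS:
    print(build_reset_payload(tid))
```

Rejecting unknown IDs locally avoids spending an episode on a typo.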
|
| 16 |
+
Available Agent Tools
|
| 17 |
The environment exposes the following action space to the evaluating LLM:
|
| 18 |
|
| 19 |
+
analyze_image: Request VLM context for visual elements.
|
| 20 |
+
request_landing_page: Extract simulated URL endpoints.
|
| 21 |
+
request_id_verification: Check advertiser trust scores.
|
| 22 |
+
approve / reject: Terminal actions.
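One way an agent script might represent this action space is an enum plus a terminal check. This is a sketch only: the action names are taken from the list above, but the `Action` class and `is_terminal` helper are hypothetical and do not describe the environment's internal types.

```python
from enum import Enum


class Action(str, Enum):
    """Action names copied from the list above."""
    ANALYZE_IMAGE = "analyze_image"
    REQUEST_LANDING_PAGE = "request_landing_page"
    REQUEST_ID_VERIFICATION = "request_id_verification"
    APPROVE = "approve"
    REJECT = "reject"


# approve and reject end the episode; the other three gather evidence.
TERMINAL_ACTIONS = frozenset({Action.APPROVE, Action.REJECT})


def is_terminal(action_name: str) -> bool:
    """Return True for approve/reject; raise ValueError for unknown names."""
    return Action(action_name) in TERMINAL_ACTIONS


print(is_terminal("analyze_image"))
print(is_terminal("reject"))
```

Parsing the model's output through `Action(...)` catches hallucinated tool names before they reach the environment.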
|
| 23 |
+
Quick Start (Local)
|
| 24 |
+
1. Build the Docker Image: `docker build -t meta-ad-sandbox .`
|
| 25 |
+
|
| 26 |
+
2. Run the Environment Container: `docker run -p 8000:8000 meta-ad-sandbox`
|
| 27 |
|
| 28 |
+
3. Run the Automated Inference Agent: Make sure your Hugging Face credentials are set, then run the evaluation script to test the agent against all 4 tasks: `export HF_TOKEN="your_hugging_face_token"`, then `python inference.py`.
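For a manual check without `inference.py`, a client could talk to the running container directly. The following is a rough sketch under stated assumptions: the port comes from the `docker run` command above and the `/reset` and `/step` paths from the feature list, but the JSON request bodies (`task_id`, `action`) and the response fields (`reward`, `done`) are guesses at the StepResult shape and should be checked against the actual server.

```python
import json
from urllib import request

BASE_URL = "http://localhost:8000"  # port published by the docker run command


def post_json(path: str, payload: dict) -> dict:
    """POST a JSON payload to the sandbox and decode the JSON reply."""
    req = request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def run_episode(task_id, choose_action, post=post_json, max_steps=10):
    """Reset one task, then step until done or max_steps is reached.

    Assumed schema: /reset takes {"task_id": ...}, /step takes
    {"action": ...}, and each step reply carries "reward" and "done".
    """
    result = post("/reset", {"task_id": task_id})
    total = 0.0
    for _ in range(max_steps):
        action = choose_action(result)  # policy sees the latest reply
        result = post("/step", {"action": action})
        total += float(result.get("reward", 0.0))
        if result.get("done"):
            break
    return total


# Example (requires the container to be running):
# print(run_episode("task_1_healthcare", lambda obs: "reject"))
```

The `post` parameter is injectable so the loop can be exercised with a stub before pointing it at the real container.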