	---
	title: Ad Fraud Investigation Environment
	emoji: "\U0001F575\uFE0F"
	colorFrom: red
	colorTo: yellow
	sdk: docker
	pinned: false
	app_port: 8000
	tags:
	- openenv
	- ad-fraud
	- reinforcement-learning
	base_path: /web
	---

	# Ad Fraud Investigation Environment

	An OpenEnv environment that simulates ad fraud review - a real-world task where AI agents investigate queues of advertisements, uncover fraud signals, and render verdicts under budget constraints.

	Ad fraud costs the digital advertising industry over $100 billion annually. Platforms like Meta process billions of ads daily and ban advertisers only at high confidence thresholds. Unlike simple classification, real ad review is a sequential decision-making problem: a reviewer starts with limited surface-level signals, actively chooses what to investigate within a constrained budget, and must decide when enough evidence exists to commit to a verdict. This environment captures that workflow and provides a training ground for agents to learn it.

	## Quick Start

	### Install

	```bash
	pip install -e .
	```

	### Run the server

	```bash
	uvicorn server.app:app --host 0.0.0.0 --port 8000
	```

	### Use the client

	```python
	from ad_fraud_env import AdFraudEnv, AdReviewAction

	with AdFraudEnv(base_url="http://localhost:8000").sync() as env:
	    # Start a seeded episode on the first task
	    result = env.reset(seed=42, task_id="task_1")
	    print(result.observation.queue_summary)

	    # Investigate an ad
	    result = env.step(AdReviewAction(
	        action_type="investigate",
	        ad_id="ad_001",
	        investigation_target="landing_page",
	    ))
	    print(result.observation.feedback)

	    # Render a verdict
	    result = env.step(AdReviewAction(
	        action_type="verdict",
	        ad_id="ad_001",
	        verdict="reject",
	        confidence=0.9,
	    ))
	    print(f"Reward: {result.reward}, Done: {result.done}")
	```

	### Run with Docker

	```bash
	docker build -t ad-fraud-env .
	docker run -p 8000:8000 ad-fraud-env
	```

	## Environment Design

	### Episode flow

	Each episode is a review session. The agent receives a queue of ads and must process them within a limited action budget:

	```
	reset(task_id, seed)
	        |
	        v
	+----------------------------------+<-----------------------+
	|  Observe queue + first ad info   |                        |
	+-------+--------------------------+                        |
	        |                                                   |
	        v                                                   |
	+---------------+     +--------------------+                |
	| investigate   |---->| Reveal one signal  |----------------+
	+---------------+     | (costs 1 budget)   |                |
	        |             +--------------------+                |
	        v                                                   |
	+---------------+     +--------------------+                |
	| verdict       |---->| approve / reject   |----------------+
	+---------------+     | / escalate         |                |
	        |             +--------------------+                |
	        v                                                   |
	+---------------+     +--------------------+                |
	| link_accounts |---->| Flag fraud ring    |----------------+
	+---------------+     | (Task 3 only)      |
	        |             +--------------------+
	        v
	Budget exhausted or all ads reviewed -> episode ends
	```
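
The loop in the diagram can be sketched without a running server. The `ToyQueue` class below is a hypothetical in-memory stand-in, not the real environment: it assumes each investigation costs one budget point and that a verdict removes the ad from the queue at no budget cost. The placeholder policy always escalates; a real agent would decide from the revealed signals.

```python
from dataclasses import dataclass, field


@dataclass
class ToyQueue:
    """Hypothetical stand-in for the environment (not the real server)."""
    pending: list
    budget: int
    log: list = field(default_factory=list)

    def investigate(self, ad_id, target):
        # Assumption: every investigation costs exactly one budget point.
        if self.budget <= 0:
            raise RuntimeError("budget exhausted")
        self.budget -= 1
        self.log.append(("investigate", ad_id, target))

    def verdict(self, ad_id, decision):
        # Assumption: a verdict closes the ad and costs no budget.
        self.pending.remove(ad_id)
        self.log.append(("verdict", ad_id, decision))

    @property
    def done(self):
        # Mirrors the diagram: budget exhausted or all ads reviewed.
        return self.budget <= 0 or not self.pending


def run_episode(queue, targets, per_ad):
    """Investigate up to `per_ad` targets per ad, then render a verdict."""
    while not queue.done:
        ad_id = queue.pending[0]
        for target in targets[:per_ad]:
            if queue.budget <= 0:
                break
            queue.investigate(ad_id, target)
        queue.verdict(ad_id, "escalate")  # placeholder decision
    return queue


q = run_episode(
    ToyQueue(pending=["ad_001", "ad_002", "ad_003"], budget=4),
    targets=["landing_page", "payment_method"],
    per_ad=2,
)
print(q.budget, q.pending, len(q.log))  # 0 ['ad_003'] 6
```

With a budget of 4 and two investigations per ad, the third ad is never reached, which illustrates why per-ad spending has to be planned against the whole queue.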

	### Tasks

	Three tasks with increasing difficulty test different capabilities:

	| Task | Name | Ads | Budget | Composition | Challenge |
	|---|---|---:|---:|---|---|
	| 1 | Basic Ad Triage | 5 | 25 | 2 legit, 3 obvious fraud | Learn the investigate -> verdict loop |
	| 2 | Sophisticated Fraud | 12 | 30 | 5 legit, 5 sophisticated scams, 2 gray-area | Triage under budget pressure (~2.5 actions/ad) |
	| 3 | Fraud Network Detection | 20 | 35 | 6 legit, 10 fraud (3 hidden rings), 4 gray-area | Cross-ad reasoning to detect coordinated networks (~1.75 actions/ad) |

	Task 3 introduces fraud rings - clusters of 3-5 ads controlled by the same actor, using varied topologies (cliques, chains, hub-and-spoke). Individual ring members look borderline; the fraud signal is only visible by cross-referencing investigation data across ads (shared payment IDs, matching template hashes, overlapping targeting fingerprints).
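
The cross-referencing idea can be sketched as a small grouping step. The snippet assumes investigation findings have already been parsed into per-ad key/value signals; the `findings` structure, the field names, and the `candidate_links` helper are all hypothetical, since the environment returns findings as text.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical findings, already parsed into per-ad signal values.
findings = {
    "ad_003": {"payment_id": "pmt_ring_48231", "template_hash": "tpl_a"},
    "ad_007": {"payment_id": "pmt_ring_48231", "template_hash": "tpl_a"},
    "ad_011": {"payment_id": "pmt_solo_1", "template_hash": "tpl_a"},
    "ad_015": {"payment_id": "pmt_solo_2", "template_hash": "tpl_b"},
}


def candidate_links(findings, min_shared=2):
    """Return ad pairs that share at least `min_shared` signal values."""
    # Group ads by each (signal name, value) they expose.
    by_value = defaultdict(set)
    for ad_id, signals in findings.items():
        for key, value in signals.items():
            by_value[(key, value)].add(ad_id)

    # Record, for every pair of ads, which signal names they share.
    shared = defaultdict(set)
    for (key, _value), ads in by_value.items():
        for pair in combinations(sorted(ads), 2):
            shared[pair].add(key)

    return {
        pair: sorted(keys)
        for pair, keys in shared.items()
        if len(keys) >= min_shared
    }


print(candidate_links(findings))
# {('ad_003', 'ad_007'): ['payment_id', 'template_hash']}
```

Requiring two shared signals is a design choice for this sketch: a single shared template hash also matches `ad_011`, and an incorrect link is penalized.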

	### Action Space

	Actions are JSON objects. Three types:

	`investigate` - spend one budget point to reveal a signal about an ad.

	```json
	{
	  "action_type": "investigate",
	  "ad_id": "ad_001",
	  "investigation_target": "landing_page"
	}
	```

	Each ad has six investigation dimensions:

	| Target | What it reveals |
	|---|---|
	| `advertiser_history` | Account age, spend history, violation record, verification status |
	| `landing_page` | Domain age, SSL, registrar, redirect chains, scam template similarity |
	| `payment_method` | Payment type, chargeback history, cross-account velocity |
	| `targeting_overlap` | Targeting fingerprint, audience overlap percentages |
	| `creative_similarity` | Template hash, image dimensions, scam template similarity score |
	| `campaign_structure` | Objective, bid strategy, budget/age ratio, placement distribution |

	`verdict` - render a final decision on an ad.

	```json
	{
	  "action_type": "verdict",
	  "ad_id": "ad_001",
	  "verdict": "reject",
	  "confidence": 0.9
	}
	```

	`verdict` options: `approve`, `reject`, `escalate`. `confidence`: 0.0-1.0.

	`link_accounts` - flag two ads as part of the same fraud network (Task 3).

	```json
	{
	  "action_type": "link_accounts",
	  "ad_id": "ad_003",
	  "linked_ad_id": "ad_007",
	  "link_reason": "shared payment ID pmt_ring_48231 and matching template hash"
	}
	```
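
For agents that emit raw JSON, a thin validation layer can catch malformed actions before they are sent. The helper names below are hypothetical and the checks only mirror the constraints stated in this section. The self-link check is an extra assumption, and the server may validate differently.

```python
import json

INVESTIGATION_TARGETS = {
    "advertiser_history", "landing_page", "payment_method",
    "targeting_overlap", "creative_similarity", "campaign_structure",
}
VERDICTS = {"approve", "reject", "escalate"}


def investigate_action(ad_id, target):
    """Build an `investigate` action, rejecting unknown targets."""
    if target not in INVESTIGATION_TARGETS:
        raise ValueError(f"unknown investigation target: {target!r}")
    return {"action_type": "investigate", "ad_id": ad_id,
            "investigation_target": target}


def verdict_action(ad_id, verdict, confidence):
    """Build a `verdict` action with confidence restricted to 0.0-1.0."""
    if verdict not in VERDICTS:
        raise ValueError(f"unknown verdict: {verdict!r}")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    return {"action_type": "verdict", "ad_id": ad_id,
            "verdict": verdict, "confidence": confidence}


def link_action(ad_id, linked_ad_id, reason):
    """Build a `link_accounts` action (Task 3)."""
    if ad_id == linked_ad_id:  # assumption: self-links are meaningless
        raise ValueError("an ad cannot be linked to itself")
    return {"action_type": "link_accounts", "ad_id": ad_id,
            "linked_ad_id": linked_ad_id, "link_reason": reason}


print(json.dumps(verdict_action("ad_001", "reject", 0.9)))
```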

	### Observation Space

	Observations are text-heavy by design so LLM agents can reason naturally:

	| Field | Type | Description |
	|---|---|---|
	| `queue_summary` | `str` | Task name, total/reviewed/pending counts, budget remaining |
	| `current_ad_info` | `str` | Ad copy, category, targeting, risk signals for the focused ad |
	| `investigation_findings` | `str` | Accumulated findings from all investigations so far |
	| `verdict_history_summary` | `str` | Verdicts rendered so far |
	| `feedback` | `str` | Natural language feedback on the last action |
	| `available_ads` | `list[str]` | Ad IDs still pending review |
	| `queue_status` | `dict` | Structured status for programmatic access |
	| `done` | `bool` | Whether the episode is complete |
	| `reward` | `float` | Step reward |

	## Reward Design

	| Action | Reward | Rationale |
	|---|---:|---|
	| Investigation | -0.02 | Simulates time/latency cost |
	| Correct rejection (fraud -> reject) | +0.30 to +0.40 | Scaled by fraud severity |
	| Correct approval (legit -> approve) | +0.10 | Revenue preserved |
	| Correct escalation | +0.15 | Appropriate caution |
	| False positive (legit -> reject) | -0.35 | Lost advertiser revenue |
	| False negative (fraud -> approve) | -0.50 | Worst outcome - fraud goes live |
	| Escalate (when wrong) | -0.05 | Human reviewer cost |
	| Correct network link | +0.40 | High-value coordinated fraud detection |
	| Incorrect network link | -0.25 | False accusation cost |

	Unreviewed ads are auto-approved at episode end - missed fraud incurs the full -0.50 false-negative penalty.
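
As a worked example of the table, the sketch below tallies a raw episode reward, including the auto-approval of unreviewed ads. Several details are assumptions, not documented behavior: severity is taken to scale the rejection reward linearly from +0.30 to +0.40, escalation is treated as correct only for gray-area ads, and outcomes the table does not list (such as approving a gray-area ad) contribute 0. Network-link rewards are left out.

```python
INVESTIGATION_COST = -0.02

# (ground truth, verdict) -> reward, copied from the table above.
# Assumption: escalation counts as "correct" only for gray-area ads.
VERDICT_REWARD = {
    ("legit", "approve"): 0.10,
    ("legit", "reject"): -0.35,
    ("legit", "escalate"): -0.05,
    ("fraud", "approve"): -0.50,
    ("fraud", "escalate"): -0.05,
    ("gray", "escalate"): 0.15,
}


def rejection_reward(severity):
    """Assumption: linear scale from +0.30 (severity 0) to +0.40 (severity 1)."""
    return 0.30 + 0.10 * severity


def episode_reward(truth, verdicts, investigations, severity=None):
    """Sum investigation costs and verdict rewards for one episode."""
    severity = severity or {}
    total = INVESTIGATION_COST * investigations
    for ad_id, label in truth.items():
        # Unreviewed ads are auto-approved at episode end.
        verdict = verdicts.get(ad_id, "approve")
        if (label, verdict) == ("fraud", "reject"):
            total += rejection_reward(severity.get(ad_id, 0.0))
        else:
            # Outcomes the table does not cover contribute 0 in this sketch.
            total += VERDICT_REWARD.get((label, verdict), 0.0)
    return round(total, 2)


truth = {"ad_001": "fraud", "ad_002": "legit", "ad_003": "fraud"}
verdicts = {"ad_001": "reject", "ad_002": "approve"}  # ad_003 never reviewed
print(episode_reward(truth, verdicts, investigations=4, severity={"ad_001": 1.0}))
# -0.08  (= -0.08 + 0.40 + 0.10 - 0.50)
```

The single missed fraud ad wipes out the gains from two correct verdicts, which is the asymmetry the reward table is designed to create.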

	## Grading & Scoring

	Each task has a dedicated grader that produces a normalized score between 0.0 and 1.0. The raw episode reward is normalized between the theoretical worst case (every decision wrong and the full budget wasted) and the best case (every decision correct with efficient budget use).
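
The normalization can be sketched as a clamped min-max rescaling. The bounds in the example are illustrative only: they are derived from the reward table for Task 1 under the assumptions that every fraud ad carries the maximum +0.40 rejection reward, that the best case uses no investigations, and that no bonuses apply. The actual graders may compute their bounds differently.

```python
def normalize_score(raw, worst, best):
    """Min-max normalize a raw episode reward into [0.0, 1.0]."""
    if best <= worst:
        raise ValueError("best must exceed worst")
    score = (raw - worst) / (best - worst)
    return max(0.0, min(1.0, score))  # clamp out-of-range rewards


# Illustrative Task 1 bounds (3 fraud ads, 2 legit ads, budget 25).
worst = 3 * -0.50 + 2 * -0.35 + 25 * -0.02  # all wrong, budget wasted: -2.70
best = 3 * 0.40 + 2 * 0.10                  # all right, no investigations: 1.40

print(round(normalize_score(0.0, worst, best), 3))  # 0.659
```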

	| Component | Task 1 | Task 2 | Task 3 |
	|---|:---:|:---:|:---:|
	| Verdict accuracy | Yes | Yes | Yes |
	| Budget efficiency bonus | Yes | Yes | Yes |
	| Calibration bonus | - | Yes | Yes |
	| Network detection (edge coverage) | - | - | Yes |
	| Investigation coverage bonus | - | - | Yes |

	Calibration bonus rewards agents whose stated confidence correlates with actual accuracy - high confidence on correct verdicts and low confidence on uncertain ones.
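
The exact calibration formula is not given here. One common way to measure the same property is the Brier score, so the sketch below uses `1 - Brier` as a stand-in. This is an assumption for illustration, not the grader's implementation, and `calibration_score` is a hypothetical name.

```python
def calibration_score(records):
    """Return 1 - Brier score for (confidence, was_correct) pairs.

    Higher is better: confident-and-right and unsure-and-wrong both
    score well, confident-and-wrong scores badly.
    """
    records = list(records)
    if not records:
        return 0.0
    brier = sum(
        (confidence - (1.0 if correct else 0.0)) ** 2
        for confidence, correct in records
    ) / len(records)
    return 1.0 - brier


well_calibrated = [(0.9, True), (0.8, True), (0.4, False)]
overconfident = [(0.9, True), (0.9, True), (0.9, False)]

print(round(calibration_score(well_calibrated), 3))  # 0.93
print(round(calibration_score(overconfident), 3))    # 0.723
```

Both agents make the same single mistake; the one that lowered its confidence on the wrong verdict scores higher.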

	Network detection uses edge coverage: what fraction of ground-truth fraud ring connections did the agent discover via `link_accounts`?
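
Edge coverage can be sketched as a set intersection over unordered ad pairs. Treating links as undirected (so that `ad_003 -> ad_007` and `ad_007 -> ad_003` count as the same edge) is an assumption, as are the example ring and the `edge_coverage` name.

```python
def edge_coverage(true_edges, linked_pairs):
    """Fraction of ground-truth ring edges found via link_accounts."""
    # frozenset makes each pair order-independent (assumed undirected).
    truth = {frozenset(edge) for edge in true_edges}
    if not truth:
        return 0.0
    found = {frozenset(pair) for pair in linked_pairs}
    return len(truth & found) / len(truth)


# Hypothetical chain-shaped ring: ad_003 - ad_007 - ad_012.
true_edges = [("ad_003", "ad_007"), ("ad_007", "ad_012")]
agent_links = [("ad_007", "ad_003"), ("ad_003", "ad_015")]

print(edge_coverage(true_edges, agent_links))  # 0.5
```

The incorrect `ad_003`/`ad_015` link does not lower coverage in this sketch; in the environment it is penalized separately through the step reward.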

	Coverage bonus rewards breadth over depth - agents that review more ads (rather than deep-diving a single one) score higher on Task 3.

	## Baseline Scores

	Generated with `seed=42` using `meta-llama/Llama-3.1-8B-Instruct`. Reproducible via `python inference.py`.

	| Task | Score | Steps | Verdicts |
	|---|---:|---:|---:|
	| Task 1 (Easy) | 0.953 | 10 | 5/5 |
	| Task 2 (Medium) | 0.882 | 23 | 12/12 |
	| Task 3 (Hard) | 0.415 | 35 | 20/20 |

	The sharp drop on Task 3 reflects the difficulty of cross-ad reasoning under a tight budget: the baseline agent investigates and renders verdicts well but struggles to detect coordinated fraud rings.

	## Project Structure

	```
	ad_fraud_env/
	+-- __init__.py  # Package exports
	+-- client.py  # WebSocket client (extends EnvClient)
	+-- models.py  # Action, Observation, State types
	+-- inference.py  # Baseline LLM agent with mandatory stdout logging
	+-- openenv.yaml  # OpenEnv manifest
	+-- pyproject.toml  # Dependencies and package config
	+-- Dockerfile  # Multi-stage Docker build
	+-- baseline_scores.json  # Cached baseline results
	+-- data/
	|   +-- ad_generator.py  # Episode generation, task configs, campaign profiles
	|   +-- advertiser_profiles.py  # Synthetic advertiser history
	|   +-- fraud_patterns.py  # Fraud + legit ad templates (easy/medium/hard)
	|   +-- landing_pages.py  # Simulated landing page investigation data
	|   +-- network_generator.py  # Fraud ring topologies via networkx
	+-- graders/
	|   +-- base_grader.py  # Shared normalization and reward logic
	|   +-- task1_grader.py  # Verdict accuracy only
	|   +-- task2_grader.py  # + calibration bonus
	|   +-- task3_grader.py  # + network detection + coverage bonus
	+-- server/
	|   +-- app.py  # FastAPI app with /tasks, /baseline, /grader endpoints
	|   +-- environment.py  # Core environment (reset/step/state)
	|   +-- investigate_ui.py  # HTML dashboard routes (/investigate, /web redirect)
	|   +-- static/
	|   |   +-- investigate_hq.html  # Interactive investigation dashboard
	|   +-- requirements.txt  # Server dependencies
	+-- tests/
	    +-- test_data_generation.py  # Determinism, cross-ref checks, decoy validation
	    +-- test_environment.py  # Step logic, state tracking, anti-exploit
	    +-- test_graders.py  # Score ranges, calibration, network scoring
	```

	## API Endpoints

	| Endpoint | Method | Description |
	|---|---|---|
	| `/health` | GET | Health check |
	| `/schema` | GET | Action/Observation JSON schemas |
	| `/ws` | WS | WebSocket for `step()` / `reset()` / `state()` |
	| `/tasks` | GET | Task list with configs and action schema |
	| `/baseline` | GET | Baseline scores (cached or live) |
	| `/grader` | GET | Last episode's grader result |
	| `/investigate` | GET | HTML investigation dashboard (also `/` redirects here) |

	## License

	BSD 3-Clause License