	---
	title: OpenEnv Creative Auctioneer
	colorFrom: purple
	colorTo: blue
	sdk: docker
	app_port: 7860
	tags:
	- openenv
	---

	# OpenEnv Creative Auctioneer

	A privacy-native real-time bidding (RTB) ad auction environment where an RL
	agent acts as an autonomous Account Manager — navigating a 24-hour campaign
	cycle, selecting ad creatives, pacing budgets, and assembling viral captions to
	maximise Return on Ad Spend (ROAS) — all without individual user identifiers.

	## Motivation

	Programmatic advertising is a $500B+ industry where split-second bidding
	decisions determine campaign success. Existing RL benchmarks either use toy
	grid-worlds or require proprietary data. OpenEnv-Auctioneer fills this gap
	with a fully open, dataset-calibrated simulation grounded in:

	| Dataset | Role |
	|---------|------|
	| [MIND](https://msnews.github.io/) (Microsoft News) | CTR calibration + headline catalog |
	| [iPinYou RTB](https://contest.ipinyou.com/) | Competitor bid distributions (Lognormal/hour) |
	| [Vogue Dialogue](https://github.com/aimagelab/Vogue-Dialogue) | User persona bank |
	| [MS-COCO Captions 2017](https://cocodataset.org/) | Ad + caption pool for `hard_assembly` |
	| [Google Trends](https://github.com/GeneralMills/pytrends) / [Reddit](https://www.reddit.com/) | Live viral hashtag scraping |

	All datasets are optional — the environment falls back to published
	statistics so it runs out-of-the-box with zero downloads.

	---

	## Action Space

	```python
	from pydantic import BaseModel


	class Action(BaseModel):
	    bid_price: float  # USD bid for the RTB auction (≥ 0)
	    headline_id: int  # Index into the 6-slot headlines catalog (0–5)
	    creative_id: int  # Index into the 6-slot creatives catalog (0–5)
	    generated_caption: str | None  # [hard_assembly] Rewritten caption with viral hashtags
	    generated_hashtags: list[str] | None  # [hard_assembly] Chosen hashtags (e.g. ["#QuietLuxury", "#OOTD"])
	```

	## Observation Space

	```python
	from pydantic import BaseModel


	class Observation(BaseModel):
	    hour_of_day: int  # Current hour (0–23)
	    remaining_budget: float  # Remaining budget in USD
	    spent_so_far: float  # Cumulative spend in USD
	    current_context: str  # "Fitness" | "Tech" | "Fashion" | "Gaming"
	    news_category: str  # Fine-grained MIND subcategory
	    viral_trend: str  # Current cultural trend token
	    market_pressure: float  # Auction competitiveness [0, 1]
	    ads_shown_this_session: int  # Ads shown so far in this session
	    fatigue_level: float  # User fatigue [0, 1]
	    carryover_boost: float  # Brand-recall CTR boost [0, 0.30]
	    last_ctr: float  # Previous step CTR
	    cumulative_revenue: float  # Total revenue earned

	    # hard_assembly only:
	    live_hashtags: list[str]  # Real-time scraped viral hashtags
	    image_description: str  # Source ad image description
	    base_caption: str  # Base caption to rewrite
	```

	## Reward Signal

	| Outcome | Reward |
	|---------|--------|
	| Auction won | `adjusted_ctr × $15 − clearing_price` |
	| Auction lost | `−$0.10` (missed opportunity) |
	| Over-pacing (`medium_pacing` only) | `−$1.00` penalty |
	| Assembly bonus (`hard_assembly` only) | `+composite_score × $8.00` |

	Rewards are issued at every step rather than only at the end of an episode,
	so the agent receives a dense learning signal.
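The reward table can be read as one per-step function. The sketch below is only an illustration of that reading, not the environment's actual code: the name `step_reward` and its parameters are hypothetical, and it assumes the pacing penalty and assembly bonus are simply added to the auction term.

```python
CONVERSION_VALUE = 15.0    # USD multiplier on adjusted CTR (from the table)
LOSS_PENALTY = 0.10        # USD charged when the auction is lost
OVERPACING_PENALTY = 1.00  # USD, medium_pacing only
ASSEMBLY_BONUS = 8.00      # USD multiplier on the hard_assembly composite score


def step_reward(won, adjusted_ctr=0.0, clearing_price=0.0,
                over_paced=False, composite_score=None):
    """Hypothetical per-step reward following the table above."""
    if won:
        reward = adjusted_ctr * CONVERSION_VALUE - clearing_price
    else:
        reward = -LOSS_PENALTY
    if over_paced:                   # assumed additive
        reward -= OVERPACING_PENALTY
    if composite_score is not None:  # assumed additive, hard_assembly only
        reward += composite_score * ASSEMBLY_BONUS
    return reward
```

For example, winning with `adjusted_ctr=0.2` at a clearing price of $1.00 gives `0.2 × 15 − 1.00`, a reward of about 2.0.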

	---

	## Tasks

	### Level 1 — `easy_headline` (Easy)
	Objective: Select the headline with the highest CTR for each context.
	Budget: $100 | Grader: `mean(CTR_selected / CTR_oracle)` | Target: 0.75

	### Level 2 — `medium_pacing` (Medium)
	Objective: Pace a $50 budget across 24 hours, keeping ≥ 20% of it in reserve when peak hours (18–22) begin.
	Budget: $50 | Grader: `0.3×smoothness + 0.3×peak_survival + 0.4×revenue` | Target: 0.70

	### Level 3 — `hard_assembly` (Hard)
	Objective: Given an ad image description + base caption + live viral hashtags,
	generate a new caption that is simultaneously viral, coherent with the image,
	and creatively novel — while also winning auctions profitably.

	Budget: $120 | Target: 0.65

	The RL loop (what the LLM agent does each step):
	```
	1. Agent receives: image_description, base_caption, live_hashtags[], viral_trend
	2. Agent must:
	   a. Select 2–4 relevant hashtags from live_hashtags (scraped from Google Trends / Reddit)
	   b. Rewrite the base caption to weave those hashtags into natural ad copy
	   c. Add its own creative words (target 30–50% novel vocabulary)
	   d. Keep the caption coherent with the source image
	   e. Set a profitable bid price
	3. Grader scores the assembled caption on 4 axes:
	   • 35%: Hashtag relevance (cosine_sim of each hashtag vs viral_trend)
	   • 35%: Caption-trend alignment (cosine_sim of caption vs viral_trend)
	   • 20%: Caption-image coherence (cosine_sim of caption vs image_description)
	   • 10%: Novelty (fraction of new words vs base_caption, target ~40%)
	4. Reward = auction_reward + composite_score × $8.00 bonus
	```

	Data sources for hard_assembly:
	- Ad creatives: MS-COCO Captions 2017 (val annotations) bucketed into Fitness/Tech/Fashion/Gaming by keyword matching. Falls back to a 30-entry built-in seed pool.
	- Viral hashtags: `ViralHashtagScraper` queries Google Trends (via `pytrends`) and Reddit `/r/popular/hot.json` (public, no auth). Blends with static seed hashtags per context and trend. Cached for 1 hour.
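The one-hour caching mentioned above can be pictured as a small time-to-live cache in front of the scraper. This is a minimal sketch under assumptions, not the `ViralHashtagScraper` implementation: the class name `TTLCache`, the method `get_or_fetch`, and the key layout are all hypothetical.

```python
import time


class TTLCache:
    """Tiny time-to-live cache: entries expire ttl_seconds after being stored."""

    def __init__(self, ttl_seconds=3600.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._store = {}     # key -> (stored_at, value)

    def get_or_fetch(self, key, fetch):
        """Return the cached value for key, calling fetch() only when it is stale."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = fetch()
        self._store[key] = (now, value)
        return value


# Hypothetical usage: key by (context, trend); the lambda stands in for a live scrape.
cache = TTLCache()
tags = cache.get_or_fetch(("Fashion", "QuietLuxury"), lambda: ["#QuietLuxury", "#OOTD"])
```

Keying on context and trend keeps repeated steps within the same hour from hitting Google Trends or Reddit again.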

	### Level 4 — `hard_sequencing` (Hard)
	Objective: Plan 24-hour ad placements with carry-over brand-recall boosts.
	Winning an auction boosts CTR by +15%, +10%, and +5% over the next 3 hours.
	Winning in ≥ 3 distinct contexts earns a 20% diversity bonus.
	Budget: $100 | Grader: `min(1.0, agent_conversions / oracle_conversions × diversity_mult)` | Target: 0.60
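To make the carry-over rule concrete, the sketch below computes the boost in effect at each hour from a list of winning hours. It is an illustration only: `carryover_boosts` is a hypothetical helper, and it assumes that overlapping boosts add up and are capped at 0.30, the upper bound of `carryover_boost` in the observation.

```python
BOOST_SCHEDULE = (0.15, 0.10, 0.05)  # boost applied 1, 2 and 3 hours after a win
MAX_BOOST = 0.30                     # upper bound of carryover_boost in Observation


def carryover_boosts(win_hours, horizon=24):
    """Return the CTR boost in effect at each hour of the campaign."""
    boosts = [0.0] * horizon
    for hour in win_hours:
        for offset, boost in enumerate(BOOST_SCHEDULE, start=1):
            if hour + offset < horizon:
                boosts[hour + offset] += boost  # assumed: overlapping boosts stack
    return [min(MAX_BOOST, b) for b in boosts]
```

Under these assumptions, a single win at hour 0 boosts hours 1 to 3 by 0.15, 0.10 and 0.05, while three consecutive wins at hours 0, 1 and 2 bring hour 3 to the 0.30 cap.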

	---

	## Grading Details

	### `EasyHeadlineGrader`
	```
	step_score = CTR_selected / CTR_oracle
	final_score = mean(step_scores)   # in [0.0, 1.0]
	```

	### `MediumPacingGrader`
	```
	smoothness = 1 − mean(|hourly_spend − ideal_spend| / ideal_spend)
	peak_survival = 1.0 if remaining_budget ≥ 20% at hour 18, else 0.0
	revenue_factor = min(1.0, total_revenue / $30)

	final_score = 0.30 × smoothness + 0.30 × peak_survival + 0.40 × revenue_factor
	```
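The pacing formula translates almost directly into Python. The sketch below is a hedged reading rather than the actual `MediumPacingGrader`: it assumes `ideal_spend` is a uniform `budget / 24`, that "remaining budget at hour 18" means the budget left before hour 18 is spent, and that smoothness is clamped at 0 (the source formula does not say what happens when it goes negative).

```python
def medium_pacing_score(hourly_spend, budget, total_revenue,
                        peak_hour=18, reserve=0.20, revenue_target=30.0):
    """Hypothetical re-implementation of the medium_pacing grading formula."""
    ideal_spend = budget / len(hourly_spend)  # assumed: uniform pacing target
    deviations = [abs(spend - ideal_spend) / ideal_spend for spend in hourly_spend]
    smoothness = max(0.0, 1.0 - sum(deviations) / len(deviations))  # clamp is an assumption

    remaining_at_peak = budget - sum(hourly_spend[:peak_hour])
    peak_survival = 1.0 if remaining_at_peak >= reserve * budget else 0.0

    revenue_factor = min(1.0, total_revenue / revenue_target)
    return 0.30 * smoothness + 0.30 * peak_survival + 0.40 * revenue_factor
```

Spending the $50 budget evenly and earning at least $30 scores close to 1.0 under this reading, while spending everything in hour 0 with no revenue scores 0.0.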

	### `HardAssemblyGrader` — 4-Axis Composite

	| Axis | Weight | Metric |
	|------|--------|--------|
	| Hashtag Relevance | 0.35 | `mean(cosine_sim(hashtag, viral_trend))` |
	| Caption-Trend Alignment | 0.35 | `cosine_sim(caption, viral_trend)` |
	| Caption-Image Coherence | 0.20 | `cosine_sim(caption, image_description)` |
	| Novelty | 0.10 | `1 − |novel_fraction − 0.40| / 0.60` |

	```
	composite = Σ (weight × axis_score)

	final_score = 0.60 × mean(composite_scores)
	            + 0.40 × min(1.0, total_revenue / $55)
	```
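The weighting above can be sketched without the embedding model by treating the three cosine similarities as already computed inputs. This is an illustrative reading, not the `HardAssemblyGrader` itself: the function names and dictionary keys are hypothetical, and the novelty tokenisation (lowercased whitespace split) is an assumption.

```python
WEIGHTS = {
    "hashtag_relevance": 0.35,
    "trend_alignment": 0.35,
    "image_coherence": 0.20,
    "novelty": 0.10,
}


def novelty_score(caption, base_caption, target=0.40):
    """1 - |novel_fraction - 0.40| / 0.60, with an assumed whitespace tokeniser."""
    base_words = set(base_caption.lower().split())
    words = caption.lower().split()
    if not words:
        return 0.0
    novel_fraction = sum(word not in base_words for word in words) / len(words)
    return 1.0 - abs(novel_fraction - target) / 0.60


def composite_score(axis_scores):
    """Weighted sum over the four axes; axis_scores maps axis name to a score."""
    return sum(WEIGHTS[axis] * axis_scores[axis] for axis in WEIGHTS)


def hard_assembly_final(composite_scores, total_revenue, revenue_target=55.0):
    """0.60 x mean composite + 0.40 x capped revenue ratio."""
    mean_composite = sum(composite_scores) / len(composite_scores)
    return 0.60 * mean_composite + 0.40 * min(1.0, total_revenue / revenue_target)
```

A caption in which exactly 40% of the words are new relative to the base caption gets the maximum novelty score of 1.0, and one made entirely of new words gets 0.0.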

	### `HardSequencingGrader`
	```
	agent_conversions = Σ [CTR_t × (1 + carryover_boost_t) × $15]
	oracle_conversions = DP-optimal bid/skip sequence with carry-over

	diversity_mult = 1.20 if ≥3 distinct contexts won, else 1.0

	final_score = min(1.0, agent_conversions / oracle_conversions × diversity_mult)
	```
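Leaving the DP oracle aside, the rest of the sequencing score is short enough to sketch. The code below is an assumption-laden illustration, not the `HardSequencingGrader`: the function names are hypothetical, the oracle value is taken as a given input, and a non-positive oracle value is mapped to a score of 0.0 purely to avoid division by zero.

```python
CONVERSION_VALUE = 15.0  # USD multiplier used in the formula above


def agent_conversions(ctrs, carryover_boosts):
    """Sum of CTR_t x (1 + carryover_boost_t) x $15 over the auctions the agent won."""
    return sum(ctr * (1.0 + boost) * CONVERSION_VALUE
               for ctr, boost in zip(ctrs, carryover_boosts))


def hard_sequencing_score(agent_conv, oracle_conv, contexts_won):
    """Ratio to the oracle, scaled by the diversity multiplier and capped at 1.0."""
    diversity_mult = 1.20 if len(set(contexts_won)) >= 3 else 1.0
    if oracle_conv <= 0:  # guard added for this sketch only
        return 0.0
    return min(1.0, agent_conv / oracle_conv * diversity_mult)
```

An agent that reaches half of the oracle's conversions scores 0.5 with wins in fewer than three contexts and about 0.6 once it has won in three.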

	---

	## Architecture

	```
	┌─────────────────────────────────────────────────────────────┐
	│ OpenEnvAuctioneer (Gym-style environment)                   │
	│                                                             │
	│ ┌────────────────────┐  ┌─────────────────────────────────┐ │
	│ │ Market Engine      │  │ User Simulator                  │ │
	│ │ (Statistical)      │  │ (Semantic / LLM)                │ │
	│ │                    │  │                                 │ │
	│ │ iPinYou RTB logs   │  │ SentenceTransformer             │ │
	│ │ → Lognormal per    │  │ all-MiniLM-L6-v2                │ │
	│ │   hour bucket      │  │ + optional Llama-3-8B           │ │
	│ └────────────────────┘  └─────────────────────────────────┘ │
	│                                                             │
	│ ┌─────────────────────────────────────────────────────────┐ │
	│ │ MIND Dataset Layer (Microsoft News Dataset)             │ │
	│ │   behaviours.tsv → CTRCalibrator                        │ │
	│ │   news.tsv       → MINDCreativePool (headlines)         │ │
	│ └─────────────────────────────────────────────────────────┘ │
	│                                                             │
	│ ┌─────────────────────────────────────────────────────────┐ │
	│ │ Ad + Caption Dataset (MS-COCO Captions 2017)            │ │
	│ │   → image_description + base_caption per step           │ │
	│ │   → ViralHashtagScraper (pytrends + Reddit + seeds)     │ │
	│ │   → agent rewrites caption with viral hashtags          │ │
	│ └─────────────────────────────────────────────────────────┘ │
	│                                                             │
	│ ┌─────────────────────────────────────────────────────────┐ │
	│ │ Grader (task-specific, deterministic 0.0–1.0)           │ │
	│ │   Level 1: easy_headline   → headline CTR lookup        │ │
	│ │   Level 2: medium_pacing   → pacing + survival          │ │
	│ │   Level 3: hard_assembly   → 4-axis composite score     │ │
	│ │   Level 4: hard_sequencing → DP oracle comparison       │ │
	│ └─────────────────────────────────────────────────────────┘ │
	└─────────────────────────────────────────────────────────────┘
	```
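The Market Engine box describes competitor bids drawn from a lognormal distribution per hour bucket. The sketch below shows one way such an auction step could look. It is not the `MarketCalibrator` code: the `HOURLY_PARAMS` values are placeholders rather than fitted iPinYou statistics, `run_auction` is a hypothetical name, and the second-price rule (the winner pays the competing bid) is an assumption, since the README does not say how `clearing_price` is set.

```python
import random

# Placeholder (mu, sigma) per hour; real values would be fitted from iPinYou logs.
HOURLY_PARAMS = {hour: (0.0, 0.5) for hour in range(24)}


def run_auction(bid_price, hour, rng):
    """Sample one competitor bid for the hour and resolve the auction.

    Returns (won, clearing_price). Assumes a second-price rule.
    """
    mu, sigma = HOURLY_PARAMS[hour]
    competitor_bid = rng.lognormvariate(mu, sigma)
    won = bid_price > competitor_bid
    clearing_price = competitor_bid if won else 0.0
    return won, clearing_price


rng = random.Random(42)  # seeded for reproducible episodes
won, clearing_price = run_auction(bid_price=1.20, hour=18, rng=rng)
```

Passing the random generator in explicitly keeps episodes reproducible, which helps when comparing agents against the same market draws.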

	---

	## Models

	| Model | Role | Always Active? |
	|-------|------|----------------|
	| `all-MiniLM-L6-v2` (SentenceTransformer) | Semantic CTR scoring + grader cosine similarity | ✅ Yes |
	| `Meta-Llama-3-8B-Instruct` (4-bit) | Richer LLM-based CTR scoring | ❌ Optional (`USE_LLM_SIMULATOR=1`) |

	When the LLM simulator is active: `final_ctr = 0.60 × llm_ctr + 0.40 × semantic_ctr`

	---

	## Setup & Usage

	### Prerequisites
	- Python 3.10+
	- Docker (for containerised execution)

	### Local Development

	```bash
	pip install -r requirements.txt
	python -c "from environment import OpenEnvAuctioneer; e = OpenEnvAuctioneer(); print(e.reset())"
	```

	### Docker Build & Run

	```bash
	# Build the image
	docker build -t openenv-auctioneer .

	# Run the FastAPI server (default)
	docker run --rm -p 7860:7860 openenv-auctioneer

	# Run inference directly inside the container
	docker run --rm \
	  -e HF_TOKEN=<your_key> \
	  openenv-auctioneer python inference.py
	```

	### Inference Script

	```bash
	# Build image first, then run inference
	docker build -t openenv-auctioneer .

	LOCAL_IMAGE_NAME=openenv-auctioneer \
	HF_TOKEN=<your_key> \
	python inference.py
	```

	The inference script emits standardised `[START]`/`[STEP]`/`[END]` logs to stdout.

	### Environment Variables

	| Variable | Required | Description |
	|----------|----------|-------------|
	| `HF_TOKEN` | Yes (inference) | API key for the LLM service |
	| `API_BASE_URL` | No | LLM endpoint (default: HuggingFace router) |
	| `MODEL_NAME` | No | Model identifier (default: Qwen/Qwen2.5-72B-Instruct) |
	| `LOCAL_IMAGE_NAME` | Yes (inference) | Docker image name |
	| `AUCTIONEER_TASK` | No | Task to run (default: `all`) |
	| `MIND_SOURCE` | No | `local` / `huggingface` / `azure` |
	| `COCO_SOURCE` | No | `local` / `url` (auto-download COCO annotations) |
	| `USE_LLM_SIMULATOR` | No | Set `1` to enable Llama-3 User Simulator |
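As a rough picture of how these variables might be consumed, the sketch below reads a few of them with the defaults listed in the table. It is not taken from `inference.py`: `load_config` and the returned keys are hypothetical, and `API_BASE_URL` is left as `None` when unset because the table does not give the default URL.

```python
import os


def load_config(env=None):
    """Read selected settings from the environment, using the table's defaults."""
    env = os.environ if env is None else env
    return {
        "hf_token": env.get("HF_TOKEN"),          # required for inference
        "api_base_url": env.get("API_BASE_URL"),  # None means use the script's own default
        "model_name": env.get("MODEL_NAME", "Qwen/Qwen2.5-72B-Instruct"),
        "task": env.get("AUCTIONEER_TASK", "all"),
        "use_llm_simulator": env.get("USE_LLM_SIMULATOR") == "1",
    }


config = load_config({"AUCTIONEER_TASK": "hard_assembly", "USE_LLM_SIMULATOR": "1"})
```

Accepting a plain mapping instead of always reading `os.environ` makes the function easy to exercise in tests.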

	---

	## Baseline Scores (Expected Ranges)

	| Task | Expected Range | Notes |
	|------|---------------|-------|
	| `easy_headline` | 0.55 – 0.80 | Context→headline matching is learnable |
	| `medium_pacing` | 0.45 – 0.70 | Requires budget discipline |
	| `hard_assembly` | 0.40 – 0.65 | Caption quality + hashtag matching + auction wins |
	| `hard_sequencing` | 0.35 – 0.60 | Compared against DP oracle |

	Scores depend on LLM quality and market stochasticity. Run multiple episodes
	for stable estimates.
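One simple way to report a stable estimate is the mean and sample standard deviation over several episode scores. The helper below is a generic sketch using only the standard library; `summarise` is a hypothetical name and the scores in the usage line are made-up placeholders, not measured results.

```python
from statistics import mean, stdev


def summarise(scores):
    """Return episode count, mean and sample standard deviation of final scores."""
    spread = stdev(scores) if len(scores) > 1 else 0.0  # stdev needs >= 2 points
    return {"episodes": len(scores), "mean": mean(scores), "stdev": spread}


summary = summarise([0.52, 0.61, 0.58])  # placeholder scores for illustration
```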

	---

	## Project Structure

	```
	├── models.py                 # Pydantic models: Action, Observation, Reward, Info
	├── environment.py            # OpenEnvAuctioneer + graders + dataset layers
	│   ├── MINDLoader            # MIND dataset loader (HF / Azure / local)
	│   ├── MarketCalibrator      # iPinYou-based auction price simulator
	│   ├── CTRCalibrator         # MIND-based CTR lookup tables
	│   ├── MINDCreativePool      # 6-slot headline/creative catalog from news.tsv
	│   ├── PersonaBank           # Vogue Dialogue persona sampling
	│   ├── ViralHashtagScraper   # Live hashtag scraping (pytrends + Reddit)
	│   ├── AdCaptionDataset      # COCO-based ad image+caption pool
	│   ├── UserSimulator         # Semantic + optional LLM CTR scoring
	│   ├── EasyHeadlineGrader    # Level 1 grader
	│   ├── MediumPacingGrader    # Level 2 grader
	│   ├── HardAssemblyGrader    # Level 3 grader (4-axis composite)
	│   ├── HardSequencingGrader  # Level 4 grader (DP oracle)
	│   └── OpenEnvAuctioneer     # Main Gym-style env class
	├── app.py                    # FastAPI server (runs inside Docker)
	├── inference.py              # Baseline inference script (mandatory format)
	├── openenv.yaml              # OpenEnv metadata & task definitions
	├── Dockerfile                # Container build
	├── requirements.txt          # Python dependencies
	├── test_sequencing.py        # Unit tests for DP oracle grader
	└── Datasets/                 # Optional dataset mount point
	```

	## References

	1. MIND: Wu et al. (2020) — "MIND: A Large-scale Dataset for News Recommendation", ACL 2020. [msnews.github.io](https://msnews.github.io/)
	2. iPinYou RTB: Zhang et al. (2014) — "Real-Time Bidding Benchmarking with iPinYou Dataset". [contest.ipinyou.com](https://contest.ipinyou.com/)
	3. MS-COCO Captions: Lin et al. (2014) — "Microsoft COCO: Common Objects in Context". [cocodataset.org](https://cocodataset.org/)
	4. SentenceTransformers: Reimers & Gurevych (2019) — "Sentence-BERT". [sbert.net](https://www.sbert.net/)

	## License

	MIT