metadata

title: OpenEnv Creative Auctioneer
colorFrom: purple
colorTo: blue
sdk: docker
app_port: 7860
tags:
  - openenv

OpenEnv Creative Auctioneer

A privacy-native real-time bidding (RTB) ad auction environment where an RL agent acts as an autonomous Account Manager — navigating a 24-hour campaign cycle, selecting ad creatives, pacing budgets, and assembling viral captions to maximise Return on Ad Spend (ROAS) — all without individual user identifiers.

Motivation

Programmatic advertising is a $500B+ industry in which split-second bidding decisions determine campaign success. Existing RL benchmarks either use toy grid-worlds or require proprietary data. OpenEnv-Auctioneer fills this gap with a fully open, dataset-calibrated simulation grounded in:

Dataset	Role
MIND (Microsoft News)	CTR calibration + headline catalog
iPinYou RTB	Competitor bid distributions (Lognormal/hour)
Vogue Dialogue	User persona bank
MS-COCO Captions 2017	Ad + caption pool for `hard_assembly`
Google Trends / Reddit	Live viral hashtag scraping

All datasets are optional — the environment falls back to published statistics so it runs out-of-the-box with zero downloads.
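As a rough illustration of the "Lognormal/hour" competitor model in the table above, the sketch below draws one competitor bid per auction from an hour-specific lognormal distribution. The parameter values, the function names, and the rule that the agent wins when its bid exceeds the sampled bid are all assumptions made for this example; they are not taken from the repository or fitted to iPinYou.

```python
import random

# Hypothetical (mu, sigma) of the log-bid for each hour of the day.
# Placeholder values for illustration only, not fitted to iPinYou logs.
HOURLY_PARAMS = {hour: (-1.0, 0.5) for hour in range(24)}
HOURLY_PARAMS.update({hour: (-0.6, 0.5) for hour in range(18, 23)})  # pricier peak hours


def sample_competitor_bid(hour: int, rng: random.Random) -> float:
    """Draw one competitor bid in USD for the given hour of day."""
    mu, sigma = HOURLY_PARAMS[hour % 24]
    return rng.lognormvariate(mu, sigma)


def wins_auction(bid_price: float, hour: int, rng: random.Random) -> tuple:
    """Return (won, competitor_bid). Assumed rule: the higher bid wins."""
    competitor_bid = sample_competitor_bid(hour, rng)
    return bid_price > competitor_bid, competitor_bid
```

Passing an explicit `random.Random` instance keeps episodes reproducible when a seed is fixed.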

Action Space

class Action(BaseModel):
    bid_price: float          # USD bid for the RTB auction (≥ 0)
    headline_id: int          # Index into the 6-slot headlines catalog (0–5)
    creative_id: int          # Index into the 6-slot creatives catalog (0–5)
    generated_caption: str | None    # [hard_assembly] Rewritten caption with viral hashtags
    generated_hashtags: list[str] | None  # [hard_assembly] Chosen hashtags (e.g. ["#QuietLuxury", "#OOTD"])

Observation Space

class Observation(BaseModel):
    hour_of_day: int          # Current hour (0–23)
    remaining_budget: float   # Remaining budget in USD
    spent_so_far: float       # Cumulative spend
    current_context: str      # "Fitness" | "Tech" | "Fashion" | "Gaming"
    news_category: str        # Fine-grained MIND subcategory
    viral_trend: str          # Current cultural trend token
    market_pressure: float    # Auction competitiveness [0, 1]
    ads_shown_this_session: int
    fatigue_level: float      # User fatigue [0, 1]
    carryover_boost: float    # Brand-recall CTR boost [0, 0.30]
    last_ctr: float           # Previous step CTR
    cumulative_revenue: float # Total revenue earned

    # hard_assembly only:
    live_hashtags: list[str]      # Real-time scraped viral hashtags
    image_description: str        # Source ad image description
    base_caption: str             # Base caption to rewrite

Reward Signal

Outcome	Reward
Auction won	`adjusted_ctr × $15 − clearing_price`
Auction lost	`−$0.10` (missed opportunity)
Over-pacing (`medium_pacing` only)	`−$1.00` penalty
Assembly bonus (hard_assembly)	`+composite_score × $8.00`

Rewards are emitted at every step rather than only at the end of an episode, giving the agent a dense learning signal.
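The reward table can be read as a single per-step function. The sketch below is one plausible reading, assuming the penalty and bonus terms are simply added to the auction outcome and that the assembly bonus applies whether or not the auction was won; the function and constant names are hypothetical.

```python
from typing import Optional

VALUE_PER_CLICK = 15.00       # USD multiplier on adjusted_ctr
LOST_AUCTION_PENALTY = 0.10   # missed-opportunity cost
OVER_PACING_PENALTY = 1.00    # medium_pacing only
ASSEMBLY_BONUS_SCALE = 8.00   # hard_assembly only


def step_reward(
    won: bool,
    adjusted_ctr: float = 0.0,
    clearing_price: float = 0.0,
    over_paced: bool = False,
    composite_score: Optional[float] = None,
) -> float:
    """Combine the table rows into one reward (additive combination is assumed)."""
    if won:
        reward = adjusted_ctr * VALUE_PER_CLICK - clearing_price
    else:
        reward = -LOST_AUCTION_PENALTY
    if over_paced:
        reward -= OVER_PACING_PENALTY
    if composite_score is not None:
        reward += composite_score * ASSEMBLY_BONUS_SCALE
    return reward
```

For example, winning at a clearing price of $0.50 with an adjusted CTR of 0.10 yields 0.10 × 15 − 0.50 = $1.00.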

Tasks

Level 1 — `easy_headline` (Easy)

Objective: Select the headline with the highest CTR for each context.

Budget: $100 | Grader: mean(CTR_selected / CTR_oracle) | Target: 0.75

Level 2 — `medium_pacing` (Medium)

Objective: Pace $50 across 24 hours; retain ≥ 20% of the budget for peak hours (18–22).

Budget: $50 | Grader: 0.3×smoothness + 0.3×peak_survival + 0.4×revenue | Target: 0.70

Level 3 — `hard_assembly` (Hard)

Objective: Given an ad image description + base caption + live viral hashtags, generate a new caption that is simultaneously viral, coherent with the image, and creatively novel — while also winning auctions profitably.

Budget: $120 | Target: 0.65

The RL loop (what the LLM agent does each step):

1. Agent receives: image_description, base_caption, live_hashtags[], viral_trend
2. Agent must:
   a. Select 2–4 relevant hashtags from live_hashtags (scraped from Google Trends / Reddit)
   b. Rewrite the base caption to weave those hashtags into natural ad copy
   c. Add its own creative words (target 30–50% novel vocabulary)
   d. Keep the caption coherent with the source image
   e. Set a profitable bid price
3. Grader scores the assembled caption on 4 axes:
   • 35% — Hashtag relevance  (cosine_sim of each hashtag vs viral_trend)
   • 35% — Caption-trend alignment  (cosine_sim of caption vs viral_trend)
   • 20% — Caption-image coherence  (cosine_sim of caption vs image_description)
   • 10% — Novelty  (fraction of new words vs base_caption, target ~40%)
4. Reward = auction_reward + composite_score × $8.00 bonus
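To make the grading axes in step 3 concrete, the sketch below computes a novelty ratio (the fraction of caption words that do not appear in the base caption) and a cosine similarity. The environment scores similarity with SentenceTransformer embeddings; the bag-of-words cosine here is a dependency-free stand-in chosen for this example, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokens(text: str) -> list:
    """Lower-case word and hashtag tokens."""
    return re.findall(r"[a-z0-9#']+", text.lower())


def bow_cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity (stand-in for embedding cosine_sim)."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def novelty_ratio(caption: str, base_caption: str) -> float:
    """Fraction of caption tokens that are absent from the base caption."""
    base = set(tokens(base_caption))
    words = tokens(caption)
    if not words:
        return 0.0
    return sum(w not in base for w in words) / len(words)
```

A rewrite such as "quiet luxury run club #OOTD" against the base caption "run club" has three new tokens out of five, a novelty ratio of 0.6, which is above the 30–50% target band.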

Data sources for hard_assembly:

Ad creatives: MS-COCO Captions 2017 (val annotations), bucketed into Fitness/Tech/Fashion/Gaming by keyword matching. Falls back to a 30-entry built-in seed pool.
Viral hashtags: ViralHashtagScraper queries Google Trends (via pytrends) and Reddit /r/popular/hot.json (public, no auth), blends the results with static seed hashtags per context and trend, and caches them for 1 hour.
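The one-hour caching behaviour can be pictured as a small time-to-live cache wrapped around the scraping call. This is a minimal sketch under assumed names (`TTLCache`, `get_or_fetch`); it does not reproduce how ViralHashtagScraper is actually implemented.

```python
import time
from typing import Callable, Dict, List, Tuple


class TTLCache:
    """Cache fetched hashtag lists per key and refetch once they expire."""

    def __init__(self, ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._entries: Dict[str, Tuple[float, List[str]]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[], List[str]]) -> List[str]:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh, skip the network call
        value = fetch()
        self._entries[key] = (now, value)
        return value
```

Keying the cache by context (for example "Fashion") would let each context refresh independently.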

Level 4 — `hard_sequencing` (Hard)

Objective: Plan 24-hour ad placements with carry-over brand-recall boosts. Winning triggers +15%/+10%/+5% CTR for the next 3 hours. Cover ≥ 3 contexts for a 20% diversity bonus.

Budget: $100 | Grader: min(1.0, agent_conv/oracle_conv × diversity_mult) | Target: 0.60

Grading Details

`EasyHeadlineGrader`

step_score  = CTR_selected / CTR_oracle
final_score = mean(step_scores)                         // [0.0, 1.0]
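A minimal sketch of this grader, assuming the selected and oracle CTRs are available as parallel lists and that a step with a zero oracle CTR scores 0 (the function name and the zero-handling rule are assumptions):

```python
def easy_headline_score(ctr_selected: list, ctr_oracle: list) -> float:
    """Mean of per-step CTR_selected / CTR_oracle ratios."""
    if len(ctr_selected) != len(ctr_oracle):
        raise ValueError("ctr_selected and ctr_oracle must have equal length")
    ratios = [
        selected / oracle if oracle > 0 else 0.0  # assumed guard for oracle == 0
        for selected, oracle in zip(ctr_selected, ctr_oracle)
    ]
    return sum(ratios) / len(ratios) if ratios else 0.0
```

Choosing a headline with half the oracle CTR in one step and the oracle headline in the next gives (0.5 + 1.0) / 2 = 0.75, exactly the task target.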

`MediumPacingGrader`

smoothness     = 1 − mean(|hourly_spend − ideal_spend| / ideal_spend)
peak_survival  = 1.0 if remaining_budget ≥ 20% of the initial budget at hour 18, else 0.0
revenue_factor = min(1.0, total_revenue / $30)

final_score = 0.30 × smoothness + 0.30 × peak_survival + 0.40 × revenue_factor
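The sketch below evaluates these formulas for a list of hourly spends. It assumes that `ideal_spend` is the budget divided evenly over the episode, that smoothness is clamped at 0, and that "remaining budget at hour 18" means the budget minus everything spent in hours 0–17; none of these details are confirmed by the formulas above.

```python
def medium_pacing_score(
    hourly_spend: list,
    total_revenue: float,
    budget: float = 50.0,
    peak_start_hour: int = 18,
    revenue_target: float = 30.0,
) -> float:
    """Weighted pacing score: 0.30 smoothness + 0.30 survival + 0.40 revenue."""
    ideal_spend = budget / len(hourly_spend)  # assumed: uniform ideal pacing
    deviation = sum(
        abs(spend - ideal_spend) / ideal_spend for spend in hourly_spend
    ) / len(hourly_spend)
    smoothness = max(0.0, 1.0 - deviation)    # assumed clamp at zero

    remaining_at_peak = budget - sum(hourly_spend[:peak_start_hour])
    peak_survival = 1.0 if remaining_at_peak >= 0.20 * budget else 0.0

    revenue_factor = min(1.0, total_revenue / revenue_target)
    return 0.30 * smoothness + 0.30 * peak_survival + 0.40 * revenue_factor
```

Spending the whole $50 in the first hour zeroes both smoothness and peak survival, so the score is capped at the 0.40 revenue term.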

`HardAssemblyGrader` — 4-Axis Composite

Axis	Weight	Metric
Hashtag Relevance	0.35	`mean(cosine_sim(hashtag, viral_trend))`
Caption-Trend Alignment	0.35	`cosine_sim(caption, viral_trend)`
Caption-Image Coherence	0.20	`cosine_sim(caption, image_description)`
Novelty	0.10	`1 −

composite = Σ (weight × axis_score)

final_score = 0.60 × mean(composite_scores)
            + 0.40 × min(1.0, total_revenue / $55)
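Given per-axis scores that have already been computed, the composite and final score reduce to two weighted sums. The sketch below takes the axis scores as inputs rather than deriving them, and its dictionary keys and function names are hypothetical:

```python
AXIS_WEIGHTS = {
    "hashtag_relevance": 0.35,
    "caption_trend_alignment": 0.35,
    "caption_image_coherence": 0.20,
    "novelty": 0.10,
}


def composite_score(axis_scores: dict) -> float:
    """Weighted sum over the four axes; each axis score is expected in [0, 1]."""
    return sum(weight * axis_scores[axis] for axis, weight in AXIS_WEIGHTS.items())


def hard_assembly_final_score(composite_scores: list, total_revenue: float) -> float:
    """0.60 x mean composite + 0.40 x revenue factor capped at $55."""
    mean_composite = sum(composite_scores) / len(composite_scores)
    revenue_factor = min(1.0, total_revenue / 55.0)
    return 0.60 * mean_composite + 0.40 * revenue_factor
```

With composites of 1.0 and 0.5 and $55 of revenue, the final score is 0.60 × 0.75 + 0.40 = 0.85.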

`HardSequencingGrader`

agent_conv  = Σ [CTR_t × (1 + carryover_boost_t) × $15]
oracle_conv = same sum for the DP-optimal bid/skip sequence with carry-over

diversity_mult = 1.20 if ≥3 distinct contexts won, else 1.0

final_score = min(1.0, agent_conv / oracle_conv × diversity_mult)
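The carry-over mechanics and the final score can be sketched as follows. The stacking rule is an assumption: boosts from overlapping wins are added and capped at 0.30, which matches the documented range of `carryover_boost` but is not stated explicitly. The DP oracle itself is not reproduced here; its value is taken as an input.

```python
BOOST_SCHEDULE = (0.15, 0.10, 0.05)  # CTR boost 1, 2 and 3 hours after a win
MAX_BOOST = 0.30                     # upper bound of carryover_boost


def carryover_boosts(wins: list) -> list:
    """Boost in effect at each hour, given which hours were won."""
    boosts = [0.0] * len(wins)
    for hour, won in enumerate(wins):
        if not won:
            continue
        for offset, boost in enumerate(BOOST_SCHEDULE, start=1):
            if hour + offset < len(wins):
                # Assumed: overlapping boosts add up, capped at MAX_BOOST.
                boosts[hour + offset] = min(MAX_BOOST, boosts[hour + offset] + boost)
    return boosts


def sequencing_score(agent_conv: float, oracle_conv: float, contexts_won: list) -> float:
    """min(1.0, agent_conv / oracle_conv x diversity_mult)."""
    if oracle_conv <= 0:
        return 0.0
    diversity_mult = 1.20 if len(set(contexts_won)) >= 3 else 1.0
    return min(1.0, agent_conv / oracle_conv * diversity_mult)
```

An agent reaching 40% of the oracle's value while winning in three distinct contexts scores 0.40 × 1.20 = 0.48.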

Architecture

┌───────────────────────────────────────────────────────────┐
│  OpenEnvAuctioneer (Gym-style environment)                │
│                                                           │
│  ┌──────────────────┐   ┌───────────────────────────────┐ │
│  │  Market Engine    │   │   User Simulator              │ │
│  │  (Statistical)    │   │   (Semantic / LLM)            │ │
│  │                   │   │                               │ │
│  │  iPinYou RTB logs │   │  SentenceTransformer          │ │
│  │  → Lognormal per  │   │  all-MiniLM-L6-v2             │ │
│  │    hour bucket    │   │  + optional Llama-3-8B        │ │
│  └──────────────────┘   └───────────────────────────────┘ │
│                                                           │
│  ┌───────────────────────────────────────────────────────┐ │
│  │  MIND Dataset Layer  (Microsoft News Dataset)         │ │
│  │  behaviours.tsv  →  CTRCalibrator                     │ │
│  │  news.tsv        →  MINDCreativePool (headlines)      │ │
│  └───────────────────────────────────────────────────────┘ │
│                                                           │
│  ┌───────────────────────────────────────────────────────┐ │
│  │  Ad + Caption Dataset  (MS-COCO Captions 2017)        │ │
│  │  → image_description + base_caption per step          │ │
│  │  → ViralHashtagScraper (pytrends + Reddit + seeds)    │ │
│  │  → agent rewrites caption with viral hashtags         │ │
│  └───────────────────────────────────────────────────────┘ │
│                                                           │
│  ┌───────────────────────────────────────────────────────┐ │
│  │  Grader (task-specific, deterministic 0.0–1.0)        │ │
│  │   Level 1: easy_headline  → headline CTR lookup       │ │
│  │   Level 2: medium_pacing  → pacing + survival         │ │
│  │   Level 3: hard_assembly  → 4-axis composite score    │ │
│  │   Level 4: hard_sequencing→ DP oracle comparison      │ │
│  └───────────────────────────────────────────────────────┘ │
└───────────────────────────────────────────────────────────┘

Models

Model	Role	Always Active?
`all-MiniLM-L6-v2` (SentenceTransformer)	Semantic CTR scoring + grader cosine similarity	✅ Yes
`Meta-Llama-3-8B-Instruct` (4-bit)	Richer LLM-based CTR scoring	❌ Optional (`USE_LLM_SIMULATOR=1`)

When the LLM simulator is active: final_ctr = 0.60 × llm_ctr + 0.40 × semantic_ctr
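Expressed as a function, with the semantic score used unchanged when no LLM score is available (the function name and the `None` convention are assumptions for this sketch):

```python
from typing import Optional


def blended_ctr(semantic_ctr: float, llm_ctr: Optional[float] = None) -> float:
    """Blend LLM and semantic CTR estimates 60/40 when the LLM simulator is on."""
    if llm_ctr is None:
        return semantic_ctr  # simulator disabled: semantic score only
    return 0.60 * llm_ctr + 0.40 * semantic_ctr
```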

Setup & Usage

Prerequisites

Python 3.10+
Docker (for containerised execution)

Local Development

pip install -r requirements.txt
python -c "from environment import OpenEnvAuctioneer; e = OpenEnvAuctioneer(); print(e.reset())"

Docker Build & Run

# Build the image
docker build -t openenv-auctioneer .

# Run the FastAPI server (default)
docker run --rm -p 7860:7860 openenv-auctioneer

# Run inference directly inside the container
docker run --rm \
  -e HF_TOKEN=<your_key> \
  openenv-auctioneer python inference.py

Inference Script

# Build image first, then run inference
docker build -t openenv-auctioneer .

LOCAL_IMAGE_NAME=openenv-auctioneer \
HF_TOKEN=<your_key> \
python inference.py

The inference script emits standardised [START]/[STEP]/[END] logs to stdout.
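Captured stdout can be grouped into episodes by these tags. The sketch below relies only on the three tag prefixes; the payload after each tag is kept as an opaque string because its format is not described here, and the function name is hypothetical.

```python
def group_log_lines(lines: list) -> list:
    """Group tagged log lines into episodes: one dict per [START]...[END] pair."""
    episodes = []
    current = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("[START]"):
            current = {"start": line[len("[START]"):].strip(), "steps": [], "end": None}
        elif line.startswith("[STEP]") and current is not None:
            current["steps"].append(line[len("[STEP]"):].strip())
        elif line.startswith("[END]") and current is not None:
            current["end"] = line[len("[END]"):].strip()
            episodes.append(current)
            current = None
        # Untagged lines and tags outside an episode are ignored.
    return episodes
```

This can be fed from a pipe, for example by passing `sys.stdin` to the function after running the inference script.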

Environment Variables

Variable	Required	Description
`HF_TOKEN`	Yes (inference)	API key for the LLM service
`API_BASE_URL`	No	LLM endpoint (default: HuggingFace router)
`MODEL_NAME`	No	Model identifier (default: Qwen/Qwen2.5-72B-Instruct)
`LOCAL_IMAGE_NAME`	Yes (inference)	Docker image name
`AUCTIONEER_TASK`	No	Task to run (default: `all`)
`MIND_SOURCE`	No	`local` / `huggingface` / `azure`
`COCO_SOURCE`	No	`local` / `url` (auto-download COCO annotations)
`USE_LLM_SIMULATOR`	No	Set `1` to enable Llama-3 User Simulator
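A script could resolve a few of these variables with their documented defaults roughly as shown below. The `Settings` class and `load_settings` function are hypothetical names for this sketch; `API_BASE_URL` is left out because its default URL is not spelled out in the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    hf_token: Optional[str]
    model_name: str
    task: str
    use_llm_simulator: bool


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from the environment, applying the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        hf_token=env.get("HF_TOKEN"),  # required for inference, may be absent otherwise
        model_name=env.get("MODEL_NAME", "Qwen/Qwen2.5-72B-Instruct"),
        task=env.get("AUCTIONEER_TASK", "all"),
        use_llm_simulator=env.get("USE_LLM_SIMULATOR") == "1",
    )
```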

Baseline Scores (Expected Ranges)

Task	Expected Range	Notes
`easy_headline`	0.55 – 0.80	Context→headline matching is learnable
`medium_pacing`	0.45 – 0.70	Requires budget discipline
`hard_assembly`	0.40 – 0.65	Caption quality + hashtag matching + auction wins
`hard_sequencing`	0.35 – 0.60	Compared against DP oracle

Scores depend on LLM quality and market stochasticity. Run multiple episodes for stable estimates.
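One simple way to report a stable estimate is the mean score over several episodes together with its standard error. The helper below is a sketch with a hypothetical name; it assumes the per-episode grader scores have already been collected into a list.

```python
import math
import statistics


def summarise_scores(scores: list) -> dict:
    """Mean, sample standard deviation and standard error over episodes."""
    n = len(scores)
    if n == 0:
        raise ValueError("need at least one episode score")
    mean = statistics.fmean(scores)
    stdev = statistics.stdev(scores) if n > 1 else 0.0
    return {
        "episodes": n,
        "mean": mean,
        "stdev": stdev,
        "stderr": stdev / math.sqrt(n),
    }
```

The standard error shrinks with the square root of the episode count, so quadrupling the number of episodes roughly halves the uncertainty of the reported mean.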

Project Structure

├── models.py          # Pydantic models: Action, Observation, Reward, Info
├── environment.py     # OpenEnvAuctioneer + graders + dataset layers
│   ├── MINDLoader          # MIND dataset loader (HF / Azure / local)
│   ├── MarketCalibrator    # iPinYou-based auction price simulator
│   ├── CTRCalibrator       # MIND-based CTR lookup tables
│   ├── MINDCreativePool    # 6-slot headline/creative catalog from news.tsv
│   ├── PersonaBank         # Vogue Dialogue persona sampling
│   ├── ViralHashtagScraper # Live hashtag scraping (pytrends + Reddit)
│   ├── AdCaptionDataset    # COCO-based ad image+caption pool
│   ├── UserSimulator       # Semantic + optional LLM CTR scoring
│   ├── EasyHeadlineGrader  # Level 1 grader
│   ├── MediumPacingGrader  # Level 2 grader
│   ├── HardAssemblyGrader  # Level 3 grader (4-axis composite)
│   ├── HardSequencingGrader# Level 4 grader (DP oracle)
│   └── OpenEnvAuctioneer   # Main Gym-style env class
├── app.py             # FastAPI server (runs inside Docker)
├── inference.py       # Baseline inference script (mandatory format)
├── openenv.yaml       # OpenEnv metadata & task definitions
├── Dockerfile         # Container build
├── requirements.txt   # Python dependencies
├── test_sequencing.py # Unit tests for DP oracle grader
└── Datasets/          # Optional dataset mount point

References

MIND: Wu et al. (2020) — "MIND: A Large-scale Dataset for News Recommendation", ACL 2020. msnews.github.io
iPinYou RTB: Zhang et al. (2014) — "Real-Time Bidding Benchmarking with iPinYou Dataset". contest.ipinyou.com
MS-COCO Captions: Lin et al. (2014) — "Microsoft COCO: Common Objects in Context". cocodataset.org
SentenceTransformers: Reimers & Gurevych (2019) — "Sentence-BERT". sbert.net

License

MIT