---
title: AdRL Studio
colorFrom: purple
colorTo: blue
sdk: docker
app_port: 7860
pinned: false
---
# 🎯 AdRL Studio
**🎯 AdRL Studio** is a contextual multi-armed bandit platform that simulates a real-world ad recommendation and serving system using reinforcement learning. It benchmarks four bandit algorithms side by side, visualizes online learning and regret curves, runs A/B test simulations with statistical significance testing, and serves real-time ad recommendations from user context input.
---
## Table of Contents
- [Features](#-features)
- [Architecture](#️-architecture)
- [Getting Started](#-getting-started)
- [Docker Deployment](#-docker-deployment)
- [Dashboard Modules](#-dashboard-modules)
- [ML Models](#-ml-models)
- [Project Structure](#-project-structure)
- [Author](#-author)
- [Contributing](#-contributing)
- [Disclaimer](#disclaimer)
- [License](#-license)
---
## ✨ Features
| Feature | Description |
|---------|-------------|
| 🎯 Live Ad Serving | Enter user context (age, device, time, category, region) and get real-time ad recommendations from all four algorithms simultaneously |
| ▶ Online Learning Simulation | Run 1K–10K impression simulations with SSE-streamed progress, rolling CTR charts, and per-algorithm summaries |
| 📉 Regret Analysis | Visualize cumulative regret curves (the canonical RL evaluation metric) comparing all four policies |
| ⚖ A/B Test Simulator | Run 50/50 traffic splits with a two-proportion z-test, p-value, confidence intervals, and a statistical significance verdict |
| 🔒 Secure by Design | Role-based access, audit logs, encrypted data pipelines |
| 🐳 Containerized Deployment | Docker-first architecture, cloud-ready and scalable |
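
The A/B test row above names a two-proportion z-test with a p-value and confidence interval. A minimal sketch of that calculation follows, using only the standard library. The function name `two_proportion_z_test`, its parameters, and the default `alpha` are illustrative assumptions, not code taken from `app.py`.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(clicks_a, n_a, clicks_b, n_b, alpha=0.05):
    """Two-sided two-proportion z-test on the CTR difference (B minus A).

    Hypothetical helper for illustration; not the project's own implementation.
    """
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    diff = p_b - p_a

    # Pooled standard error under H0: both variants share a single CTR.
    p_pool = (clicks_a + clicks_b) / (n_a + n_b)
    se_pooled = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))

    if se_pooled == 0:
        # Degenerate case: no clicks at all, or every impression clicked.
        z, p_value = 0.0, 1.0
    else:
        z = diff / se_pooled
        # Two-sided p-value: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
        p_value = math.erfc(abs(z) / math.sqrt(2))

    # Unpooled standard error for the confidence interval on the difference.
    se_unpooled = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)

    return {
        "diff": diff,
        "z": z,
        "p_value": p_value,
        "ci": ci,
        "significant": p_value < alpha,
    }


if __name__ == "__main__":
    # Made-up counts for a 50/50 split of 10,000 impressions.
    print(two_proportion_z_test(200, 5000, 260, 5000))
```

The pooled standard error is used for the test statistic because the null hypothesis assumes equal CTRs, while the interval uses the unpooled form because it estimates the difference without that assumption.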
---
## 🏗️ Architecture
```
┌─────────────────────────────────────────────────────────┐
│                       AdRL Studio                       │
│                                                         │
│  ┌───────────┐    ┌───────────┐    ┌───────────────┐    │
│  │ Simulated │───▶│  Bandit   │───▶│   Flask API   │    │
│  │ Ad Environ│    │ Algorithms│    │    Backend    │    │
│  └───────────┘    └───────────┘    └───────┬───────┘    │
│                                            │            │
│                                   ┌────────▼────────┐   │
│                                   │  Plotly Charts  │   │
│                                   │    Dashboard    │   │
│                                   └─────────────────┘   │
└─────────────────────────────────────────────────────────┘
```
---
## 🚀 Getting Started
### Prerequisites
- Python 3.10+
- Docker & Docker Compose
- Git
### Local Installation
```bash
# 1. Clone the repository
git clone https://github.com/mnoorchenar/AdRL-Studio.git
cd AdRL-Studio
# 2. Create a virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# 3. Install dependencies
pip install -r requirements.txt
# 4. Configure environment variables
cp .env.example .env
# Edit .env with your settings
# 5. Run the application
python app.py
```
Open your browser at `http://localhost:7860`.
---
## 🐳 Docker Deployment
```bash
# Build and run with Docker Compose
docker compose up --build
# Or pull and run the pre-built image
docker pull mnoorchenar/adrl-studio
docker run -p 7860:7860 mnoorchenar/adrl-studio
```
---
## 📊 Dashboard Modules
| Module | Description | Status |
|--------|-------------|--------|
| 🎯 Live Ad Serving | Real-time 4-algorithm recommendation from user context | ✅ Live |
| ▶ Online Learning | Simulation with SSE streaming and rolling CTR charts | ✅ Live |
| 📉 Regret Analysis | Cumulative regret curves for all four algorithms | ✅ Live |
| ⚖ A/B Test Simulator | Statistical significance testing with z-test & CI | ✅ Live |
| 🌡 Reward Landscape | 5×5 CTR heatmap: user content category × ad category | ✅ Live |
| 🔬 Policy Inspector | Per-ad learned weights and posterior distributions | 🗓️ Planned |
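
The Regret Analysis module plots cumulative regret. As a rough sketch of how such a curve can be computed in a simulated environment where the true click probabilities are known: the per-step regret is the gap between the best available CTR for that context and the CTR of the ad actually served. The function name and the input layout below are assumptions made for illustration, not the project's code.

```python
def cumulative_regret(ctr_rows, chosen_arms):
    """Return the running sum of per-step expected regret.

    ctr_rows[t][a] is the true click probability of ad `a` for the user
    context seen at step t (known only because the environment is simulated).
    chosen_arms[t] is the index of the ad the policy served at step t.
    Hypothetical helper for illustration.
    """
    total = 0.0
    curve = []
    for ctrs, arm in zip(ctr_rows, chosen_arms):
        # Regret of one step: best achievable CTR minus the chosen ad's CTR.
        total += max(ctrs) - ctrs[arm]
        curve.append(total)
    return curve


if __name__ == "__main__":
    # Three steps, two ads; the best ad changes with the context.
    rows = [[0.25, 0.5], [0.5, 0.25], [0.25, 0.5]]
    print(cumulative_regret(rows, [0, 1, 0]))  # a policy that is always wrong
    print(cumulative_regret(rows, [1, 0, 1]))  # an oracle policy
```

A policy that learns well produces a curve that flattens over time, while a policy that keeps serving suboptimal ads produces one that grows roughly linearly.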
---
## 🧠 ML Models
```python
# Core Models Used in AdRL Studio
models = {
    "epsilon_greedy": "ε-Greedy Neural Bandit: shared PyTorch MLP (39→32→16→1) with decaying ε",
    "ucb1": "UCB1: Upper Confidence Bound non-contextual baseline",
    "thompson": "Thompson Sampling: Bayesian Beta(α,β) per arm",
    "linucb": "LinUCB Disjoint: ridge regression contextual bandit (production-grade)",
    "environment": "Simulated 20-ad inventory, 19-dim one-hot context, Bernoulli reward sampling"
}
```
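
To make the two non-contextual entries above concrete, here is a minimal sketch of UCB1 and Beta-Bernoulli Thompson Sampling on a simulated Bernoulli ad inventory. The class names, the `simulate` helper, and the example CTRs are hypothetical and chosen for illustration; this is the textbook form of each algorithm, not necessarily how `app.py` implements them.

```python
import math
import random


class UCB1:
    """Textbook UCB1: empirical mean plus sqrt(2 ln t / n) exploration bonus."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms
        self.means = [0.0] * n_arms

    def select(self):
        # Serve every ad once before the confidence bound is defined.
        for arm, n in enumerate(self.counts):
            if n == 0:
                return arm
        t = sum(self.counts)
        scores = [
            m + math.sqrt(2 * math.log(t) / n)
            for m, n in zip(self.means, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, reward):
        self.counts[arm] += 1
        # Incremental update of the empirical mean reward.
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


class ThompsonSampling:
    """Beta(alpha, beta) posterior per ad, starting from a uniform Beta(1, 1)."""

    def __init__(self, n_arms, rng=None):
        self.alpha = [1.0] * n_arms
        self.beta = [1.0] * n_arms
        self.rng = rng or random.Random()

    def select(self):
        # Sample a plausible CTR for each ad and serve the largest sample.
        samples = [
            self.rng.betavariate(a, b) for a, b in zip(self.alpha, self.beta)
        ]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, reward):
        self.alpha[arm] += reward       # clicks
        self.beta[arm] += 1 - reward    # non-clicks


def simulate(policy, true_ctrs, steps, seed=0):
    """Run a policy against Bernoulli rewards and return the observed CTR."""
    rng = random.Random(seed)
    clicks = 0
    for _ in range(steps):
        arm = policy.select()
        reward = 1 if rng.random() < true_ctrs[arm] else 0
        policy.update(arm, reward)
        clicks += reward
    return clicks / steps


if __name__ == "__main__":
    ctrs = [0.02, 0.05, 0.10]  # made-up click probabilities
    print("UCB1 CTR:", simulate(UCB1(len(ctrs)), ctrs, 5000))
    print("Thompson CTR:", simulate(ThompsonSampling(len(ctrs)), ctrs, 5000))
```

Both policies ignore user context, which is why the list above describes UCB1 as a baseline; the contextual variants condition the same explore/exploit trade-off on the context vector.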
---
## 📁 Project Structure
```
AdRL-Studio/
│
├── 📄 app.py               # Complete Flask application: all logic, templates, and API
├── 📄 Dockerfile           # Container definition (python:3.10-slim, port 7860)
├── 📄 requirements.txt     # Python dependencies
└── 📄 README.md            # This file
```
> All application logic, HTML templates, CSS, and JavaScript live inside `app.py`
> using Flask's `render_template_string`. There are no external static files.
---
## 👨‍💻 Author