Intro
This is a reinforcement learning (RL) DQN (Deep Q-Network) model for the DOOH (digital out-of-home) DSP (demand-side platform) bidder problem. The model should respect four rules:
- even pacing over time.
- a desired publisher distribution (which can differ from the publisher distribution in the raw bid request flow).
- a desired venue type distribution (which can differ from the venue type distribution in the raw bid request flow).
- a desired household size distribution (which can differ from the household size distribution in the raw bid request flow).
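One way to read the four rules is as a single penalty that an RL reward could subtract. The sketch below is only an illustration under stated assumptions, not the reward used in this repository: the field names (`publisher`, `venue_type`, `household_size`), the helper names, and the choice of total variation distance are all hypothetical.

```python
from collections import Counter

# Hypothetical field names for the three distribution rules.
DIMENSIONS = ("publisher", "venue_type", "household_size")


def distribution_gap(values, target_shares):
    """Total variation distance between observed and target shares (0 = exact match)."""
    if not values:
        return 0.0
    counts = Counter(values)
    keys = set(counts) | set(target_shares)
    return 0.5 * sum(
        abs(counts.get(k, 0) / len(values) - target_shares.get(k, 0.0)) for k in keys
    )


def pacing_gap(wins_so_far, planned_wins, elapsed_fraction):
    """Gap between the share of planned wins already made and the share of time elapsed."""
    return abs(wins_so_far / planned_wins - elapsed_fraction)


def rule_penalty(won_requests, planned_wins, elapsed_fraction, targets):
    """Sum of the pacing gap and the three distribution gaps (assumed equal weights)."""
    penalty = pacing_gap(len(won_requests), planned_wins, elapsed_fraction)
    for dim in DIMENSIONS:
        penalty += distribution_gap([r[dim] for r in won_requests], targets[dim])
    return penalty
```

A penalty of zero means the bids won so far are exactly on pace and match all three target distributions. A reward could then be defined as the negative of this value, although the weighting between rules is a design choice this sketch does not settle.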
Requirements.txt
torch==2.10.0
matplotlib==3.10.8
ipython==8.0.0
torchrl==0.11.1
tensordict==0.11.0
numpy==2.4.2
pandas==2.3.3
Training process
Data flow
Python all-in-one files
- dsp_bidder_4_training.py - training
- dsp_bidder_4_inference.py - testing
Python files with a cleaner structure
- p410_environment.py - PyTorch RL environment class
- p420_bid_requests.py - Campaign bid request flow emulation: 1680 bid requests per week. The number of weeks can be changed.
- p400_dsp_bidder_4_training.py - PyTorch RL DQN training code
- p440_bidder_inference.py - Inference/Testing code
- p450_functions.py - Auxiliary functions
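The bid request emulation listed above produces 1680 requests per week, which works out to 10 per hour over 168 hours. The following is a minimal sketch of such a generator, assuming requests are spread evenly across hours and attributes are drawn uniformly at random. The function name, field names, and category values are hypothetical and are not taken from p420_bid_requests.py.

```python
import random

HOURS_PER_WEEK = 7 * 24                                   # 168
REQUESTS_PER_WEEK = 1680                                  # figure from the file list
REQUESTS_PER_HOUR = REQUESTS_PER_WEEK // HOURS_PER_WEEK   # 10


def emulate_bid_requests(
    weeks=1,
    seed=0,
    publishers=("pub_a", "pub_b", "pub_c"),        # hypothetical values
    venue_types=("mall", "transit", "roadside"),   # hypothetical values
    household_sizes=(1, 2, 3, 4),                  # hypothetical values
):
    """Return a list of synthetic bid requests, ordered by hour index."""
    rng = random.Random(seed)  # seeded so repeated runs give the same flow
    requests = []
    for hour in range(weeks * HOURS_PER_WEEK):
        for _ in range(REQUESTS_PER_HOUR):
            requests.append(
                {
                    "hour": hour,
                    "publisher": rng.choice(publishers),
                    "venue_type": rng.choice(venue_types),
                    "household_size": rng.choice(household_sizes),
                }
            )
    return requests
```

Calling `emulate_bid_requests(weeks=2)` would yield 3360 requests. Because the raw flow here is uniform, any non-uniform target distribution forces the bidder to skip some requests, which is the situation the four rules describe.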

