license: unknown

Intro

This is a reinforcement learning (RL) model, trained with Proximal Policy Optimization (PPO), for the DOOH (digital out-of-home) DSP (demand-side platform) bidder problem. The model should respect four rules:

Even pacing of spend over time.
A desired publisher distribution (which can differ from the publisher distribution in the raw bid request flow).
A desired venue type distribution (which can differ from the venue type distribution in the raw bid request flow).
A desired household size distribution (which can differ from the household size distribution in the raw bid request flow).
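
The four rules above can be read as penalty terms in a reward signal. The following is a minimal sketch of that idea, not the reward used by this model: the function names (`pacing_gap`, `distribution_gap`, `shaped_reward`), the use of total variation distance, and the equal default weights are all assumptions made for illustration.

```python
import numpy as np


def pacing_gap(spent, budget, step, total_steps):
    """Absolute gap between the fraction of budget spent and the fraction of time elapsed."""
    return abs(spent / budget - step / total_steps)


def distribution_gap(won_counts, target_probs):
    """Total variation distance between the share of won bids and the target share."""
    won = np.asarray(won_counts, dtype=float)
    target = np.asarray(target_probs, dtype=float)
    total = won.sum()
    if total == 0:
        return 0.0  # nothing won yet, so there is no distribution to penalize
    return float(0.5 * np.abs(won / total - target).sum())


def shaped_reward(spent, budget, step, total_steps, won_by_dim, target_by_dim, weights=None):
    """Hypothetical reward: negative weighted sum of the pacing gap and per-dimension gaps.

    won_by_dim and target_by_dim are dicts keyed by dimension name
    (for example "publisher", "venue_type", "household_size").
    """
    weights = weights or {}
    penalty = weights.get("pacing", 1.0) * pacing_gap(spent, budget, step, total_steps)
    for dim, target in target_by_dim.items():
        penalty += weights.get(dim, 1.0) * distribution_gap(won_by_dim[dim], target)
    return -penalty


# Example: halfway through the flight, half the budget spent,
# but all wins came from the first of two publishers.
reward = shaped_reward(
    spent=50.0, budget=100.0, step=5, total_steps=10,
    won_by_dim={"publisher": [10, 0]},
    target_by_dim={"publisher": [0.5, 0.5]},
)
```

In this sketch a reward of zero means pacing and every tracked distribution match their targets exactly; any deviation makes the reward negative. Whether the penalties are applied per step or per episode, and how they are weighted, is a design choice this README does not specify.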

Requirements.txt

torch==2.10.0
matplotlib==3.10.8
ipython==8.0.0
torchrl==0.11.1
tensordict==0.11.0
numpy==2.4.2
pandas==2.3.3