| ---
|
| license: unknown
|
| ---
|
|
|
| # Intro
|
|
|
| This is a reinforcement learning (RL) model trained with Proximal Policy Optimization (PPO) for the DOOH (digital out-of-home) DSP (demand-side platform) bidder problem.
|
| The model should respect four rules:
|
| - even pacing over time
|
| - a desired publisher distribution (which can differ from the publisher distribution in the raw bid request flow)
|
| - a desired venue type distribution (which can differ from the venue type distribution in the raw bid request flow)
|
| - a desired household size distribution (which can differ from the household size distribution in the raw bid request flow)
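
One way to make the four rules above measurable is to turn each into a gap between what the bidder has achieved so far and what is desired. The sketch below is only an illustration, not the reward used in the training or environment files of this repository. The function names (`distribution_gap`, `pacing_gap`, `rule_penalty`), the use of total variation distance, and the unweighted sum are all assumptions made for the example.

```python
import numpy as np


def distribution_gap(counts, target):
    """Total variation distance between achieved shares and target shares.

    `counts` holds won impressions per category (hypothetical input),
    `target` holds the desired shares, which should sum to 1.
    """
    counts = np.asarray(counts, dtype=float)
    target = np.asarray(target, dtype=float)
    total = counts.sum()
    if total == 0:
        return 0.0  # nothing won yet, so no measurable deviation
    achieved = counts / total
    return 0.5 * float(np.abs(achieved - target).sum())


def pacing_gap(spent_fraction, elapsed_fraction):
    """Absolute difference between budget share spent and time share elapsed."""
    return abs(spent_fraction - elapsed_fraction)


def rule_penalty(spent_fraction, elapsed_fraction, counts_by_rule, targets_by_rule):
    """Unweighted sum of the pacing gap and one distribution gap per dimension."""
    penalty = pacing_gap(spent_fraction, elapsed_fraction)
    for name, target in targets_by_rule.items():
        penalty += distribution_gap(counts_by_rule[name], target)
    return penalty


# Illustrative numbers only: half the budget spent at half the flight,
# with wins skewed toward one publisher.
example_penalty = rule_penalty(
    spent_fraction=0.5,
    elapsed_fraction=0.5,
    counts_by_rule={
        "publisher": [100, 0],
        "venue_type": [50, 50],
        "household_size": [25, 75],
    },
    targets_by_rule={
        "publisher": [0.5, 0.5],
        "venue_type": [0.5, 0.5],
        "household_size": [0.25, 0.75],
    },
)
```

A penalty of zero would mean that spend tracks elapsed time exactly and that all three achieved distributions match their targets. A reward could, for instance, be the negative of this value, possibly with per-rule weights.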
|
|
|
| # Requirements.txt
|
|
|
| ```
|
| torch==2.10.0
|
| matplotlib==3.10.8
|
| ipython==8.0.0
|
| torchrl==0.11.1
|
| tensordict==0.11.0
|
| numpy==2.4.2
|
| pandas==2.3.3
|
| ```
|
|
|
| # Training process
|
|
|
| 
|
|
|
| # Data flow
|
|
|
| 
|
|
|
| # Python all-in-one files
|
|
|
| - [dsp_bidder_4_training.py](https://huggingface.co/StanislavKo28/AdTech_DSP_Bidder___RL_PPO_4_rules/blob/main/p561_dsp_bidder_4_ppo_training.py) - training
|
| - [dsp_bidder_4_environment.py](https://huggingface.co/StanislavKo28/AdTech_DSP_Bidder___RL_PPO_4_rules/blob/main/ppoenv/p561_dsp_bidder_4_ppo_environment_003_pt_distr_GOOD.py) - environment
|
| - [dsp_bidder_4_inference.py](https://huggingface.co/StanislavKo28/AdTech_DSP_Bidder___RL_PPO_4_rules/blob/main/p591_dsp_bidder_4_ppo_inference.py) - inference
|
| - [dsp_bidder_4_inference_once.py](https://huggingface.co/StanislavKo28/AdTech_DSP_Bidder___RL_PPO_4_rules/blob/main/p592_dsp_bidder_4_ppo_inference_once.py) - inference for a single bid request
|
|
|
|
|