---
title: Retail RL Reco Explainer
emoji: 🛒
colorFrom: indigo
colorTo: pink
sdk: gradio
sdk_version: "4.44.0"
python_version: "3.10"
app_file: app.py
pinned: false
---

# Retail Recommendation Explainer — Before vs After RL (DPO)

This Space compares two setups side by side on retail product recommendations:
a base instruct model, and the same model with a DPO-trained LoRA adapter applied.
Each recommendation comes with a concise, constraint-aware explanation.

## Environment variables (optional)
- `BASE_MODEL`: the base instruct model (default: `Qwen/Qwen2.5-0.5B-Instruct`)
- `ADAPTER_RL_PATH`: path to the DPO-trained LoRA adapter folder (default: `./adapter_dpo`)
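
As a rough illustration of how these two variables and their defaults could be resolved, here is a minimal sketch. It is not taken from `app.py`; the function name `resolve_settings` and the returned dictionary keys are hypothetical.

```python
import os

# Defaults as documented in this README.
DEFAULT_BASE_MODEL = "Qwen/Qwen2.5-0.5B-Instruct"
DEFAULT_ADAPTER_RL_PATH = "./adapter_dpo"


def resolve_settings(env=None):
    """Return the effective settings (hypothetical helper, not from app.py).

    `env` is any mapping of variable names to values; it defaults to the
    process environment. Unset or empty variables fall back to the defaults.
    """
    env = os.environ if env is None else env
    return {
        "base_model": env.get("BASE_MODEL") or DEFAULT_BASE_MODEL,
        "adapter_rl_path": env.get("ADAPTER_RL_PATH") or DEFAULT_ADAPTER_RL_PATH,
    }


if __name__ == "__main__":
    print(resolve_settings())
```

Treating an empty string the same as an unset variable is a design choice in this sketch; the actual app may handle that case differently.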

## How to use
1. Train the adapter with the scripts in `trainer/`; this produces the `adapter_dpo/` folder.
2. Copy the `adapter_dpo/` folder into the Space repo, at the same folder level as `app.py` (or point `ADAPTER_RL_PATH` at its location).
3. Run the Space and click **Generate (Before vs After)**.
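
Before step 3, it can help to confirm that the adapter folder was copied correctly. The sketch below is one possible check, assuming the training scripts save the adapter with PEFT's `save_pretrained`, which writes `adapter_config.json` plus a weights file (`adapter_model.safetensors` or `adapter_model.bin`). This README does not state what `trainer/` actually writes, and the helper name `adapter_is_ready` is hypothetical.

```python
from pathlib import Path

# Assumption: file names follow PEFT's save_pretrained layout.
CONFIG_NAME = "adapter_config.json"
WEIGHT_NAMES = ("adapter_model.safetensors", "adapter_model.bin")


def adapter_is_ready(adapter_dir="./adapter_dpo"):
    """Return True if the folder looks like a saved LoRA adapter."""
    path = Path(adapter_dir)
    if not path.is_dir():
        return False
    has_config = (path / CONFIG_NAME).is_file()
    has_weights = any((path / name).is_file() for name in WEIGHT_NAMES)
    return has_config and has_weights


if __name__ == "__main__":
    if adapter_is_ready():
        print("adapter_dpo/ looks complete")
    else:
        print("adapter_dpo/ is missing or incomplete")
```

Run it from the folder that contains `app.py`, or pass the value of `ADAPTER_RL_PATH` as the argument.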