fix: use correct HF router URL /hf-inference/v1/ (not /models/{id}/v1/) 9a9473a ademarteau committed on 2 days ago
fix: COGS profit model - charge unit cost on sold units not on orders, preventing end-of-period bias against high-inventory strategies 18aac4f ademarteau committed on 2 days ago
fix: update HF proxy URL to router.huggingface.co (api-inference deprecated 410) 38aa642 ademarteau committed on 2 days ago
fix: serve index.html with no-cache headers to prevent CDN/browser stale cache 2cee429 ademarteau committed on 2 days ago
fix: proxy HF Inference API through FastAPI to bypass HF Spaces CSP c3fc8d4 ademarteau committed on 2 days ago
metrics: profit first, then service level, then fill rate (React UI) e7f1f53 ademarteau committed on 2 days ago
fix: use minimal requirements-server.txt in Docker to avoid downloading torch/ML deps b2065cc ademarteau committed on 2 days ago
feat: replace Gradio with React UI - GRPO tab, 730-day sim, 200-entry memory bank b413222 ademarteau committed on 2 days ago
refactor: remove Unsloth, use standard transformers + PEFT 355b2d5 RishbhaJain Claude Sonnet 4.6 committed on 2 days ago
fix: use nvidia/cuda devel base image so vllm can build with CUDA toolkit b52921e Arvind Sreenivas committed on 2 days ago
fix: pin torch 2.6.0 + xformers 0.0.29, use Python 3.12 for ML compat d73e520 Arvind Sreenivas committed on 2 days ago
fix: install torch, xformers, vllm before requirements to avoid build failures ea9e7b6 Arvind Sreenivas committed on 2 days ago
fix: install torch before vllm/xformers to satisfy build deps 32d0699 Arvind Sreenivas committed on 2 days ago
fix: force-reinstall unsloth to fix PreTrainedConfig NameError 477d0a5 Arvind Sreenivas committed on 2 days ago
fix: align Unsloth config with recommended GRPO settings d1c6fd5 RishbhaJain Claude Sonnet 4.6 committed on 2 days ago
Merge branch 'main' of https://github.com/ademcodesproducts/OpenEnv-Inventory-Simulations 84565ee ademarteau committed on 2 days ago
fix: pipeline-aware ordering, YoY demand signal, reward rebalancing c10dcd0 RishbhaJain Claude Sonnet 4.6 committed on 2 days ago
feat: integrate Unsloth into GRPO training pipeline 4d42a14 RishbhaJain Claude Sonnet 4.6 committed on 2 days ago
feat: full-horizon lookahead reward (365 days, <0.5ms) af5c3c7 Arvind Sreenivas committed on 3 days ago
feat: crash-resilient training with dataset caching and iteration resume 9ebd26d Arvind Sreenivas committed on 3 days ago
feat: add Northflank training Dockerfile and start.sh c0ce96d Arvind Sreenivas committed on 3 days ago
feat: improve GRPO training logging and fix torch_dtype deprecation 7dea3a9 Arvind Sreenivas committed on 3 days ago
fix: let Gradio auto-select port locally, fix via env vars only 6d9b0d9 ademarteau committed on 3 days ago
Merge branch 'main' of https://huggingface.co/spaces/ademarteau/RL-Inventory-Simulations 7f56785 ademarteau committed on 3 days ago
Merge teammate changes, unify reward via reward.py, add PPO model 043e4e9 ademarteau committed on 3 days ago
feat: improve training logging with tqdm, timings, GPU memory, ETA 766dc8c Arvind Sreenivas committed on 3 days ago
Merge branch 'main' of https://github.com/ademcodesproducts/OpenEnv-Inventory-Simulations 920573d ademarteau committed on 3 days ago
fix: add missing ML and simulation packages to requirements.txt 2344156 Arvind Sreenivas committed on 3 days ago
fix: remove pywin32 Windows-only packages, use Python 3.13 1091939 Arvind Sreenivas committed on 3 days ago
fix: bump to Python 3.13 to match requirements.txt (audioop-lts) 5482efa Arvind Sreenivas committed on 3 days ago