DGPO - an omron-sinicx Collection



updated Mar 18

Distillation-Guided Policy Optimization for Preserving Agentic RAG Capabilities

omron-sinicx/SearchR1-ppo-qwen2.5-3b-instruct
3B • Updated Mar 18 • 5 • 1

omron-sinicx/Qwen2.5-0.5B-Instruct-kd
0.5B • Updated Mar 18 • 4

omron-sinicx/Qwen2.5-0.5B-Instruct-sft
0.5B • Updated Mar 18 • 6

omron-sinicx/SearchR1-ppo-llama3.1-8b-instruct
8B • Updated Mar 18 • 5 • 1

omron-sinicx/Llama-3.2-1B-Instruct-kd
1B • Updated Mar 18 • 3

omron-sinicx/SearchR1-ppo-qwen2.5-7b-instruct
8B • Updated Mar 18 • 4 • 1

omron-sinicx/DGPO-qwen2.5-0.5b
Text Generation • 0.6B • Updated Mar 18 • 22
