rokugatsu/LLM2025_Advanced_6_DPO5
Text Generation
Safetensors
u-10bei/sft_alfworld_trajectory_dataset_v4
u-10bei/dbbench_sft_dataset_react_v3
English
trl
qwen3
dpo
agent
tool-use
alfworld
conversational
License: apache-2.0
main
LLM2025_Advanced_6_DPO5/vocab.json
rokugatsu
Upload DPO-trained Qwen3-4B-Instruct-2507 model
a84f0d9
verified
about 2 months ago
Safe
2.78 MB
File too large to display; check the raw version instead.