AI & ML interests
None yet
Organizations
None yet
Models (14)
yungshun317/llava1.5-7b-rlaif-v-dpo
yungshun317/qwen2.5-0.5B-prm-mathshepherd • Token Classification • 0.5B • 12
yungshun317/sft-qwen2.5-7b-qlora • Text Generation • 2
yungshun317/qwen2.5-32b-deberta-ultrafeedback-grpo-lora-ds
yungshun317/qwen2.5-7b-deberta-ultrafeedback-grpo-lora-ds-composite-reward
yungshun317/deberta-v3-large-format-guard-preference-distillation • 0.4B
yungshun317/deberta-v3-large-preference-distillation • 0.4B
yungshun317/deberta-v3-large-format-guard • 0.4B
yungshun317/qwen2.5-7b-deberta-ultrafeedback-grpo-lora-ds
yungshun317/qwen2-0.5B-deberta-ultrafeedback-grpo • Text Generation • 0.5B • 1