mikheevshow's Collections
DPO Experiments
Updated Mar 5, 2025
mikheevshow/DPO-alpha-divergence-alpha_0_5_beta_0_1 • Text Generation • 0.1B • Updated Mar 5, 2025 • 1
mikheevshow/DPO-forward_kl_beta_0_1 • Text Generation • 0.1B • Updated Mar 5, 2025 • 1
mikheevshow/DPO-js_divergence_beta_0_1 • Text Generation • 0.1B • Updated Mar 5, 2025 • 2
mikheevshow/DPO-reverse_kl_beta_0_1 • Text Generation • 0.1B • Updated Mar 5, 2025 • 1
mikheevshow/DPO-reverse_kl_beta_5_0 • Text Generation • 0.1B • Updated Mar 5, 2025 • 1
mikheevshow/DPO-reverse_kl_beta_1_0 • Text Generation • 0.1B • Updated Mar 5, 2025 • 2
mikheevshow/DPO-reverse_kl_beta_0_05 • Text Generation • 0.1B • Updated Mar 5, 2025 • 2