underactuated/mistral_dpo_test2
Transformers, Safetensors, Generated from Trainer, trl, dpo, arxiv:2305.18290
Branch: main
Commit History
End of training, 23f9484 (verified), underactuated committed on Mar 2, 2025
End of training, 2ac5233 (verified), underactuated committed on Mar 2, 2025
End of training, 994c11b (verified), underactuated committed on Mar 2, 2025
End of training, e734060 (verified), underactuated committed on Mar 2, 2025
End of training, 6987963 (verified), underactuated committed on Mar 2, 2025
End of training, 9e8ab18 (verified), underactuated committed on Mar 2, 2025
End of training, 81a2f8d (verified), underactuated committed on Mar 2, 2025
End of training, 6003be5 (verified), underactuated committed on Mar 2, 2025
End of training, bd3a06d (verified), underactuated committed on Mar 2, 2025
End of training, 9754c8a (verified), underactuated committed on Mar 2, 2025
initial commit, befbfef (verified), underactuated committed on Mar 2, 2025