quablab/smollm3-dpo-aligned
Text Generation
Transformers
Safetensors
smollm3
Generated from Trainer
trl
dpo
hf_jobs
conversational
arxiv:2305.18290
main
Commit History
Training in progress, step 1000 (8570d57, verified), quablab committed on Oct 13, 2025
Training in progress, step 750 (96f31c0, verified), quablab committed on Oct 13, 2025
Training in progress, step 500 (08c4bf0, verified), quablab committed on Oct 13, 2025
Training in progress, step 250 (0807ac1, verified), quablab committed on Oct 13, 2025
Training in progress, step 1000 (8e867aa, verified), quablab committed on Oct 9, 2025
Training in progress, step 750 (30e2a71, verified), quablab committed on Oct 9, 2025
Training in progress, step 500 (db27840, verified), quablab committed on Oct 9, 2025
Training in progress, step 250 (330e356, verified), quablab committed on Oct 9, 2025
Training in progress, step 1000 (d300e94, verified), quablab committed on Oct 9, 2025
Training in progress, step 750 (5f9c069, verified), quablab committed on Oct 9, 2025
Training in progress, step 500 (6d79423, verified), quablab committed on Oct 9, 2025
Training in progress, step 250 (bbab7fa, verified), quablab committed on Oct 9, 2025
initial commit (63b2a82, verified), quablab committed on Oct 9, 2025