SirajRLX/task2file
Tags: Transformers, Safetensors, Generated from Trainer, trl, dpo, arxiv:2305.18290
Branch: main
Path: task2file/DPO-14b (39.9 MB)
1 contributor, History: 1 commit
Latest commit: SirajRLX, "Upload folder using huggingface_hub" (7be9bb6, verified), 3 months ago
File                         Scan  Size             Last commit                          Updated
README.md                    Safe  5.22 kB          Upload folder using huggingface_hub  3 months ago
apply_critical_fixes.py      Safe  6.78 kB          Upload folder using huggingface_hub  3 months ago
config_dpo.yaml              Safe  3.65 kB          Upload folder using huggingface_hub  3 months ago
create_synthetic_pairs.py    Safe  5.1 kB           Upload folder using huggingface_hub  3 months ago
dpo_dataset.jsonl            Safe  5.67 kB          Upload folder using huggingface_hub  3 months ago
dpo_pairs_generated.jsonl    Safe  39.8 MB (xet)    Upload folder using huggingface_hub  3 months ago
f1_score_utils.py            Safe  9.36 kB          Upload folder using huggingface_hub  3 months ago
prepare_data.py              Safe  12 kB            Upload folder using huggingface_hub  3 months ago
requirements.txt             Safe  397 Bytes        Upload folder using huggingface_hub  3 months ago
run_dpo.py                   Safe  32.2 kB          Upload folder using huggingface_hub  3 months ago
run_dpo.py.backup            Safe  31.4 kB          Upload folder using huggingface_hub  3 months ago
run_dpo_enhanced.py          Safe  9.54 kB          Upload folder using huggingface_hub  3 months ago
test_fixes.py                Safe  3.7 kB           Upload folder using huggingface_hub  3 months ago