pyamy/llama3-dpo-pairrm (Hugging Face model repository)
Tags: PEFT · TensorBoard · Safetensors · dpo · llama · preference-learning
License: apache-2.0
Branch: main · 241 MB · 1 contributor · 13 commits
Latest commit: 8f537e6 (verified) by pyamy, 6 months ago: "Upload README.md with huggingface_hub"
Name                        Size        Last commit message                     Updated
checkpoint-50/                          Upload DPO PairRM fine-tuned model      6 months ago
checkpoint-100/                         Upload DPO PairRM fine-tuned model      6 months ago
checkpoint-150/                         Upload DPO PairRM fine-tuned model      6 months ago
checkpoint-200/                         Upload DPO PairRM fine-tuned model      6 months ago
checkpoint-250/                         Upload DPO PairRM fine-tuned model      6 months ago
checkpoint-500/                         Upload DPO PairRM fine-tuned model      6 months ago
runs/                                   Upload DPO PairRM fine-tuned model      6 months ago
.gitattributes              1.97 kB     Upload DPO PairRM fine-tuned model      6 months ago
README.md                   1.3 kB      Upload README.md with huggingface_hub   6 months ago
adapter_config.json         932 Bytes   Upload DPO PairRM fine-tuned model      6 months ago
adapter_model.safetensors   6.83 MB     Upload DPO PairRM fine-tuned model      6 months ago
chat_template.jinja         3.92 kB     Upload DPO PairRM fine-tuned model      6 months ago
special_tokens_map.json     342 Bytes   Upload DPO PairRM fine-tuned model      6 months ago
tokenizer.json              17.2 MB     Upload DPO PairRM fine-tuned model      6 months ago
tokenizer_config.json       52.6 kB     Upload DPO PairRM fine-tuned model      6 months ago
training_args.bin           6.26 kB     Upload DPO PairRM fine-tuned model      6 months ago
training_history.json       219 Bytes   Upload DPO PairRM fine-tuned model      6 months ago
training_metrics.json       12.9 kB     Upload DPO PairRM fine-tuned model      6 months ago