---
datasets:
- weqweasdas/ultra_train
base_model:
- OpenRLHF/Llama-3-8b-sft-mixture
---
Base model: [OpenRLHF/Llama-3-8b-sft-mixture](https://huggingface.co/OpenRLHF/Llama-3-8b-sft-mixture)

DPO model: [RTO-RL/Llama3-8B-DPO](https://huggingface.co/RTO-RL/Llama3-8B-DPO)

Reward model: [RTO-RL/Llama3.2-1B-RewardModel](https://huggingface.co/RTO-RL/Llama3.2-1B-RewardModel)

Prompt dataset: [weqweasdas/ultra_train](https://huggingface.co/datasets/weqweasdas/ultra_train)