Active filters: rlhf
mradermacher/ToxicHermes-2.5-Mistral-7B-GGUF
7B • Updated • 10
mradermacher/ToxicHermes-2.5-Mistral-7B-i1-GGUF
7B • Updated • 123
mradermacher/OrpoLlama-3-8B-GGUF
8B • Updated • 131
mradermacher/OrpoLlama-3-8B-i1-GGUF
8B • Updated • 211
tensorblock/Llama-3-70B-Orpo-v0.1-GGUF
hfc971/NeuralBeagle14-7B-GGUF
Updated
Reinforcement Learning • Updated • 46 • 2
tensorblock/distilabeled-Marcoro14-7B-slerp-full-GGUF
mradermacher/distilabeled-Marcoro14-7B-slerp-full-GGUF
7B • Updated • 35 • 1
tensorblock/NeuralMarcoro14-7B-GGUF
mradermacher/distilabeled-Marcoro14-7B-slerp-full-i1-GGUF
7B • Updated • 129 • 1
mradermacher/distilabeled-Marcoro14-7B-slerp-GGUF
7B • Updated • 25
mradermacher/pandora-7b-chat-GGUF
9B • Updated • 97
mradermacher/pandora-7b-chat-i1-GGUF
9B • Updated • 235
tensorblock/NeuralHermes-2.5-Mistral-7B-GGUF
tensorblock/archangel_sft-dpo_pythia2-8b-GGUF
tensorblock/archangel_sft_llama7b-GGUF
tensorblock/archangel_sft-kto_llama13b-GGUF
mradermacher/UpshotLlama-3-8B-GGUF
8B • Updated • 80
mradermacher/Llama-3-8B-Orpo-v0.1-GGUF
8B • Updated • 90
mradermacher/Llama-3-8B-Orpo-v0.1-i1-GGUF
8B • Updated • 397
Text Generation • Updated • 2
bikmish/llm-course-hw2-dpo
0.1B • Updated • 1
mradermacher/beaver-7b-v2.0-GGUF
Reinforcement Learning • 7B • Updated • 189
mradermacher/beaver-7b-v3.0-GGUF
Reinforcement Learning • 7B • Updated • 75 • 1
mradermacher/beaver-7b-v1.0-GGUF
Reinforcement Learning • 7B • Updated • 60
loganlin777/mistral-7b-dpo-adapter
Updated
VilaVision/dentalmisalignmentdetection
Image Classification • Updated • 2
tensorblock/mlabonne_NeuralDaredevil-7B-GGUF
BryanADA/Qwen2.5-3B-cot-zh-tw
Text Generation • 3B • Updated • 16 • 1