Models: 451

Active filters: rlhf

PKU-Alignment/beaver-7b-v3.0-reward

Reinforcement Learning • 7B • Updated Apr 20, 2024 • 7

PKU-Alignment/beaver-7b-v3.0-cost

Reinforcement Learning • 13B • Updated Apr 20, 2024 • 17

PKU-Alignment/beaver-7b-unified-reward

Reinforcement Learning • 7B • Updated Apr 20, 2024 • 557

PKU-Alignment/beaver-7b-unified-cost

Reinforcement Learning • 7B • Updated Apr 20, 2024 • 461 • 2

Aditya685/UpshotLlama-3-8B

Text Generation • 8B • Updated Apr 20, 2024 • 1

bartowski/OrpoLlama-3-8B-GGUF

Text Generation • 8B • Updated Apr 20, 2024 • 167 • 4

QuantFactory/NeuralDaredevil-7B-GGUF

Text Generation • 7B • Updated May 24, 2024 • 442

LoneStriker/OrpoLlama-3-8B-GGUF

8B • Updated Apr 21, 2024 • 16 • 1

LoneStriker/OrpoLlama-3-8B-3.0bpw-h6-exl2

Text Generation • Updated Apr 21, 2024 • 1

LoneStriker/OrpoLlama-3-8B-4.0bpw-h6-exl2

Text Generation • Updated Apr 21, 2024 • 1

LoneStriker/OrpoLlama-3-8B-5.0bpw-h6-exl2

Text Generation • Updated Apr 21, 2024 • 1

LoneStriker/OrpoLlama-3-8B-6.0bpw-h6-exl2

Text Generation • Updated Apr 21, 2024 • 1

LoneStriker/OrpoLlama-3-8B-8.0bpw-h8-exl2

Text Generation • Updated Apr 21, 2024 • 1

jalaganapathy/jalaModelRepo

Text Generation • 7B • Updated Apr 21, 2024 • 1

mlx-community/OrpoLlama-3-8B-4bit

Text Generation • Updated Apr 21, 2024 • 2

mlx-community/OrpoLlama-3-8B-8bit

Text Generation • Updated Apr 21, 2024 • 5

bartowski/OrpoLlama-3-8B-exl2

Text Generation • Updated Apr 21, 2024 • 2 • 1

hus960/OrpoLlama-3-8B-Q4_K_M-GGUF

8B • Updated Apr 23, 2024 • 17

QuantFactory/OrpoLlama-3-8B-GGUF

Text Generation • 8B • Updated Apr 24, 2024 • 169

dfurman/Llama-3-8B-Orpo-v0.1

Text Generation • 8B • Updated Sep 17, 2024 • 7.45k • 1

dfurman/Llama-3-70B-Orpo-v0.1

Text Generation • 71B • Updated Sep 6, 2024 • 14 • 2

newsletter/CapybaraHermes-2.5-Mistral-7B-Q6_K-GGUF

7B • Updated Aug 17, 2024 • 3 • 1

mradermacher/archangel_sft-kto_llama30b-GGUF

33B • Updated May 31, 2024 • 145 • 1

mradermacher/archangel_sft-kto_llama30b-i1-GGUF

33B • Updated Aug 2, 2024 • 108

line-corporation/sacpo

Reinforcement Learning • 7B • Updated Jun 21, 2024 • 30 • 5

nvidia/Llama3-70B-PPO-Chat

Updated Jun 14, 2024 • 6

line-corporation/p-sacpo

Reinforcement Learning • 7B • Updated Jun 21, 2024 • 13 • 3

dfurman/Qwen2-72B-Orpo-v0.1

Text Generation • 73B • Updated Sep 26, 2024 • 17 • 4

mradermacher/Qwen2-72B-Orpo-v0.1-GGUF

73B • Updated Jul 22, 2024 • 76

mradermacher/Qwen2-72B-Orpo-v0.1-i1-GGUF

73B • Updated Aug 2, 2024 • 95