Updated branches
README.md
CHANGED
|
**Revisions & Branches**

- **main** — *placeholder* landing branch. The canonical README lives here; model files may be minimal.
- **NVFP4** — NVFP4 4‑bit weights / 4‑bit activations (in practice behaves close to a 16‑bit‑activation build).
- **W4A16** — Symmetric AWQ 4‑bit weights / 16‑bit activations; builds and related assets are published under this revision.
- **W8A16** — Symmetric AWQ 8‑bit weights / 16‑bit activations; builds and related assets are published under this revision.
- **W8A8-FP8_BLOCK** — FP8 8‑bit weights / 8‑bit activations with block‑wise scaling, enabling CUTLASS kernels on Blackwell (SM 12.0); requires a recent vLLM.

🔗 **Quick links:**
[Browse `main`](https://huggingface.co/TheHouseOfTheDude/Behemoth-R1-123B-v2_Compressed-Tensors/tree/main) ·
[Browse `NVFP4`](https://huggingface.co/TheHouseOfTheDude/Behemoth-R1-123B-v2_Compressed-Tensors/tree/NVFP4) ·
[Browse `W4A16`](https://huggingface.co/TheHouseOfTheDude/Behemoth-R1-123B-v2_Compressed-Tensors/tree/W4A16) ·
[Browse `W8A16`](https://huggingface.co/TheHouseOfTheDude/Behemoth-R1-123B-v2_Compressed-Tensors/tree/W8A16) ·
[Browse `W8A8-FP8_BLOCK`](https://huggingface.co/TheHouseOfTheDude/Behemoth-R1-123B-v2_Compressed-Tensors/tree/W8A8-FP8_BLOCK)

*This repository hosts multiple quantizations of the finetuned parent model for vLLM using the compressed-tensors runtime format.*
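The branch names above map directly onto Hugging Face revision URLs; a minimal sketch of that mapping (the `browse_url` helper is illustrative, not part of this repo):

```python
# Branch names published in this repo, per the list above.
REPO_ID = "TheHouseOfTheDude/Behemoth-R1-123B-v2_Compressed-Tensors"
REVISIONS = ["main", "NVFP4", "W4A16", "W8A16", "W8A8-FP8_BLOCK"]

def browse_url(repo_id: str, revision: str) -> str:
    """Build the Hugging Face file-browser URL for one branch."""
    return f"https://huggingface.co/{repo_id}/tree/{revision}"

for rev in REVISIONS:
    print(browse_url(REPO_ID, rev))
```

The same branch names can be passed as the `revision` argument to `huggingface_hub.snapshot_download` to fetch one particular quantization locally.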