python3 -m mlc_llm compile --quantization q4f16_1 --output 123b_r1.so . --overrides "tensor_parallel_shards=2" --device cuda
python3 -m mlc_llm chat --device cuda 123b_r1/ --model-lib /workspace/123b_r1.so

Model tree for cgg507/Behemoth-R1-123B-v2-q4f16_1

Base model: mistralai/Mistral-Large-Instruct-2411
Finetuned: TheDrummer/Behemoth-R1-123B-v2
Finetuned (1): this model