```bash
python3 -m mlc_llm compile --quantization q4f16_1 --output 123b_r1.so . --overrides "tensor_parallel_shards=2" --device cuda
python3 -m mlc_llm chat --device cuda 123b_r1/ --model-lib /workspace/123b_r1.so
```
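
Beyond the interactive `chat` command, the compiled library can also be queried programmatically. The sketch below assumes the model is exposed through mlc_llm's OpenAI-compatible REST server (for example via `python3 -m mlc_llm serve 123b_r1/ --model-lib /workspace/123b_r1.so --device cuda`); the host, port, and model id used in the snippet are assumptions, not values from this card.

```python
# Minimal sketch: query the served model over its OpenAI-compatible endpoint.
# Assumes the server was started with something like:
#   python3 -m mlc_llm serve 123b_r1/ --model-lib /workspace/123b_r1.so --device cuda
# Host, port, and model id below are assumptions; adjust to your setup.
import requests

ENDPOINT = "http://127.0.0.1:8000/v1/chat/completions"  # assumed default serve address

payload = {
    "model": "123b_r1",  # hypothetical model id; use whatever id the server reports
    "messages": [
        {"role": "user", "content": "Summarize tensor parallelism in one sentence."}
    ],
    "max_tokens": 128,
}

response = requests.post(ENDPOINT, json=payload, timeout=120)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```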
